PROCEEDINGS

Computation Seminar

DECEMBER 1949

EDITED BY CUTHBERT C. HURD, Director, IBM APPLIED SCIENCE DEPARTMENT

INTERNATIONAL BUSINESS MACHINES CORPORATION
NEW YORK, NEW YORK

Copyright 1951
International Business Machines Corporation
590 Madison Avenue, New York 22, N. Y.
Form 22-8342-0

PRINTED IN THE UNITED STATES OF AMERICA

FOREWORD

A COMPUTATION SEMINAR, sponsored by the International Business Machines Corporation, was held in the IBM Department of Education, Endicott, New York, from December 5 to 9, 1949. Attending the Seminar were one hundred and seven research engineers and scientists who are experienced both in applying mathematical methods to the solution of physical problems and in the associated punched card methods of computation. Consequently, these Proceedings represent a valuable contribution to the computing art. The International Business Machines Corporation wishes to express its appreciation to all those who participated in
this Seminar.

CONTENTS

The Future of High-Speed Computing
  -JOHN VON NEUMANN

Some Methods of Solving Hyperbolic and Parabolic Partial Differential Equations
  -RICHARD W. HAMMING

Numerical Solution of Partial Differential Equations
  -EVERETT C. YOWELL

An Eigenvalue Problem of the Laplace Operator
  -HARRY H. HUMMEL

A Numerical Solution for Systems of Linear Differential Equations Occurring in Problems of Structures
  -PAUL E. BISCH

Matrix Methods
  -KAISER S. KUNZ

Inversion of an Alternant Matrix
  -BONALYN A. LUCKEY

Matrix Multiplication on the IBM Card-Programmed Electronic Calculator
  -JOHN P. KELLY

Machine Methods for Finding Characteristic Roots of a Matrix
  -FRANZ L. ALT

Solution of Simultaneous Linear Algebraic Equations Using the IBM Type 604 Electronic Calculating Punch
  -JOHN LOWE

Rational Approximation in High-Speed Computing
  -CECIL HASTINGS, JR.

The Construction of Tables
  -PAUL HERGET

A Description of Several Optimum Interval Tables
  -STUART L. CROSSMAN

Table Interpolation Employing the IBM Type 604 Electronic Calculating Punch
  -EVERETT KIMBALL, JR.

The Monte Carlo Method and Its Applications
  -MARK KAC, M. D. DONSKER

A Punched Card Application of the Monte Carlo Method
  -F. N. FRENKIEL, H. POLACHEK

An Algorithm for Fitting a Polynomial through n Given Points
  -P. C. JOHNSON, F. C. UFFELMAN

A Monte Carlo Method of Solving Laplace's Equation
  -EVERETT C. YOWELL

Further Remarks on Stochastic Methods in Quantum Mechanics
  -GILBERT W. KING

Standard Methods of Analyzing Data
  -JOHN W. TUKEY (presented by EDWARD W. BAILEY)

The Applications of Machine Methods to Analysis of Variance and Multiple Regression
  -ROBERT J. MONROE

Examples of Enumeration Statistics
  -W. WAYNE COULTER

Transforming Theodolite Data
  -HENRY SCHUTZBERGER

Minimum Volume Calculations with Many Operations on the IBM Type 604 Electronic Calculating Punch
  -WILLIAM D. BELL

Transition from Problem to Card Program
  -GREGORY J. TOBEN

Best Starting Values for an Iterative Process of Taking Roots
  -PRESTON C. HAMMER

Improvement in the Convergence of Methods of Successive Approximation
  -L. RICHARD TURNER

Single Order Reduction of a Complex Matrix
  -RANDALL E. PORTER

Simplification of Statistical Computations as Adapted to a Punched Card Service Bureau
  -W. T. SOUTHWORTH, J. E. BACHELDER

Forms of Analysis for Either Measurement or Enumeration Data Amenable to Machine Methods
  -A. E. BRANDT

Remarks on the IBM Relay Calculator
  -MARK LOTKIN

An Improved Punched Card Method for Crystal Structure Factor Calculations
  -MANDALAY D. GREMS

The Calculation of Complex Hypergeometric Functions with the IBM Type 602-A Calculating Punch
  -HARVEY GELLMAN

The Calculation of Roots of Complex Polynomials Using the IBM Type 602-A Calculating Punch
  -JOHN LOWE

Practical Inversion of Matrices of High Order
  -WILLIAM D. GUTSHALL

PARTICIPANTS
Associate Chief

ALT, FRANZ L.,

Computation Laboratory, National Bureau of Standards
Washington, D. C.

Assistant Professor of Mathematics

ARNOLD, KENNETH J.,

University of Wisconsin
Madison, Wisconsin

Statistician

Y-12 Plant, Carbide and Carbon Chemicals Corporation
Oak Ridge, Tennessee
BARBER, E. A.

H., Professor of Chemistry

Cornell University
Ithaca, New York

Senior Physicist

Mathematician

DISMUKE, NANCY M.,

Oak Ridge National Laboratory
Oak Ridge, Tennessee

Vice-President

Telecomputing Corporation
Burbank, California

Mathematician

BENNETT, CARL A.,

Statistician

General Electric Company
Richland, Washington
BERMAN, JULIAN H.,

Flutter Analyst

BINGHAM, RONALD H.,

Research Specialist

Ansco Division of General Aniline and Film Corporation
Binghamton, New York

Engineer

In Charge of Special Structures, North American Aviation, Incorporated
Los Angeles, California

Professor of Chemistry

Cornell University
Ithaca, New York

Biometrician

Atomic Energy Commission
New York, New York
BRILLOUIN, LEON,

Analytical Engineer

Director

Electronic Education, IBM Corporation
New York, New York
CLARK, H. KENNETH

Engineer

III, Supervisor

Tabulating Division, The Pennsylvania State College
State College, Pennsylvania
ECKERT, WALLACE J.,

Director

FERBER, BENJAMIN,

Research Engineer

Consolidated Vultee Aircraft Corporation
San Diego, California
FINLAYSON,

L. D., Process Control and Product Engineer

Corning Glass Works
Corning, New York
GELLMAN, HARVEY,

Staff Mathematician

Computation Centre, McLennan Laboratory, University of Toronto
Toronto, Ontario
GOODMAN,

L.

E.,

Assistant Professor of Civil Engineering

Graduate College, University of Illinois
Urbana, Illinois
GOTLIEB, CALVIN C.,

Acting Director

Computation Centre, McLennan Laboratory, University of Toronto
Toronto, Ontario
GREMS, MANDALAY D.,

Analytical Engineer

General Electric Company
Schenectady, New York

Senior Staff Member

Watson Scientific Computing Laboratory, IBM Corporation
New York, New York
HAMMER, PRESTON C.,

Staff Member

Los Alamos Scientific Laboratory, University of California
Los Alamos, New Mexico

General Electric Company
Schenectady, New York
COULTER, W. WAYNE,

DYE, WILLIAM S.

GROSCH, H. R. J.,

Department of Pure Science, IBM Corporation
New York, New York
CONCORDIA, CHARLES,

L., Astronomer

DUNCOMBE, RAYNOR

Department of Pure Science, IBM Corporation
New York, New York

Fairchild Aircraft Corporation
Hagerstown, Maryland

BRANDT, A. E.,

DUKE, JAMES B.,

Nautical Almanac Division, U. S. Naval Observatory
Washington, D. C.

Cryogenic Laboratory, Ohio State University
Columbus, Ohio

BISCH, PAUL E.,

Research Physicist

Corning Glass Works
Corning, New York

Hamilton Standard Division, United Aircraft Corporation
East Hartford, Connecticut

BELL, WILLIAM D.,

BRAGG, JOHN,

CURL, GILBERT H.,

DOCKERTY, STUART M.,

Engineering Laboratory, IBM Corporation
Endicott, New York

BELZER, JACK,

Computing Laboratory, United Aircraft Corporation
East Hartford, Connecticut

Navy Electronics Laboratory
San Diego, California

BAILEY, EDWARD W.,

BAUER, S.

Group Supervisor

CROSSMAN, STUART L.,

Assistant Director of Research

International Chiropractors Association
Davenport, Iowa

HAMMING, RICHARD W.,

Mathematician

Bell Telephone Laboratories, Incorporated
Murray Hill, New Jersey

V.

HANKAM, ERIC

KING, GILBERT W.,

Watson Scientific Computing Laboratory, IBM Corporation
New York, New York
HARDER, EDWIN L.,

Consulting Transmission Engineer

T., Project Engineer

Computation Branch, Air Materiel Command
Wright Field, Dayton, Ohio
HASTINGS, CECIL JR.,

Associate Mathematician

The RAND Corporation
Santa Monica, California
HEISER, DONALD H.,

Mathematician

Chief

Director

Cincinnati Observatory, University of Cincinnati
Cincinnati, Ohio
HORNER, JOHN

T., Project Engineer

Allison Division, General Motors Corporation
Indianapolis, Indiana
HUMMEL, HARRY H.,

Associate Physicist

Argonne National Laboratory
Lemont, Illinois
HUNTER,

J. STUART, Assistant Statistician
Director

JOHNSON, PHYLLIS C.,

Statistician

Y-12 Plant, Carbide and Carbon Chemicals Corporation

Oak Ridge, Tennessee

JOHNSON, WALTER H.,

Applied Science Department, IBM Corporation
New York, New York

Professor of Mathematics

Cornell University
Ithaca, New York

KUNZ, KAISER S.,

Associate Profeuor of Electrical Engineering

LEVIN, JOSEPH,

Computation Laboratory, National Bureau of Standards
Washington, D. C.
LOTKIN, MARK,

Mathematician

Ballistic Research Laboratories, Aberdeen Proving Ground
Aberdeen, Maryland
LOWE, JOHN,

Staff Assistant

Engineering Tabulating, Douglas Aircraft Company, Incorporated
Santa Monica, California

Engineering Assistant.

LUCKEY, BONALYN A.,

MADDEN, JOHN D.,

Mathematician

MALONEY, CLIFFORD J.,

Chief

Statistical Branch, Camp Detrick
Frederick, Maryland
MARSH, H. WYSOR, JR.,

Chief Mathematics Consultant

U. S. Navy Underwater Sound Laboratory
New London, Connecticut

MCPHERSON, JOHN C.,

Vice-President

IBM Corporation
New York, New York
MITCHELL, WILBUR L.,

Mathematician

Holloman Air Force Base
Alamogordo, New Mexico

KEAST, FRANCIS H.,

Chief Aerodynamicist

Gas Turbine Division, A. V. Roe, Canada, Limited
Malton, Ontario
KELLER, ALLEN

Institute of Statistics, University of North Carolina
Raleigh, North Carolina

T., Technical Assistant to the Chief

Statistical Division, U. S. Air Force, Wright Field
Dayton, Ohio

Head

Central Statistical Laboratory,
K-25 Plant, Carbide and Carbon Chemicals Corporation
Oak Ridge, Tennessee
KIMBALL, EVERETT, JR.,

MONROE, ROBERT J.

MORRIS, PERCY

General Electric Company
Lynn, Massachusetts
KELLY, JOHN P.,

KRAWITZ, ELEANOR

The RAND Corporation
Santa Monica, California

Applied Science Department, IBM Corporation
New York, New York

KAC, MARK,

Aerodynamicist

Turbine Engineering Division, General Electric Company
Schenectady, New York

General Electric Company
Schenectady, New York

University of North Carolina
Raleigh, North Carolina
HURD, CUTHBERT C.,

KRAFT, HANS,

Case Institute of Technology
Cleveland, Ohio

Office of Air Research, Air Materiel Command, Wright Field
Dayton, Ohio
HERGET, PAUL,

Aeronautical Engineer

Watson Scientific Computing Laboratory, IBM Corporation
New York, New York

U. S. Naval Proving Ground
Dahlgren, Virginia
HENRY, HARRY C.,

KOCH, WARREN B.,

Glenn L. Martin Company
Baltimore, Maryland

Westinghouse Electric Company
East Pittsburgh, Pennsylvania
HASTINGS, BRIAN

Research Chemist

Arthur D. Little, Incorporated, and Research Laboratory for Electronics
Massachusetts Institute of Technology, Cambridge, Massachusetts

Research Associate

Massachusetts Institute of Technology
Cambridge, Massachusetts

MORRISON, WINIFRED

The Texas Company
Beacon, New York
MORTON,

J. E.

New York State School of Industrial and Labor Relations
Cornell University, Ithaca, New York

MOSHMAN, JACK,

Statistician

U. S. Atomic Energy Commission

Oak Ridge, Tennessee
MYERS, FRANKLIN G.,

Design Specialist

Glenn L. Martin Company
Baltimore, Maryland

College of Engineering, University of California
Berkeley, California

Applied Science Department, IBM Corporation
New York, New York

Mathematician

Naval Ordnance Laboratory, White Oak,
Silver Spring, Maryland

SPENCER, ROBERT S.,

Physical Research Unit, Boeing Airplane Company
Seattle, Washington

Research Physicist

Dow Chemical Company
Midland, Michigan
STEWART, ELIZABETH A.

Department of Pure Science, IBM Corporation
New York, New York

PORTER, RANDALL E.

RICE, REX, JR.,

Director

SOUTHWORTH, W. T.,

Punched Card Applications, The State College of Washington
Pullman, Washington

PENDERY, DONALD W.

POLACHEK, H.,

Associate Professor of Engineering Design

SOROKA, WALTER W.,

STULEN, FOSTER

B., Chief Structures Engineer

Propeller Division, Curtiss Wright Corporation
Caldwell, New Jersey

Research Engineer

Northrop Aircraft Company
Hawthorne, California

THOMPSON, PHILIP M.,

Mathematician

RICH, KENNETH C.,

Naval Ordnance Test Station
Inyokern, California
RIDER, WILLIAM

Physicist

Hanford Works, General Electric Company
Richland, Washington
TOBEN, GREGORY

B.

Applied Science Department, IBM Corporation
St. Louis, Missouri

Mathematician

RINALDI, LEONARD D.,

J., Supervisor

IBM Group, Northrop Aircraft, Incorporated
Hawthorne, California

Cornell Aeronautical Laboratory, Incorporated
Buffalo, New York
ROCHESTER, NATHANIEL

Engineering Laboratory, IBM Corporation
Poughkeepsie, New York

TUKEY, JOHN W.,

Associate Professor of Mathematics

Princeton University
Princeton, New Jersey
TURNER, L. RICHARD,

Coordinator of Computing Techniques

Lewis Flight Propulsion Laboratory, NACA
Cleveland, Ohio
VERZUH, FRANK M.,

Research Associate

Electrical Engineering, Massachusetts Institute of Technology
Cambridge, Massachusetts

SAMUEL, ARTHUR L.

Engineering Laboratory, IBM Corporation
Poughkeepsie, New York
SCHMIDT, CARL A., JR.,

IBM Supervisor and Coordinator

Fairchild Engine and Airplane Corporation
Hagerstown, Maryland
SCHUMACKER, LLOYD E.,

Flight Research Engineer

Flight Test Division, Headquarters Air Materiel Command
Wright Field, Dayton, Ohio

WAHL, ARTHUR M.,

Advisory Engineer

Westinghouse Electric Company
Pittsburgh, Pennsylvania
WETMORE, WARREN L.,

Physicist

Research Laboratory, Corning Glass Works
Corning, New York
WHEELER, BYRON W., JR.

SCHUTZBERGER, HENRY,

Division Leader

Test Data Division, Sandia Corporation
Albuquerque, New Mexico

Corning Glass Works
Corning, New York
WILSON, LEWIS R., JR.

SHELDON, JOHN

Applied Science Department, IBM Corporation
New York, New York

WOLANSKI, HENRY S.,

SHREVE, DARREL R.

Research Laboratory, The Carter Oil Company
Tulsa, Oklahoma
SMITH, ALBERT E.,

Chemist

Physics Department, Shell Development Corporation
Emeryville, California
SMITH, ROBERT W.,

Tabulating Department, Consolidated Vultee Aircraft Corporation
Fort Worth, Texas

Mathematician

U. S. Bureau of Mines
Pittsburgh, Pennsylvania
SONHEIM, DANIEL W.,

Research Analyst

Ordnance Aerophysics Laboratory, Consolidated Vultee
Aircraft Corporation, Daingerfield, Texas

Aerodynamicist

Consolidated Vultee Aircraft Corporation
Fort Worth, Texas
WOMBLE, AETNA K.

Department of Pure Science, IBM Corporation
New York, New York
YORKE, GREGORY

B., Statistician

A. V. Roe, Canada, Limited
Malton, Ontario
YOWELL, EVERETT C.,

Mathematician

Institute for Numerical Analysis, National Bureau of Standards
Los Angeles, California

The Future of High-Speed Computing*
JOHN VON NEUMANN

Institute for Advanced Study

A MAJOR CONCERN which is frequently voiced in connection with very fast computing machines, particularly in view of the extremely high speeds which may now be hoped for, is that they will do themselves out of business rapidly; that is, that they will out-run the planning and coding which they require and, therefore, run out of work.

I do not believe that this objection will prove to be valid in actual fact. It is quite true that for problems of those sizes which in the past, and even in the nearest past, have been the normal ones for computing machines, planning and coding required much more time than the actual solution of the problem would require on one of the hoped-for, extremely fast future machines. It must be considered, however, that in these cases the problem-size was dictated by the speed of the computing machines then available. In other words, the size essentially adjusted itself automatically so that the problem-solution time became longer, but not prohibitively longer, than the planning and coding time.

For faster machines, the same automatic mechanism will exert pressure toward problems of larger size, and the equilibrium between planning and coding time on one hand, and problem-solution time on the other, will again restore itself on a reasonable level once it will have been really understood how to use these faster machines. This will, of course, take some time. There will be a year or two, perhaps, during which extremely fast machines will have to be used relatively inefficiently while we are finding the right type and size problems for them. I do not believe, however, that this period will be a very long one, and it is likely to be a very interesting and fruitful one. In addition, the problem types which lead to these larger sizes can already now be discerned, even before the extreme machine types to which I refer are available.

Another point deserving mention is this. There will probably arise, together with the large-size problems which are in "equilibrium" with the speed of the machine, other and smaller, "subliminal" problems, which one may want to do on a fast machine, although the planning and programming time is longer than the solution time, simply because it is not worthwhile to build a slower machine for smaller problems, after the faster machine for larger problems is already available. It is, however, not these "subliminal" problems, but those of the "right" size which justify the existence and the characteristics of the fast machines.

Some problem classes which are likely to be of the "right" size for fast machines are the following:

1. In hydrodynamics, problems involving two and three dimensions. In the important field of turbulence, in particular, three-dimensional problems will have to be primarily considered.

2. Problems involving the more difficult parts of compressible hydrodynamics, especially shock wave formation and interaction.

3. Problems involving the interaction of hydrodynamics with various forms of chemical or nuclear reaction kinetics.

4. Quantum mechanical wave function determinations, when two or more particles are involved and the problem is, therefore, one of a high dimensionality.

In connection with the two last-mentioned categories of problems, as well as with various other ones, certain new statistical methods, collectively described as "Monte Carlo procedures," have recently come to the fore. These require the calculation of large numbers of individual case histories, effected with the use of artificially produced "random numbers." The number of such case histories is necessarily large, because it is then desired to obtain the really relevant physical results by analyzing significantly large samples of those histories. This, again, is a complex of problems that is very hard to treat without fast, automatic means of computation, which justifies the use of machines of extremely high speed.
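To make the Monte Carlo idea concrete, here is a minimal sketch (not part of the original address) in Python: each "case history" is an artificial random walk driven by produced random numbers, and the quantity of interest is read off from a large sample of such histories. All names and parameters here are illustrative only.

```python
import random

def case_history(steps=100):
    """One artificial case history: a one-dimensional random walk
    driven by artificially produced random numbers."""
    x = 0.0
    for _ in range(steps):
        x += random.choice((-1.0, 1.0))
    return x

# The number of case histories must be large, because the answer is
# obtained by analyzing statistics of the whole sample of histories.
sample = [case_history() for _ in range(10000)]
mean_square = sum(x * x for x in sample) / len(sample)
print(mean_square)  # close to 100, the theoretical mean square displacement
```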
*This is a digest of an address presented at the IBM Seminar on
Scientific Computation, November, 1949.


Some Methods of Solving Hyperbolic and Parabolic Partial Differential Equations

RICHARD W. HAMMING

Bell Telephone Laboratories

THE MAIN PURPOSE of this paper is to present
a broad, non-mathematical introduction to the general field
of computing the solutions of partial differential equations
of the hyperbolic and parabolic types, as well as some related classes of equations. I hope to show that there exist
methods for reducing such problems to a form suitable for
formal computation, with a reasonable expectation of arriving at a usable answer.
I have selected four particular problems to discuss.
These have been chosen and arranged to bring out certain
points which I feel are important. The first problem is
almost trivial as there exist well-known analytical methods
for solving it, while the last is a rather complicated partial
differential-integral equation for which there is practically
no known mathematical theory.
To avoid details, I shall give only a brief introduction to
the physical situation from which the equations came. Nor
shall I dwell at all on the importance or meaning of the
solutions obtained.
Lastly, I have chosen only equations having two independent variables, usually a space variable and a time
variable. Similar methods apply to equations having three
and more independent variables.
I have not attempted to define rigorously what is meant
by hyperbolic or parabolic partial differential equations,
nor shall I later. Instead, I intend to bring out certain
common properties, and inferentially these properties define the classes of equations. In fact, from a computer's
point of view it is the class of problems which is amenable
to the same type of attack that provides the natural classification. It is on this basis that I have included a partial differential-integral equation as the last example.
Each of the four problems is carried successively further
toward its solution until, in the last example, I have given
the detailed steps which were actually used.
If, in the rest of the paper, I do not mention any names, it should not be inferred that I did everything alone; on
the contrary, I have at times played but a minor part in
the entire effort.

THE WAVE EQUATION

The classic and best known example of a hyperbolic
partial differential equation in two independent variables
is the wave equation:

$$\frac{\partial^2 w}{\partial x^2} = \frac{1}{c^2}\,\frac{\partial^2 w}{\partial t^2}\,.$$

This is the equation which describes the propagation of signals, w, in one dimension, x. The signals progress in time, t, with a velocity, c. This equation is a linear equation and, as such, there is a large body of theory available for use in solving it. Thus, it is not likely that anyone would be called upon to solve it numerically except in very unusual circumstances. Nevertheless, I have chosen it as my first example, since I hope its simplicity and familiarity to you will aid in bringing out the main points I wish to make.
In solving partial differential equations it is customary
to replace the given equations with corresponding difference equations, and then to solve these difference equations. Whether one looks at the approximation as being
made once and for all and then solving the difference
equations as exactly as possible, or whether one looks at
the difference equations as being an approximation at every
stage is a matter of viewpoint only. I personally tend to
the latter view.
In the case at hand, the second differences are clearly used as approximations to the second derivatives. Such a choice immediately dictates an equally spaced rectangular net of points at which the problem is to be solved. Such a net is shown in Figure 1. The space coordinate, x, is vertical while the time coordinate, t, is horizontal. Thus, at a fixed time, t, we look along the corresponding vertical line to see what the solution is in space, x.

Suppose for the moment that a disturbance occurs at the upper point at time t. As time goes on the disturbance will spread out in space as shown in the figure. The space covered by the disturbance at any later time is indicated by the length of the corresponding vertical shading line at that time. The area of this disturbance in the figure is
called the "cone of disturbance." The slopes of the bounding lines indicate the velocity of propagation, c, and in this simple case they are straight lines. In the mathematical theory of partial differential equations the lines are called "characteristics."

[Figure 1. The wave equation $\partial^2 w/\partial x^2 = (1/c^2)\,\partial^2 w/\partial t^2$: an equally spaced rectangular net, with the cones of disturbance of two point disturbances spreading out in time.]

The figure shows a second disturbance started at the same time, t, but at a lower point. This, too, spreads out as time goes on, and there finally occurs a time when the two cones overlap.

Consider, again, the given equation. The second difference in the x direction is calculated from the three points which are connected by the vertical line. This is to be equated to $1/c^2$ times the second difference in the time direction, which naturally uses the three solid black points. Suppose that the solution of the problem, up to the time t, is known; then we have an equation giving an estimate of the solution at one point at a later time, $t + \Delta t$.

Suppose, now, that the spacing in the x direction is kept as before, but the spacing in t is increased so as to predict as far as possible into the future. It should be obvious that the spacing in t cannot increase so far that the advanced point falls into the cones of disturbance of the first two points which are neglected. To do so is to neglect effects that could clearly alter the estimate of what is going to happen at such a point. Thus, it is found that, for a given spacing in the x direction, and a formula of a given span for estimating derivatives in the x direction, there is a maximum permissible spacing in the t direction, beyond which it is impossible to go and still expect any reasonable answer. In this simple case the condition may be written

$$\Delta t < \frac{\Delta x}{c}\,.$$
Supposing that this condition has been satisfied, and also that the solution up to some time t is known, the above method may be used to advance the solution a distance Δt at all interior points. A point on the boundary cannot be so advanced. There must be independent information as to what is happening on the boundary. Such conditions are called "boundary conditions" and are usually given with the problem. The simplest kind of boundary condition gives the values of the dependent variable w by means of a definite formula. More complex situations may give only a combination of a function w and its derivative dw/dx.
Such situations may require special handling when solving
the problem in a computing machine, but in principle are
straightforward.
A step forward at all points x is usually called a "cycle,"
and the solution of a problem consists of many, many
repetitions of the basic cycle. The sole remaining question
is that of starting the solution for the first cycle or two.
Just as in the case of ordinary differential equations, this
usually requires special handling and is based on some
simple Taylor's series expansion of the solution. In practice, this step is often done by hand before the problem is
put onto whatever machines are available.
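The cycle just described can be written out compactly. The following is a minimal sketch of my own (not the author's setup), assuming c = 1, fixed boundary values w = 0, and a Taylor's series start from a disturbance at rest; the condition Δt < Δx/c derived above is respected.

```python
import math

c, dx = 1.0, 0.05
dt = 0.8 * dx / c                   # maximum permissible spacing: dt < dx / c
r2 = (c * dt / dx) ** 2
n = 21                              # points of the net in the x direction

# starting conditions: a disturbance at rest, so the first cycle is
# started with a simple Taylor's series expansion of the solution
w_prev = [math.sin(math.pi * i / (n - 1)) for i in range(n)]
w_curr = [0.0] * n
for i in range(1, n - 1):
    w_curr[i] = w_prev[i] + 0.5 * r2 * (w_prev[i-1] - 2*w_prev[i] + w_prev[i+1])

for cycle in range(200):            # each cycle advances all interior points
    w_next = [0.0] * n              # boundary values supplied independently
    for i in range(1, n - 1):
        # second difference in x equated to (1/c^2) times that in t
        w_next[i] = (2*w_curr[i] - w_prev[i]
                     + r2 * (w_curr[i-1] - 2*w_curr[i] + w_curr[i+1]))
    w_prev, w_curr = w_curr, w_next
```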
As remarked before, this problem is not realistic; so having made a few points about spacing, boundary conditions, and initial, or starting, conditions, let us turn to a more complex problem.

THE TWO-BEAM TUBE
The two-beam tube is a tube with two electron beams
going down the length of it together. Upon one beam of
electrons a signal is imposed. The second beam, which has
a slightly greater velocity, interacts with the first beam
through their electric fields, and may be regarded as giving
up some of its energy to the first beam. This in turn produces, one hopes, an amplification of the signal on the
first beam.
The equations describing one particular idealization of
such a tube are:

$$\frac{\partial \rho_i}{\partial t} + \frac{\partial}{\partial x}\,(\rho_i v_i) = 0\,, \qquad i = 1, 2$$

$$\frac{\partial v_i}{\partial t} + v_i\,\frac{\partial v_i}{\partial x} = \Psi\,, \qquad i = 1, 2$$

$$\frac{\partial \Psi}{\partial x} = k^2\,\Phi + (\rho_1 + \rho_2)$$

$$\frac{\partial \Phi}{\partial x} = \Psi$$


where the solution is to be periodic in time of period 1, and we are given information as to the state of affairs at the beginning of the tube, x = 0. The upper two equations for i = 1 describe one of the beams, while for i = 2 they describe the other beam. The lower two equations describe the interaction between the two beams of electrons.

I shall gloss over any questions of existence theorems for such a system and merely suppose that there is a solution. The information needed to start the problem at x = 0 comes from the "linear" theory, which is not hard to find from a "linearized" form of the equations. We are here called upon to calculate the essentially nonlinear aspects of the tube.
The first reduction of the equations has already been performed before they were written as above, namely, that of transforming out of the equations all of the various constants and parameters of the problem that we could. In their present state the $v_i$ of the equations give the velocities of the two beams measured in units of the mean velocity, the $\rho_i$ the corresponding charge densities of the beams measured in mean charge density units, while the $\Phi$ and $\Psi$ describe the electric field in suitable dimensionless units.

Since we are expecting a "wave-like" solution, it is convenient to transform to a moving coordinate system which moves with the expected mean velocity of the two beams. In such a coordinate system, the dependent variables $\rho_i$, $v_i$, $\Phi$ and $\Psi$ may be expected to change slowly. The equations obtained by such a transformation,

$$\sigma = x\,, \qquad \tau = t - x\,,$$

are

$$\frac{\partial}{\partial \sigma}\left[\rho_i v_i\right] = \frac{\partial}{\partial \tau}\left[\rho_i\,(v_i - 1)\right]$$

$$\frac{\partial}{\partial \sigma}\left[\tfrac{1}{2}\,v_i^2\right] = \frac{\partial}{\partial \tau}\left[\tfrac{1}{2}\,(v_i - 1)^2\right] + \Psi$$

$$\frac{\partial \Psi}{\partial \sigma} = \frac{\partial \Psi}{\partial \tau} + (\rho_1 + \rho_2) + k^2\,\Phi$$

$$\frac{\partial \Phi}{\partial \sigma} = \frac{\partial \Phi}{\partial \tau} + \Psi\,,$$

where the solution is still periodic in time τ with period 1.
In solving the usual hyperbolic type of equation, one
advances step by step in time, but in this problem a periodic condition in time is given on the solution, and were
the time to be advanced, it would be difficult to make the
solution come out periodic. There would also be difficulty
in finding suitable starting conditions. Instead of advancing
in time, advancement is step by step down the length of
the tube in the 0" direction, using the periodic condition in
T to help estimate the derivatives in the T direction at the
ends of the interval. Thus, the periodic condition in effect
supplies the boundary conditions.

One may calculate, if one wishes, the characteristic lines and determine the cones of disturbance, but in this case it must be looked at sidewise, as it were. Assuming that the solution is known for an interval of time, how far in space may the solution be predicted at a time corresponding to the mid-point of the time interval? If the cones of disturbance were to be calculated, it would be found, as is usual in nonlinear problems, that the velocity of propagation depends on the solution which is to be calculated. For example, a large shock wave of an explosion travels at a velocity that depends not only on the medium through which it passes, but also upon the amplitude of the shock wave itself.
Let us turn to the question of choosing a net of points at which we shall try to calculate an approximate solution. The use of a two-point formula in the τ direction for estimating the τ derivatives requires a great many points and produces a very fine spacing in the σ direction. If a four-point formula is chosen (a three-point one is hardly better than a two-point one for estimating first derivatives), the following is obtained,
$$f'(0) = \frac{f(-3/2) - 27 f(-1/2) + 27 f(1/2) - f(3/2)}{24\,\Delta\tau} + \epsilon\,,$$

with an error term of the order of

$$\epsilon \sim \frac{3}{640}\,(\Delta\tau)^4 f^{(v)}(\theta)\,.$$

A formula like this is easiest to obtain by expanding each term about the mid-point of the interval in a Taylor's series with a remainder. Since in this moving coordinate system we expect a sinusoidal variation in time, the fifth derivative is estimated from the function f = sin 2πτ. In order to obtain an accuracy of about 1 part in 10⁴ as the maximum error, it is necessary to choose 24 points in the τ direction, a most fortunate choice in view of the product 24Δτ in the denominator of the estimate.
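As a check on these figures, the four-point formula is easy to test numerically; the sketch below (an illustration of mine, not from the paper) applies it to f = sin 2πτ with 24 points across the period.

```python
import math

def four_point(fm3, fm1, fp1, fp3, dtau):
    """Four-point estimate of f'(0) from values at -3/2, -1/2, +1/2, +3/2
    (in units of dtau), as in the text."""
    return (fm3 - 27.0*fm1 + 27.0*fp1 - fp3) / (24.0 * dtau)

dtau = 1.0 / 24.0                        # 24 points in the tau direction
f = lambda t: math.sin(2.0 * math.pi * t)
est = four_point(f(-1.5*dtau), f(-0.5*dtau), f(0.5*dtau), f(1.5*dtau), dtau)
print(abs(est - 2.0 * math.pi))          # about 1.4e-4, consistent with the
                                         # stated accuracy of ~1 part in 10^4
```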
The statement that the maximum error at any one point is at most one part in 10⁴ tells very little about the accumulated error due to many, many small errors, but as far as I know, there are no general methods which are both practical and close enough to help out in this type of situation. My solution to this dilemma is twofold:

1. To the person who proposed the problem, I pose questions such as, "If the solution is disturbed at a given point, will it tend to grow, remain the same, or die out?" At the same time, I try to answer the question independently, keeping an eye on the routine to be used in the solution of the problem.

2. Hope that a properly posed problem does have a solution, and that human ingenuity is adequate to the problem at hand.
Such a method may lead one astray on occasions, but with nothing else to fall back on, I feel that it is better than inactivity. One should, of course, make a diligent effort to resolve this dilemma, but pending that time, go ahead and hope for the best, ready at any time to suspect the solutions obtained.

Returning to our problem, some of the advantages of a four-point formula for estimating the derivatives in the τ direction have been listed. Let us look at the disadvantages in general. In the first place, except in periodic cases such as this one, where one can double back the solution from the top of the interval to add on to the bottom, the difficult problem of estimating derivatives near the boundaries arises. In the second place, one faces the task of assembling information from four different points to estimate the derivative at any one point. If, for example, the four values lie on four different IBM cards, it is not easy to get the information together. One method would be to calculate both 1 and 27 times the value of the function on each card and then, on one pass through the accounting machine-summary punch equipment, using selectors and four counters to accumulate the running sums, punch out the proper totals along with suitable identification on each summary punched card.

To estimate the derivatives in the σ direction to advance one step, a simple two-point formula is used. Since both the four- and two-point formulas give estimates at the mid-points of the intervals, one is led to a "shifting net" of points as shown in Figure 2. Such a net leads to some slight troubles in the identification of points, but gives probably the least calculation where it is necessary to deal with many odd order derivatives. At least in this case, it certainly does. I have glossed over the accuracy of the estimate of the derivative in the σ direction, but in this case it was adequate, due to the fineness in the spacing necessary to satisfy the net spacing condition in Δσ and Δτ.

[Figure 2. The shifting net: ρ and v known at × points, Φ and Ψ known at ○ points, staggered by half an interval in σ and τ; the four-point formula above estimates the τ derivatives at the mid-points.]

Let us drop this problem at this point and take up the next example.

PARABOLIC PARTIAL DIFFERENTIAL EQUATION

The most common parabolic partial differential equation in two independent variables has the form

$$\frac{\partial B}{\partial t} = \frac{c^2}{4\pi\gamma}\,\frac{\partial^2 H}{\partial x^2}\,.$$

Such an equation describes the flow of heat, the diffusion
of material, and the magnetization of iron.
In the particular case we shall discuss, a thin slab of
permalloy is given, 2 mils thick and of infinite extent in
the other two directions. This slab is subjected to an external field H which is changing in a sinusoidal fashion
with frequency f. The question posed is that of determining the frequency of the external field such that B at the
center of the slab rises to 90 per cent of its maximum
possible value.
I would like to digress here for a moment to remark
that it appears to me to be frequently the case that one is
asked to determine a constant in the boundary conditions,
or a parameter of the problem, such that the solution will
have a given form. This is often a way of measuring a
physical constant; indeed, when one finds a problem whose
solution is sensitive to some parameter, then this may well
provide a way of measuring that quantity with a high
degree of precision.
Returning to the problem, it is immediately noted that in heat flow or diffusion there is no concept of velocity of propagation; hence the ideas of characteristics and cones of disturbance are of little help. Nevertheless, there is a condition on the spacing in x and t. To arrive at a necessary condition, suppose that at some point an error ε in H is committed due, perhaps, to roundoff. This produces in turn an error of 2ε in the second difference. Following this through, it is found that there is an error of

$$\frac{c^2\,\Delta t}{4\pi\gamma\,(\Delta x)^2}\cdot 2\varepsilon$$

in the estimate of $B_n^{i+1/2}$, since a difference equation of the form

$$B_n^{i+1/2} = B_n^{i-1/2} + \frac{c^2\,\Delta t}{4\pi\gamma\,(\Delta x)^2}\,\Delta_x^2 H_n^i$$

is used. When the value of B is extrapolated to the point $B_n^{i+1}$ for the next cycle, the error becomes

$$\frac{c^2\,\Delta t}{4\pi\gamma\,(\Delta x)^2}\cdot 3\varepsilon\,.$$

Using this to calculate the new H of the next cycle, it is found, on expanding in a Taylor's series and keeping only two terms, that the resulting error in H is $\frac{dH}{dB}\,\frac{c^2\,\Delta t}{4\pi\gamma\,(\Delta x)^2}\cdot 3\varepsilon$. If the original error ε is to produce an effect in the next cycle at the same point that is not greater than itself, then the following condition must be met,

$$\Delta t < \frac{4\pi\gamma\,(\Delta x)^2}{3 c^2}\,\frac{dB}{dH}\,.$$

This condition differs from that of hyperbolic equations in that it depends on the square of Δx. Thus, if the spacing Δx is halved, Δt must be divided by 4. This is typical of parabolic equations. The inequality takes care of a single roundoff, while if a roundoff at each point is assumed, an additional factor of 7/10 is needed on the right-hand side.
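The effect of this condition is easy to exhibit numerically. Below is a minimal sketch (mine, not the permalloy computation itself) for the linear case B = μH, with the constant c²/4πγ scaled to unity; choosing Δt inside the bound keeps the solution smooth, while a deliberately larger Δt produces the oscillation described later.

```python
mu = 1.0                       # dB/dH for a straight-line B-H curve
dx = 0.1
dt = 0.9 * mu * dx * dx / 3.0  # just inside the condition dt < mu (dx)^2 / 3
                               # (with c^2 / 4 pi gamma scaled to 1)
n = 6                          # five points mark four zones; symmetry halves this
H = [0.0] * n
B = [0.0] * n
for cycle in range(500):
    H[0] = 1.0                         # outer point driven by the boundary field
    B[0] = mu * H[0]
    d2 = [H[i-1] - 2.0*H[i] + H[i+1] for i in range(1, n - 1)]
    for i in range(1, n - 1):          # inner points follow the difference equation
        B[i] += dt / (dx * dx) * d2[i - 1]
        H[i] = B[i] / mu               # recover H from the B-H relation
    H[n-1] = H[n-2]                    # symmetry at the center of the slab
    B[n-1] = B[n-2]
```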

[Figure 3. Hysteresis loop, molybdenum permalloy: B plotted against H.]

This condition is clearly necessary in order to expect anything like a reasonable solution; its sufficiency will be discussed later. Note the derivative dB/dH on the right-hand side.
The particular sample of permalloy discussed had a
B-H curve, as shown in Figure 3. Recalling the importance
of the derivative dB / dH in the net spacing condition, it is
seen that as the problem progresses a very tight spacing
must be accepted throughout the problem or else the spacing
at various stages must be modified to take advantage of
those places where the derivative is large. The latter was
chosen.
In the early stages of the computation, while an attempt was made to obtain an idea of the order of magnitude of the frequency f, a crude subdivision of the slab into four zones marked by five points was made. By symmetry one half could be ignored so that, in fact, only three points needed to be followed. The outer point was driven by the boundary condition, while the two inner points followed according to the difference equations.
To test the method, first a B-H curve was used which was a straight line. The comparison with the analytical solution was excellent. To show the reality of the net spacing condition, the problem was deliberately allowed to run past a point where the net should have been tightened. The results are shown in Figure 4. This oscillation is typical of what happens when a net spacing condition is violated, although sometimes it takes the form of a sudden exponential growth instead. Indeed, when such phenomena occur, one may look for a violation of some net spacing condition.
I have emphasized that the condition just derived is a
necessary condition. There has been a lot of discussion
lately as to whether this is a sufficient condition. Unfortunately, I do not have time here to go into this matter
more fully. Instead, let me present some of the results
obtained several years ago when we did this problem.
Figures 5, 6, and 7 show a smooth transition on the solution as the frequency f was changed. Any errors are clearly systematic. The jumps in the inner points are due both to the shape of the B-H curve and the extremely coarse spacing of three points. When a finer spacing of five points was used (eight sections of the slab instead of four), much the same picture was found. The labor, of course, was eight times as much, since there were twice as many points, and the Δt was decreased by a factor of four. This crude spacing should indicate how much valuable information may be obtained from even the simplest calculations when coupled with a little imagination and insight into the computation.
[Figure 4. ½ mil Mo-Py tape, f = 1.8 × 10⁵, H at interval 2, plotted against T × 10⁸: the oscillation which develops when the net spacing condition is violated.]

[Figure 5. ½ mil Mo-Py tape, f = 10⁵, H at interval 2, plotted against T × 10⁸.]

[Figure 6. ½ mil Mo-Py tape, f = 1.8 × 10⁵, H at interval 2, plotted against T × 10⁶.]

[Figure 7. ½ mil Mo-Py tape, f = 2.5 × 10⁵, H at interval 2, plotted against T × 10⁸.]

There seems to me to be no great difficulty in setting up such a problem for machine computation; so I shall not go further except to note that in the original setup of the problem we provided for the fact that we would have to

consult not one but a family of B-H curves, the one chosen for each point depending on its maximum saturation. This refinement was not included in the results shown before, and in any case it produces only a small effect.

THE TRAVELING WAVE TUBE

The last example I wish to consider is that of the traveling wave tube. A traveling wave tube consists of a helix of wire wound around an evacuated cylinder. The pitch of the helix reduces the effective velocity of the field due to an impressed electric current in the helix to around 1/10 that of light. Inside the helix is a beam of electrons going at a velocity slightly greater than the electromagnetic wave from the helix. As in the two-beam tube, the stream of electrons interacts with the field and gives up some of its energy to magnify the signal impressed on the helix.

The equations describing one particular idealization of such a tube are

$$\frac{dA(y)}{dy} = -\,\frac{1}{2\pi\varepsilon}\int_0^{2\pi}\sin\varphi(\theta, y)\,d\theta$$

$$\eta(y) = -\,\frac{1}{2\pi\varepsilon\,A(y)}\int_0^{2\pi}\cos\varphi(\theta, y)\,d\theta$$

$$\frac{\partial}{\partial y}\,q(\theta, y) = A(y)\,\sin\varphi(\theta, y)$$

$$\frac{\partial}{\partial y}\,\varphi(\theta, y) = k + \eta(y) + 2\varepsilon\,q(\theta, y)\,.$$

These equations have already been transformed over to a coordinate system moving with the wave. The y coordinate measures, in a sense, the length down the tube, while θ measures the time.

If the equations are examined more closely, it is seen that, for each θ, the lower two equations must be solved in order to move the solution of q and φ one step down the tube in the y direction. The sine and cosine of φ are then summed to obtain numbers depending on the fundamental frequency. The higher harmonics are clearly being neglected. These upper equations in turn supply the coefficients for the lower equations. This neglect of the higher harmonics was justified on the physical grounds that the helix damped them out very strongly. As a check, the amount of the second, third, fourth, and fifth harmonics in the beam was calculated later, and it was found that they could indeed be neglected.

The first step is to make a transformation so that the parameters ε and k drop out of the equations.

Proceeding much as in the two-beam tube, it was decided that 16 points would provide an adequate picture in the θ direction. Thus, there are 16 pairs of equations like the lower ones to solve. In addition, it was desired to

solve eight such problems for different parameter values
(which appear in the initial conditions and enter in the
"linear" solution used to start the nonlinear problems).
This gives 128 pairs of equations to be solved at each
cycle, a situation very suitable for IBM equipment. On
the other hand, the upper two equations only occur once
per problem, or eight times in all, which makes them unsuitable for machine solution. Thus, the solution of the
lower equations and calculation of the sums corresponding
to the integrals of the upper equations were accomplished
on IBM machines, while the rest of the upper equations
were solved by hand calculations. Included in the hand
calculations was a running check that may be found by
integrating the equations over a period and finding the
ordinary equations that govern the mean values.
With a spacing chosen in one variable, how is the spacing to be chosen in the other? In this case, there is no theory of characteristics; in fact, very little mathematical theory is available at all. The obvious was done. A number of spacings were tried, with a crude net spacing in Δθ, and the maximum permissible Δy, at that stage of the problem, was determined experimentally. Then a spacing Δy was chosen, comfortably below this limit, although not so far as to make too much work, and the calculation started with the hope that this would either be adequate for the entire problem or that the effect would show up in a noticeable manner in the computations. No obvious anomalies appeared; so presumably the spacing of 1/10 unit in y was
adequate. The net chosen was rectangular, with every other y value being used to calculate the φ and q, while at the other values the A and η were evaluated. This produces central difference formulas, which are the most accurate. The old values of φ⁻ and q⁻ were labeled by a −, the current values of A⁰ and η⁰ by a 0, and the predicted values of φ⁺ and q⁺ by a +. The new values of A⁺⁺ and η⁺⁺ were labeled ++.

To set up a system of difference equations corresponding to the lower equations: first, an estimate of q⁺ was obtained; then this was used to find a reliable value of φ⁺; finally, using this φ⁺, a better estimate of q⁺ was obtained. The difference equations describing this situation are

$$\varphi^+ = \varphi^- + \frac{1}{10}\left(\eta^0 + q^- + \frac{A^0}{10}\sin\varphi^-\right)$$

$$q^+ = q^- + \frac{A^0}{10}\left(\sin\varphi^- + \sin\varphi^+\right)$$

$$L = \sum \sin\varphi^+\,, \qquad M = \sum q^+\sin\varphi^+\,, \qquad N = \sum \cos\varphi^+\,.$$

The difference equations corresponding to the upper equations have not been shown, but it has been indicated that the solution of both equations was made to depend on three sums labeled L, M, and N.

To simplify matters in finding the sine and cosine of φ, the units of measurement of angle were changed from radians to 1/1000 part of a circle. The trigonometric functions for such angles can be found by consulting tables in decigrades


at every fourth decigrade. The advantage of such a unit is
that the integral part of the angle automatically gives the
number of rotations, and the fractional part gives the value
at which to enter the table.
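In modern terms the trick looks like this; a small sketch of mine (with an illustrative table, not the one actually punched) of the angle unit of 1/1000 of a circle:

```python
import math

# sines tabulated at every fourth 1/1000-of-a-circle (251 entries cover a turn)
TABLE = [math.sin(2.0 * math.pi * 4.0 * k / 1000.0) for k in range(251)]

def sine_milliturns(angle):
    """Sine of an angle measured in 1/1000 of a circle: the integral part of
    the angle in circles counts whole rotations and drops out automatically;
    the fractional part enters the table, with linear interpolation."""
    a = angle % 1000.0                 # whole rotations discarded
    i, frac = divmod(a, 4.0)           # table is spaced every 4 units
    lo, hi = TABLE[int(i)], TABLE[int(i) + 1]
    return lo + (hi - lo) * frac / 4.0

print(sine_milliturns(1250.0))         # 1 rotation + 250/1000 of a circle: ~1.0
```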
Consider the basic cycle of computation. It is obvious
that the accounting machine-summary punch will be best
adapted to the summing of the quantities leading to L, M,
and N. This is the natural point to start a cycle, since the
cards from the summary punch will have the minimum
-amount of information, leaving the rest of the space on the
cards for future calculations. These cards will be called
detail cards.
Each detail card needs to be identified uniquely. To do this the problem number, the card number (which is its θ value), and the cycle number (which is its y value) are given. The information that the detail cards must carry at this stage to describe the problem is the current values of φ⁻ and q⁻. In addition, it is convenient to have the value of the sine of the old angle, sin φ⁻.

The master, or rate, cards, the information for which comes from the hand calculations, must have identification consisting of the problem number and the cycle number, and the values of the two dependent variables A⁰ and η⁰. The procedure is:
1. Key punch the eight master cards and sort them
into their appropriate places, a matter of one sort on one
column.
2. Multiply with crossfooting to obtain the quantity

$$q^+(\text{estimate}) = \frac{A^0}{10}\sin\varphi^- + q^-\,,$$

which is an estimate of the q at the new cycle.

3. Another multiplier-crossfoot operation produces

$$\varphi^+ = \varphi^- + \frac{\eta^0}{10} + \frac{q^-}{10} + \frac{A^0}{100}\sin\varphi^-\,,$$

which is the new value of φ⁺. Now the sine and cosine of φ⁺ must be found.

4. Sort on φ⁺ for three digits.

5. Collate in the table values of the trigonometric functions.

6. and 7. Using the multiplier, linearly interpolate the values of sine and cosine of φ⁺. Each may be obtained with a single pass through the multiplier, provided there are only five figures in the table values and three in their first differences. The algebraic signs may be picked up from the master cards and held up for the detail cards which follow, so that with a suitable complement punching circuit the value and its algebraic sign may be punched.

8. Collate again to remove the table cards and at the
same time put the table back in proper order. (Incidentally, the same control panel is used for both operations
on the collator.)

9. Resort the detail cards so that they are again in order, both as to the card number and the problem number, a matter of a three-column sort.

10. Multiply-crossfoot to obtain the new value of q from the formula

$$q^+ = q^- + \frac{A^0}{10}\left(\sin\varphi^- + \sin\varphi^+\right)\,.$$

11. Multiply to obtain q⁺ sin φ⁺.

12. List the calculated values, the three sums L, M, and N, and summary punch the cards for the next cycle.
If these operations are gathered together, it is found
that there are 6 passes through a 601 type multiplier, three
sorts for a total of 7 columns, two passes through the collator, a key punching of 8 cards, and one pass through an
accounting machine-summary punch for each cycle.
We used our own accounting department machines with a 601 multiplier modified to have sign control, and three selectors. We operated only at times when they were not doing their main task of getting out the pay checks! Needless to say, no arguments ever arose as to priority on the use of the machines.
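For comparison, the arithmetic of the basic cycle can be stated in a few lines of Python; this is a sketch of steps 2 through 12 for one problem (identification, sorting, and collating being card-handling matters with no counterpart here), with angles kept in radians rather than thousandths of a circle.

```python
import math

def basic_cycle(detail, A0, eta0):
    """One cycle for one problem: detail holds a (phi_minus, q_minus) pair per
    theta point (one detail card each); A0 and eta0 come from the hand-computed
    master card. Returns the new detail values and the sums L, M, N."""
    L = M = N = 0.0
    new_detail = []
    for phi_m, q_m in detail:
        s_m = math.sin(phi_m)
        phi_p = phi_m + (eta0 + q_m + A0 * s_m / 10.0) / 10.0   # step 3
        q_p = q_m + (A0 / 10.0) * (s_m + math.sin(phi_p))       # step 10
        L += math.sin(phi_p)                                    # sums for the
        M += q_p * math.sin(phi_p)                              # upper equations,
        N += math.cos(phi_p)                                    # step 12
        new_detail.append((phi_p, q_p))
    return new_detail, (L, M, N)
```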
CONCLUSION

Let me summarize the points I hope I have made. First
and foremost, there is a large class of problems where the
relative size of the net spacing chosen is of fundamental
importance. Where there is no known mathematical theory, or where one is ignorant of it, one may still proceed
on an experimental basis and watch for either violent
oscillations or sudden exponential growths to indicate
where the going is not safe.
Second, it is not hard to set up a method of computation
for a given problem, and one can estimate the accuracy at
any step by some such device as using a Taylor's expansion
with a remainder. The harder problem of propagation and
compounding of errors I have not answered at all definitely, but have suggested that prudence, physical intuition,
and faith will provide one with a suitable guide.
Lastly, it is not hard to work out the details of a basic cycle if one keeps in mind the amount of information that must be available at certain stages, watches the flow of information, and has the courage to try to work out the details of a plan. When it comes to comparing alternate methods, I presume that one can count operations, judge reliabilities, etc., of the various alternates. There may be better ways than you have thought of, but don't let that stop you! If the method is sound, economically worthwhile, then you are justified in going ahead. You don't need super computing machines, although they are nice to have; you can go ahead with what you have at the moment and obtain useful and valuable results.


DISCUSSION
Dr. Hammer: The choosing of networks in comparison
to the interval of the variables sometimes can be avoided
by using a different system of integration; that is, an implicit calculation in which perhaps some of the variables
are first found by explicit integration, and then they are
recalculated.
Dr. Hamming: You are thinking of the Riemann
method, no doubt, or the von Neumann method of getting
a difference equation which involves the present values
and simply adjacent values one cycle forward.
Dr. Hammer: Yes. One essentially calculates all the
values at the same time, and then the condition you mentioned can be violated to some extent.
Dr. Herget: The graphical way in which you portray the effect of Δt relative to Δx is very good, and I think it has been stated in some of these meetings before that, to be safe for the convergence involved in this process, Δt should be about half of Δx. Isn't that right?

Dr. Hamming: You can't say any Δx and Δt. It depends on the scale of the variables used. If the variable is multiplied by 10, the spacing would be changed numerically. The condition is stated in terms of the velocity of propagation of signal in the hyperbolic case, and in the parabolic case one considers the derivative dB/dH.
Dr. Grosch: I would like to ask a question about the
nature of the oscillations encountered when the condition
is violated. Have you made any effort to see, if you will
pardon a nontechnical term, what the mean curve through
those oscillations does? Does it follow the solution?

Dr. Hamming: Yes, it does.

Dr. Grosch: That is an interesting point.

Dr. Hamming: If you examine Figure 4, you can see this is true.
Dr. Grosch: We had a situation like that arise back in 1946 when we were using 601's, and in our case the condition on the very short Δx and Δt was not a simple constant but a sort of variable of the column. We had this oscillation happen just a few x intervals from the end; we tried a fudging method of this sort, and it seemed to work out all right.
Dr. Alt: I think the situation concerning propagation of the local errors is not as hopeless as you indicate. One can use Green's function in order to study the propagation of errors whenever Green's function is available. If it is not, at least one can try to linearize the problem. We have tried that in a nonlinear problem, too, and it worked out. It can be done. I felt that the problem was simple enough that one didn't even consider publishing it, because it was just a straightforward application of the Green's function.
Dr. Hamming: I agree with what you say completely if your problem is either linear or your solution is reasonably close to linear with a perturbation, but when you encounter essentially nonlinear effects, Green's function will tie you up hopelessly when it is essentially the nonlinear part you want. That was the problem in all the examples that I showed; not to get the linear problem with the perturbation, but to get the essential nonlinear effects and see where they entered, how much they entered, and where they cut off.

Numerical Solution of Partial Differential Equations
EVERETT C. YOWELL

National Bureau of Standards

THE USUAL METHODS for determining numerical solutions of partial differential equations with specified boundary conditions are based on the approximation of the differential equation by a difference equation. In the case of the two-dimensional Laplace's equation, which is the only one I will speak of today, the differential equation is

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0\,.$$

The standard approximation is

$$\frac{1}{(\Delta x)^2}\,\Delta_x^2 f + \frac{1}{(\Delta y)^2}\,\Delta_y^2 f = 0\,,$$

where $\Delta_x^2 f$ is the second difference in the x direction and $\Delta_y^2 f$ is the second difference in the y direction.

The exact relation between the differential operator and the difference operator is an infinite series in the difference operator,

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = \frac{1}{(\Delta x)^2}\left(\Delta_x^2 f - \tfrac{1}{12}\,\Delta_x^4 f + \tfrac{1}{90}\,\Delta_x^6 f - \cdots\right) + \frac{1}{(\Delta y)^2}\left(\Delta_y^2 f - \tfrac{1}{12}\,\Delta_y^4 f + \tfrac{1}{90}\,\Delta_y^6 f - \cdots\right) = 0\,,$$

and the standard approximation amounts to cutting off this series after the first term in x and in y. If only second differences are to be used, it is obviously advisable that the interval Δx should be chosen sufficiently small so that the term $\tfrac{1}{12}\,\Delta_x^4 f$ becomes negligible. If this leads to too small an interval, we try to recover the accuracy by including more terms of the series in the approximation. The validity of this procedure needs rigorous justification, but it presents a practical computational approach.

Now differences are linear relations between function values at adjacent points. Hence, any method which works basically on the difference equation will be a method dealing with values of the function at specified points within the boundaries. These points are generally chosen systematically to cover the interior of the region with a regular grid. We shall consider only a square grid, as it is more easily adapted to machine computations than are the triangular or hexagonal grids.

The direct approach to the problem is to write out the difference equation for each point of the grid. Since we are dealing with a square grid, Δx = Δy, we shall call this mesh distance h. Then the difference equation at the point $x = x_i$, $y = y_j$ will become

$$\frac{1}{h^2}\left[f(x_{i-1}, y_j) - 2 f(x_i, y_j) + f(x_{i+1}, y_j)\right] + \frac{1}{h^2}\left[f(x_i, y_{j-1}) - 2 f(x_i, y_j) + f(x_i, y_{j+1})\right] = 0\,,$$

or symbolically

$$\frac{1}{h^2}\left(\Delta_x^2 f + \Delta_y^2 f\right) = 0\,.$$

There is one such equation for each point of the grid. Each equation may involve only interior points, or interior points and boundary points. If the boundary points are considered as known and transferred to the right sides of the equations in which they occur, then there is a system of n equations, one for each grid point, involving n unknowns, the values of the function at each grid point, which completely defines the function at the grid points within the boundary. In the present case, it can easily be shown that a unique solution of these equations always exists.

There is one great drawback to this approach to the numerical solution of a differential equation, and that is the number of equations in the system. Consider a relatively simple heat conduction problem. We have a cube, 10 cm. on each edge, at a uniform temperature of 0°C. We place this cube in contact with a heat source along one face. The temperature of the source is some function of both coordinates. We insulate one of the adjacent faces and then inquire as to the distribution of temperature over the free faces ten seconds after contact is made. This problem will reduce to a four-dimensional case, three space dimensions and one time dimension. If a ten-point grid is introduced in each dimension, a system of 10,000 equations in 10,000 unknowns results. And, while a large number of coefficients will be zero, this is not a problem to be approached with equanimity. Although this example

24

25

SEMINAR

was designed to show how rapidly the number of equations can increase, and is not the type of problem that
would be solved in practice, problems of the same order
of magnitude are available in the physically interesting
problems whose solutions are being sought today.
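To fix the scale of the direct approach, the following sketch (a modern restatement for illustration, not part of the punched card procedures discussed in this paper) assembles and solves the five-point difference system on the unit square; the interior grid size m and the boundary function are assumptions chosen only for the example.

```python
import numpy as np

def laplace_direct(boundary, m):
    """Write (1/h^2)[f(W)+f(E)+f(S)+f(N) - 4f] = 0 at each of the m*m interior
    points, move the known boundary values to the right-hand sides, and solve
    the resulting system of m*m equations in m*m unknowns."""
    h = 1.0 / (m + 1)
    n = m * m                                   # one equation per grid point
    A = np.zeros((n, n))
    b = np.zeros(n)
    idx = lambda i, j: (j - 1) * m + (i - 1)    # interior indices run 1..m
    for j in range(1, m + 1):
        for i in range(1, m + 1):
            k = idx(i, j)
            A[k, k] = -4.0
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ii, jj = i + di, j + dj
                if 1 <= ii <= m and 1 <= jj <= m:
                    A[k, idx(ii, jj)] = 1.0     # unknown neighbor stays on the left
                else:
                    b[k] -= boundary(ii * h, jj * h)  # known value moves right
    return np.linalg.solve(A, b).reshape(m, m)

# Example with arctan x/y prescribed on the boundary, as in the test problem below.
f = laplace_direct(lambda x, y: np.arctan2(x, y), m=9)
```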
A second method of attack is the relaxation method of
Sir Richard Southwell. Here one guesses at the value of
the function at each grid point and then systematically
improves the guess. The values of the function are substituted into Laplace's difference equation, and the result
will in general differ from zero. This difference, or
residual, is computed for each grid point. The largest residual in the entire field is now located, and the value of
the function at that point altered in such a way that the
residual becomes zero. This is equivalent to adding one
quarter of the residual to the residual at each of the four
adjacent points, leaving the rest of the field unaffected.
The field is again scanned for the largest residual, and it
is reduced to zero by changing the value of the function
at that point. The process is continued until all the residuals become small, one or two units in the last place.
As a hand computing method, relaxation has many advantages. It deals with only a few points at a time, it involves very simple operations, and it converges to the
solution of the difference equation rather rapidly. And
there are variations-over-relaxing and under-relaxing,
group relaxing and block relaxing-which increase the
speed of convergence. As a machine method, many of
these advantages are lost. The speed of convergence depends on relaxing the largest residual at each step. Hence,
the entire residual field must be scanned before each operation to locate this largest residual. This scanning for size
is still a very inefficient operation, particularly when it is
interposed between every set of five additions. Then, too,
the block and group relaxations, which so speed up the
convergence in hand computing, are very difficult to apply
using automatic computing machinery.
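For comparison, a short sketch of the relaxation cycle just described; the array layout, tolerance, and step limit are assumptions. Note that the whole residual field is rescanned before every step, which is exactly the inefficiency pointed out above.

```python
import numpy as np

def relax(f, tol=1e-4, max_steps=200000):
    """Southwell relaxation for Laplace's difference equation.  The border of
    f holds the boundary values; the interior holds the initial guesses."""
    while max_steps:
        # residual r = f(N) + f(S) + f(E) + f(W) - 4 f at each interior point
        r = (f[:-2, 1:-1] + f[2:, 1:-1] + f[1:-1, :-2] + f[1:-1, 2:]
             - 4.0 * f[1:-1, 1:-1])
        i, j = np.unravel_index(np.abs(r).argmax(), r.shape)
        if abs(r[i, j]) < tol:
            break                          # all residuals are now small
        # adding r/4 to f reduces this residual to zero and adds one quarter
        # of it to the residual at each of the four adjacent points
        f[i + 1, j + 1] += r[i, j] / 4.0
        max_steps -= 1
    return f
```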
Another method related to the relaxation method is Liebmann's smoothing method. Once again, we start with the basic difference formula

    (1/h²) [f(x_{i−1},y_j) + f(x_{i+1},y_j) + f(x_i,y_{j−1}) + f(x_i,y_{j+1}) − 4f(x_i,y_j)] = 0 .

If now we multiply the equation by h² and then transfer 4f(x_i,y_j) to the right-hand side, we have an equation defining f(x_i,y_j) in terms of the four adjacent values of the function. The method consists in guessing the value of the function at each grid point, and then applying the smoothing formula to each point of the grid. The entire field is smoothed again and again until no changes are introduced in the function values to the degree of accuracy required.

This method has some advantages and some disadvantages. Its main disadvantage is its slow speed of convergence. Its advantages are that it deals with only a few points at a time, that it involves only simple operations, and that it is adaptable to machine computations. M. Karmes, of New York City, has done this, reporting in 1943 on an adaptation of this method to 601 multipliers. His machine method is straightforward, one quarter of the value of the function at the four neighboring points being summed to give the value at the central point. To assemble correctly the cards to be summed, Karmes prepares four decks of work cards. Each of these contains ¼f(x_i,y_j), but they differ in that a second argument is introduced in each deck. One contains (i − 1, j); one, (i + 1, j); one, (i, j − 1); and one, (i, j + 1). The four decks are now sorted together on the second argument and summed, summary punching the new value of the function at each grid point. The deck with the new function values is then reproduced four times, the second arguments are put in, and the process is repeated. This cycle is continued until the function values converge within the required accuracy.

A method similar to Liebmann's method, but better adapted to machine computation, has been devised by Dr. Milne and tested on the 604 electronic calculators at the Institute for Numerical Analysis. Dr. Milne was seeking to avoid the sorting problem that led Karmes to the use of four decks of cards. He added two difference operators, each satisfying Laplace's difference equation, together. The first is the usual

    f(x_{i−1},y_j) + f(x_{i+1},y_j) + f(x_i,y_{j−1}) + f(x_i,y_{j+1}) − 4f(x_i,y_j) = 0 ,

while the second is basically the same operator rotated 45°,

    f(x_{i−1},y_{j−1}) + f(x_{i+1},y_{j−1}) + f(x_{i−1},y_{j+1}) + f(x_{i+1},y_{j+1}) − 4f(x_i,y_j) = 0 .

Multiplying the first by four and adding the second, we obtain

    (1/h²) [f(x_{i−1},y_{j−1}) + 4f(x_i,y_{j−1}) + f(x_{i+1},y_{j−1}) + 4f(x_{i−1},y_j) − 20f(x_i,y_j) + 4f(x_{i+1},y_j) + f(x_{i−1},y_{j+1}) + 4f(x_i,y_{j+1}) + f(x_{i+1},y_{j+1})] = 0 .

The trick now is to multiply through by h² and then add 36f(x_i,y_j) to both sides of the equation. This gives

    f(x_{i−1},y_{j−1}) + 4f(x_i,y_{j−1}) + f(x_{i+1},y_{j−1}) + 4f(x_{i−1},y_j) + 16f(x_i,y_j) + 4f(x_{i+1},y_j) + f(x_{i−1},y_{j+1}) + 4f(x_i,y_{j+1}) + f(x_{i+1},y_{j+1}) = 36f(x_i,y_j) .

This equation is now factorable, and we can define two operators U and V such that

    U f(x_i,y_j) = (1/6) [f(x_{i−1},y_j) + 4f(x_i,y_j) + f(x_{i+1},y_j)]
    V f(x_i,y_j) = (1/6) [f(x_i,y_{j−1}) + 4f(x_i,y_j) + f(x_i,y_{j+1})] .

If these are applied successively to the nth approximation to the function values at the grid points, they will yield the (n+1)st approximation. Or,

    f^{n+1}(x_i,y_j) = UV f^n(x_i,y_j) = VU f^n(x_i,y_j) .
The last equation indicates that the operators commute
and that rows or columns may be smoothed first.
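In array form, one iteration of Dr. Milne's method might be sketched as below (an illustration, not the 604 card procedure): the row run applies the one-dimensional operator to every row, boundary rows included, the column run is applied to the result, and the true boundary values are then restored, matching the removal of the spurious boundary cards described later.

```python
import numpy as np

def milne_step(f):
    """One iteration f -> UV f.  The first run smooths along each row
    (boundary rows included, so the second run sees smoothed values);
    the second run smooths down each column; the boundary is restored."""
    g = f.copy()
    g[:, 1:-1] = (f[:, :-2] + 4.0 * f[:, 1:-1] + f[:, 2:]) / 6.0
    h = g.copy()
    h[1:-1, :] = (g[:-2, :] + 4.0 * g[1:-1, :] + g[2:, :]) / 6.0
    h[0, :], h[-1, :], h[:, 0], h[:, -1] = f[0, :], f[-1, :], f[:, 0], f[:, -1]
    return h

def milne_solve(f, tol=1e-4, max_iter=600):
    """Iterate until successive fields agree within tol (compare the paper's
    listing of the cards after every tenth iteration)."""
    for _ in range(max_iter):
        g = milne_step(f)
        if np.max(np.abs(g - f)) < tol:
            return g
        f = g
    return f
```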
This method works nicely on the 604 electronic calculating punch. For each iteration, the cards must be fed
through the machine twice, once in row sort and once in
column sort. At the end of the second run, a new set of
function values will have been computed for each grid
point.
The example tested a 9 x 10 rectangular grid with values of arctan x/y given along the boundaries. We have available 10-place values of arctan x/y, so that a check was possible on the speed of convergence. The smoothing was applied first by rows and then by columns, although this
The wiring of the 604 control panel was simple and
straightforward. The value of the function was read into
factor storage 3 and 4, and a 4 and a 6 were emitted into
the MQ and factor storage 2, respectively, on the read
cycle. The analysis chart reads as shown below:

Step    Operation
        Read
1       Add f(x_i, y_j)
2       Add f(x_i, y_{j−2})
3       Multiply 4 f(x_i, y_{j−1})
4       Divide sum by 6 (expanded division if more than five digits are used)
5       Transfer f(x_i, y_{j−1})
6       Transfer f(x_i, y_j)

[The chart's read-in and read-out entries for factor storage 1-4, the MQ, the counter, and general storage are not reproduced.]

The result of this operation is to punch V f(x_i, y_{j−1}) on the (i,j) card. The two last transfers set up the operation for the next point. This arrangement of storage units will handle any size numbers up to eight digits, and that should include all problems of practical interest today. There is no question of the function values growing too large, as the maximum and minimum values must occur on the boundary.
The wiring of the 521 control panel is a little more complicated, as it was desirable to make the control panel automatically change itself for the differences between the first and second runs. There are two problems that must be handled on the 521 control panel. The first is that the input field for the second run is the same as the output field of the first run. And the second is the shift in argument.
The card layout is as follows. In column 1, punch the row identification i. In column 2, punch the column identification j. In columns 3-8, punch the original value of the function. After smoothing along a row, punch the answer in columns 9-14; and after the next smoothing along a column, punch the answer in columns 15-20.
The first problem, then, is to read from columns 3-8 on the first run and punch into columns 9-14. On the second run, read from columns 9-14 and punch into columns 15-20. This is done through punch selectors, the normal side being wired to the first run read and punch fields, and the transferred side being wired to the second run read and punch fields. The selectors are transferred by a Y in column 80, a punch which is introduced on the first run by wiring from the emitter through the normal side of a punch selector to the punch magnet for column 80. The selectors which switch the read fields should be controlled through their immediate pickup, while the selectors that control the punch field should be controlled through their delayed pickup.
The shift in argument is easily handled. On the first run, the j identification is gang punched backward into the following card, while on the second run, the i identification is similarly gang punched back one card. Column 2 is wired from the second reading brushes through a normal


point of a punch selector to the punch magnet of column
22. Column 1 is wired from the second reading brushes
through a transferred point of the same punch selector to
the punch magnet of column 21. This selector is then controlled through its delayed pickup by the Y in column 80.
Two points might be examined in a little greater detail.
The first of these has to do with the smoothing of the
boundary values during the first run. As the cards are
going through in row sort, the first and last rows will be
entirely boundary cards, and the values punched into these
cards will be the smoothed boundary values rather than
the true boundary values. This is necessary for the correct
application of the formula, as a consideration of the function at the point (1,1) will show. Suppose first that we
do not smooth the first row. Then we will have available
at the end of the first run these values on their corresponding cards:
i j      function
0 1      f(0,1)
1 1      (1/6)[f(1,0) + 4f(1,1) + f(1,2)]
2 1      (1/6)[f(2,0) + 4f(2,1) + f(2,2)] .

At the end of the second run, the answer punched in the card for the point (1,1) will be

    (1/36)[6f(0,1) + 4f(1,0) + 16f(1,1) + 4f(1,2) + f(2,0) + 4f(2,1) + f(2,2)]

which is equivalent to the true expression only if f (i,j) is
linear along the boundary i = 0.
If, on the other hand, we smooth the first row, we will have available at the end of the first run these values on their corresponding cards:

i j      function
0 1      (1/6)[f(0,0) + 4f(0,1) + f(0,2)]
1 1      (1/6)[f(1,0) + 4f(1,1) + f(1,2)]
2 1      (1/6)[f(2,0) + 4f(2,1) + f(2,2)] .

At the end of the second run, the answer punched on the card for the point (1,1) will be the correct expression

    (1/36)[f(0,0) + 4f(0,1) + f(0,2) + 4f(1,0) + 16f(1,1) + 4f(1,2) + f(2,0) + 4f(2,1) + f(2,2)] .

Thus the use of a smoothed boundary value in the second
run actually is necessary for the successful evaluation of
the smoothing formula at all points of the grid.
The second point is the use of a single delay in transferring the selector, which governs the gang punching of
the i identification on the second run. Standard practice
for gang punching through a selector is to use a double

delay so that the card containing the pickup punch will be
passing the second reading station when the selector transfers. In this case, all cards have the pickup punch. Thus,
use of a double delay would transfer the selector from the
time the first card is passing the second reading station
until the last card is passing the reading station. Use of
a single delay will transfer the selector from the time the
first card is passing the punching station until the time the
last card is passing the punching station. Either type of
delay will give the correct gang punching result in this
case; so a single delay was used as a simpler method.
The complete sequence of operations now can be summarized. A deck of cards containing the boundary values
is reproduced a large number of times. A deck of cards
containing the initial values of the function at the interior
points is key punched. This deck and one of the boundary
decks are then sorted on columns 2 and 1. This puts the
cards in order of column number within rows. The cards
are then run through the 604 and again sorted on column
22. This orders the cards on rows within columns. The
first and last columns are removed, as they contain spurious smoothed values. The remaining cards are again run
through the 604 and then sorted on column 21. The first
and last rows are removed, as they again contain spurious
values. The remaining cards are reproduced, reproducing
columns 21 and 22 into columns 1 and 2, and columns
15-20 into columns 3-8. These new cards form the deck
for the interior points in the next approximation. They
are combined with a new boundary deck, and the process
is repeated.
For the example we tested, one cycle of operations on
ninety cards took about five minutes. More time must be
allowed for reproducing new boundary decks, but certainly
ten steps an hour can be accomplished if a 604, reproducer,
and sorter are set aside for the problem. And then an
occasional check must be made of the convergence of the
solution. We listed the ninety cards after every tenth iteration and examined two successive lists for changes. In
about sixty iterations, we had reached an accuracy of
about two units in the fourth place.
This same example was tested in our hand computing
section, using a mixture of smoothing and block relaxing.
The field was smoothed three times, then block relaxed.
This cycle was repeated three times, and then three additional smoothings were made. At the end of these twelve
smoothings and three block relaxings, answers were
reached that were closer to the true answers than had been
reached in the 60-odd iterations by punched card machine.
The great difference in the speed of convergence is due
to the use of block relaxing. An intuitive idea of the
reason for this is gained by considering the basic action of
the smoothing operator. Now Dr. Milne's smoothing operator will work just as well on the residuals as on the functional values. The residuals are defined with respect to this operator in a similar fashion to the way residuals
were defined with respect to Liebmann's operator. Now,
consider the original residual field and the effect of the
smoothing operator on it. If the residual were plotted vertically against the x and y coordinates of the points and a surface passed through the ends of the residuals, a three-dimensional model similar to a mountain would be obtained. As the original guesses were not good, plus and minus errors would be found, large and small errors, and the mountain would be rough, covered with peaks and valleys. A few applications of the smoothing operator will level off the peaks and fill in the valleys, producing a smooth instead of a rough mountain. The outstanding deviation from smoothness will come at the boundary, where the elevation of the mountain goes to zero. And beyond the boundary, a flat, level plane stretches to infinity in all
directions. The task of the smoothing operator is to erase
this lack of smoothness at the boundary by forcing the
entire mountain out through the boundary. And as the
altitude of the mountain decreases, the slope at the boundary approaches closer and closer to zero, and less and less
of the residual is removed with each iteration. Hence, the
convergence is rather poor, because the operator is most
efficient at smoothing and inefficient at forcing residuals
through the boundary.


This situation is completely upset when block relaxing
is added as a further tool. Now one smoothes for a while
until a smooth mountain is formed. Then one traces a few
approximate contour lines along the mountain. The area
between any two contour lines is then dropped in altitude
by the mean altitude of the two adjacent contour lines.
This then removes the bulk of the mountain, leaving small
peaks and ditches. These are rapidly smoothed over by
use of the smoothing operator, and again the bulk of the
mountain is carted off by the block relaxation.
While the use of block relaxing together with smoothing
provides a rapidly convergent way of solving Laplace's
equation, it is at present not set up for machine computation. The need for drawing contour lines and the interdependence of neighboring points makes it very difficult
to set up for automatic calculation on present day calculators. There are undoubtedly ways of accomplishing the
same thing without using block relaxation in its standard
manner, but these must be found by further investigations
and offer problems which I hope Dr. Milne will investigate during his next stay at the Institute for Numerical
Analysis.
DISCUSSION
[This paper and the following one by Dr. Harry H. Hummel were
discussed as a unit.]

An Eigenvalue Problem of the Laplace Operator
HARRY H. HUMMEL

Argonne National Laboratory

IN A PAPER presented at the November, 1949, meeting at Endicott, Flanders and Shortley¹ discussed the solution of the equation

    ∇²ψ = αψ .    (1)

Here α is an eigenvalue, and the problem is to find its highest value and the corresponding fundamental eigenfunction ψ for homogeneous boundary conditions. This paper will describe the solution of this problem on IBM machines for a two-dimensional region consisting of a square with a square hole cut out of it (Figure 1). The function is set equal to zero at the outer and inner boundaries of the region.
The solution of (1) is accomplished by transforming to a difference equation over a two-dimensional network of points in the usual way (Figure 1). Set

    ψ_{x+1,y} + ψ_{x−1,y} + ψ_{x,y+1} + ψ_{x,y−1} ≡ Σ ψ_{x,y} ,

where ψ_{x,y} is the value of the function at (x,y). Then the difference form of (1) is

    (Σ − 4) ψ_{x,y} / h² = α ψ_{x,y} .    (2)

Here h is (x_{n+1} − x_n) = (y_{m+1} − y_m), the net spacing in the difference problem; α is the eigenvalue of the difference problem, assumed equal to the eigenvalue of the differential equation.

FIGURE 1. LAYOUT OF CARDS
[Diagram of the network over half the square, showing the S cards next to the outer zero boundary, the 2P cards on the diagonal of symmetry, the blank cards at the hole, and the reproducer and 602-A runs; not reproduced.]

By defining

    ω ψ_{x,y} ≡ ¼ Σ ψ_{x,y}    and    Λ ≡ 1 + ¼ α h² ,

equation (2) becomes

    ω ψ_{x,y} = Λ ψ_{x,y} .    (3)

Since the highest value of α is desired, the highest value of Λ is also desired. The number of homogeneous equations (3) is equal to the number N of points (x,y) in the network, which is, therefore, the number of eigenvectors of the equations (3). For this algebraic problem it is known¹ that −1 < Λ < 1, and also that the set of eigenvectors is complete. A solution ψ_n will, of course, consist of N numbers ψ_{x,y}, one for each point of the network, and will correspond to an eigenvalue Λ_n.
Then a first approximate solution function ψ can be analyzed in terms of the eigenvectors ψ_n:

    ψ = Σ_{n=1}^N c_n ψ_n .    (4)

Operating with (ω − a)/(1 − a), where a is a real number such that −1 < a < 1,

    ((ω − a)/(1 − a)) ψ = Σ_{n=1}^N c_n ((Λ_n − a)/(1 − a)) ψ_n .    (5)

Thus, it is seen that, by performing this operation a number of times with various values of a, the amplitudes of higher eigenfunctions may be reduced as much as desired relative to that of the fundamental mode ψ₁, whose eigenvalue Λ₁ is usually nearly 1. The value of (Λ − a)/(1 − a), as a function of Λ and a, is shown in Figure 2. Flanders and Shortley¹ discuss the selection of values of a, concluding that the greatest efficiency is achieved by choosing the a values as the roots of the Tschebyscheff polynomial of order equal to the number of iterations it is desired to carry out.
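A sketch of the fundamental operator ω and of the shifted combination (ω − a)/(1 − a) in array form; the mask describing the region and the particular shift values are illustrative assumptions (the paper prescribes the roots of a Tschebyscheff polynomial).

```python
import numpy as np

def omega(psi, mask):
    """The fundamental operator: one quarter of the sum of the four nearest
    neighbors.  mask is 1 at network points and 0 on and beyond the zero
    boundaries (outer edge and hole), so missing neighbors contribute zero."""
    p = np.pad(psi * mask, 1)
    s = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
    return 0.25 * s * mask

def shifted_iteration(psi, mask, shifts):
    """Apply (omega - a)/(1 - a) for each shift a in turn: eigenfunctions with
    Lambda_n near a are suppressed, while the fundamental, with Lambda_1 near 1,
    is left almost unchanged."""
    for a in shifts:
        psi = (omega(psi, mask) - a * psi) / (1.0 - a)
    lam = np.sum(psi * omega(psi, mask)) / np.sum(psi * psi)  # Lambda estimate
    return psi, lam
```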

In performing this operation on the machines a card is provided for each point of the network. The two directions have been labeled x and y as shown in Figure 1; the x and y identification numbers are punched on each card. To run the cards through a machine consecutively in the y direction, they are sorted first on y, then x, and vice versa.
A new deck is used for each iteration. The sequence of
operations in an iteration is as follows:
1. By a gang punching and reproducing operation on
the reproducer with the cards running in the y direction, the new function and its y neighbors are
punched in the new deck from the old deck on which
the new function has just been calculated. Both
decks have been sorted on y, then x.
2. New deck is sorted on y.
3. Deck is run through the 602-A consecutively on x,
allowing the x neighbors to be read from cards
ahead and behind. Thus, the average of four neighbors can be calculated.
4. Deck is sorted on x.

5. Cards are listed to check for errors.
6. New function is reproduced and gang punched into still another deck, starting another iteration.

FIGURE 2. RESULT OF OPERATION WITH (ω − a)/(1 − a)
[Graph of (Λ − a)/(1 − a) against Λ for various a; not reproduced.]

It has been found convenient to apply this polynomial, assumed to be of even order with roots symmetrical about 0, as follows: Operate twice with the fundamental operator ω to obtain ω²ψ, and then form the linear combination [(ω² − a²)/(1 − a²)]ψ, thus using the roots ±a. Using this result, one again iterates twice with ω and repeats the process for another value of a, etc., until the roots are all used. It is desirable to use the roots in an order that will prevent high frequency oscillations from building up excessively, as the number of digits carried in the iterations may be exceeded at some points.
The remainder of this paper will be devoted to a discussion of the application of the fundamental operator ω to a function ψ. The problem is to compute the average of the four nearest neighbors for each point of the network. This is done simultaneously for all points of the network, and the resulting set of values is the new function ωψ. The network of points is shown in Figure 1. It is necessary to cover only half the square because of symmetry.

The following special cards are used and necessitate control wiring on the machines (Figure 1).
S, or successor, cards: For these the predecessor neighbor is eliminated. These occur next to the outer zero boundary, for which no card is provided.
2P cards: For these cards, which occur on the diagonal line of symmetry, the predecessor is substituted for the successor, which is not in the deck. That is, for the point (x_n, y_m) on the diagonal, on the y run one substitutes (x_n, y_{m−1}) (= x_{n+1}, y_m) for (x_n, y_{m+1}), and on the x run substitutes (x_{n−1}, y_m) (= x_n, y_{m+1}) for (x_{n+1}, y_m), thus obtaining the proper neighbors.
Blank cards: These are used for the zero boundaries of the hole. Punching is suppressed for them on the 602-A so that they provide zero neighbors for adjacent points of the network.
The operations in the reproducer run are shown in Figure 3. Note that the cards in the old deck must run one ahead of those in the new one. It is desirable to have auxiliary identification on one deck or the other so that identifications may be compared. The cards shown are at the end of one row and the beginning of another, illustrating the operation of the 2P and S controls in the y direction.
The programming of the 602-A is shown in Figure 4, and control panel wiring is shown in Figure 5, page 32. Here is formed 0.25 times the sum of the neighbors (denoted as Σψ).

FIGURE 3. REPRODUCER RUN (y)
[Card-by-card diagram of the old and new decks, with the 2P (y direction) reproducer control and the S (y direction) punch control; not reproduced.]

FIGURE 4. 602-A RUN (x direction)
[Program chart of the 602-A storage units, counters, and program steps, including the blank-card, 2P (x direction), and S (x direction) controls; not reproduced.]

FIGURE 5. CALCULATING PUNCH, TYPE 602-A, CONTROL PANEL
[Wiring diagram; not reproduced.]

The simplest means of calculating the eigenvalue Λ is simply to take the new total of the function for all points over the old total. This gives Λ = Σωψ / Σψ. A more accurate value of Λ for the fundamental when higher modes are present may be obtained by forming Σψ(ωψ) / Σψ².
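In array terms the two estimates might be written as follows (a sketch; psi and omega_psi are assumed to hold ψ and ωψ over the points of the network):

```python
import numpy as np

def lambda_estimates(psi, omega_psi):
    """Two estimates of Lambda from psi and omega(psi) over the network."""
    simple = omega_psi.sum() / psi.sum()                     # new total / old total
    quadratic = (psi * omega_psi).sum() / (psi ** 2).sum()   # weighted quotient
    return simple, quadratic
```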
Special IBM techniques described in this paper were developed by Mr. James Alexander of the Argonne National
Laboratory.
REFERENCE

1. D. A. FLANDERS and GEORGE SHORTLEY, "Eigenvalue Problems Related to the Laplace Operator," Seminar on Scientific Computation, November, 1949.

DISCUSSION

Professor Kunz: I would like to point out that there is some difference between this and Laplace's equation, in this sense: In Laplace's equation we have two items of error: (1) How close do we approximate the differential equation? And (2), how close are we to the solution of the difference equation? We have those two, plus the fact that the definition of α from the difference equation is not the definition of α that is in the differential equation. This may be seen by considering the simple case of the vibration, let's say, of a drumhead that is one by one. The eigenvalue in this case is 2π². The appropriate finite difference equations can be solved exactly. In fact, for any number of points taken, even though the number of interior points is only four or nine, the distribution is the same; the actual distribution. It is a sine sine distribution of the drumhead. So no error is made in approximating the differential equation by a difference equation, as far as the characteristic function is concerned. But the definition of α is now Δ²ψ/h²ψ, which is not the proper definition in terms of the differential equation.
I might mention in that connection that you can obtain
a much better result by not iterating further or taking
more points, but using simply a higher approximation to
the Laplacian.
Dr. Hummel: Yes; you do obtain a more accurate answer.
Mr. Turner: The difficulties in convergence in Dr. Yowell's method and Dr. Hummel's method are associated. As was pointed out, the characteristic solutions of the equation ∇²ψ + λψ = 0 form a complete system. Suppose the equation were ∇²ψ = 0 and in carrying out the numerical operations at some point ij, instead of getting zero, we had ε_ij. Then, I think, it is quite obvious that these ε_ij's, or the errors, can be composed of linear combinations of the eigenvectors associated with this problem. There are just as many ε's as there are points, and there are just as many eigenvectors as there are points in your numerical difference equation.

Therefore, when a set of errors, a set of residuals,
occurs, which forms a repetitive pattern, it turns out that
they are actually composed of a combination either dominated by one particular eigenvector or made up of a linear
combination of the eigenvectors corresponding to one
particular eigenvalue.
If we were to go through the operations (I have had it
happen in actual numerical calculations that I get a set of
residuals) and after operating on them, all we did was to
change the magnitude of the residuals but didn't succeed
in changing their distribution, then in that case it corresponds rather clearly. If we treat the set of simultaneous equations, that is, the matrix of coefficients, as purely a matrix operator, then what we have done is carry out the operation A ε = λ ε. That is practically the same thing as happened in Dr. Yowell's paper.
If we will carry out this very simple operation we may,
in one step, eliminate the dominant phase of these errors.
First, at each point we form the sum of the absolute values of the errors; call this sum A. We also take each of the errors, each of the residuals, and operate on it with our matrix of coefficients.
In other words, with the errors treated as the initial function ψ, let us suppose that the result of the operation equals λψ; let us call it some quantity v. Now we form a second sum; let us call this one B, which is equal to Σ(v|ε|/ε). This sum is to permit us to determine whether a particular eigenvector, which has perhaps both positive and negative signs, is dominant. Then an approximate value of the λ is B/A, because in the operation of A on ε to get v, we have multiplied it by the latent root of the matrix of coefficients, which happens to correspond to the dominant eigenvector or combination of eigenvectors having nearly the same latent root. Once we have found λ we can correct our original ψ. If we now have errors of any substantial amount, corresponding to the higher eigenvalues, this will produce a roughness corresponding to a small variation from point to point which the subsequent smoothing process will eliminate quite rapidly.
Mr. Kelly: Has anyone had any experience along these
lines of staying within the forced considerations of your
mesh? Has anyone observed any forced oscillations of the
type Dr. Hamming has found? We observed it staying
within the mesh by a factor of 10 to 1, and still observing
forced oscillations.
Dr. Hummel: This business of oscillation depends on the range of the eigenvalues, doesn't it? If you know what the range of the eigenvalues is, you can certainly choose the mesh in such a way that you won't get the oscillation.
You can always, of course, change your variables in such
a way that you don't get the oscillations. This has been
found true at least for the solution of the diffusion equation.

Dr. Alt: I have a question for Dr. Hummel in connection with the process of speeding up your convergence by this trick. The division by (1 − a) is not essential; that is just to bring the eigenvalues back into scale. But subtracting the constant a reminds me of something that I have seen in the literature that I am not sure is the same thing. It is in a paper by Aitken about 1937. It is the last paragraph of a very long paper, and is easily overlooked. What he mentions is this: Suppose A is a matrix and x is a vector, and that you are trying to solve the equation Ax − λx = 0 for the largest value of λ. If you replace the matrix A by A − aI, this matrix has the eigenvalues (λ − a). Some of the methods for finding the λ's converge with a speed which depends on the ratio of the largest to the second largest λ. We are trying to choose a so as to maximize that ratio. But, as you mentioned, you have to make sure that some of the smaller λ's don't become large in that process.
I did not hear what you said about getting around that; but there is an answer given by Aitken. Suppose your eigenvalues are λ₁, λ₂, ..., λ_n, and suppose they are all real and arranged in size here. What you want to subtract is the mean of the second largest and the smallest eigenvalue. It is important to choose it this way. After this, the second root and the smallest root become equal in size and opposite in sign, and the ratio of λ₁ to either of those is maximized. When the eigenvalues are complex, it is a little more complicated. But you can see geometrically what point you have to choose for a.
Dr. Hummel: That is essentially what I have in mind.


Dr. Alt: There is a very brief mention of this simple method by Aitken in Proceedings of the Royal Society of Edinburgh, 1936-37.

Professor K unz: I would like to point out an even
earlier work. There is an article by R. G. D. Richardson
in 1911, which is one of the earliest works on stresses in
a masonry dam. He considers the choice of a in quite some
detail. It has been disapproved of by those who did not
know of it.
Dr. Hamming: This point was discussed quite extensively at the last meeting. Flanders went a little bit further in his discussions than has been indicated here.
In the first place, you are not restricted to a linear expression. You can resort to polynomial expressions. What they apparently had done (Flanders, Shortley, and others) was to use the Legendre polynomials, which, as you know, have many roots spread out fairly low and rise sharply at the ends near 1, so that it multiplies this factor and keeps all the rest of the bounds down. In private conversation afterward, we pointed out that he should not have used Legendre polynomials but should have used Tchebyscheff's equal ripple polynomials. In that fashion you are not restricted to this. You simply form a polynomial combination of about the order you want, which would be a Tchebyscheff polynomial, to keep the function down over the whole range, to keep all these down while working only on your maximum; and the degree has to be restricted so that your eigenvalue is not caught over the first zero of the polynomial which you are using.

A Numerical Solution for Systems of Linear Differential
Equations Occurring in Problems of Structures
PAUL E. BISCH

North American Aviation, Incorporated

THE PROBLEMS of engineering in which such systems are found and which are successfully solved are:
Determination of natural modes of free oscillations of structures.¹
Determination of stresses in indeterminate structures.²
In general these problems cover all the variations of an actual structure; therefore, the classical solutions are impractical for the equations at hand.
There is only one variation in this solution between one class of problems and the other; so the method will be briefly sketched for the first class (oscillations) for which it is more extensive. Let the differential equation be

    A_n(x) dⁿy/dxⁿ + ⋯ + A₁(x) dy/dx + A₀(x) y = 0

where the A's are functions of x and may contain a characteristic number λ, or ω².
The method, which is presented in detail in reference 1 and reference 2, is briefly described here. The problem includes n boundary conditions, and when one boundary condition is used in the differential equation, another equation, called a secondary boundary condition, is obtained. There are n such equations. Altogether there are 2n boundary conditions.
The unknown y is then written

    y = Σ_{i=1}^s C_i Y_i(x)

where the C_i are factors to be determined and the Y_i(x) are polynomials in x which satisfy the n boundary conditions and the n secondary boundary conditions.
These polynomials Y_i(x) have the form

    B₁(i) x^{a_i+p} + ⋯ + B_{2n+1}(i) x^{a_i+p(n+1)}

where a_i and p are selected in a simple manner, and the B(i)'s are obtained from recurrence formulae obtained from the 2n boundary conditions. In general a_i and p are the same for the same type of problem, and the B(i)'s change very little with the coefficients of the differential equation. The polynomials for the many cases of bending and torsion oscillations are all to be found in reference 1.
If this approximate y and its successive derivatives are used in the differential equation, a function ε(x) is obtained which represents the error, as a correct y would make the left-hand side vanish.
The two boundaries are called x₁ and x₂, and one equation for the solution of the C_i's is obtained from

    ∫_{x₁}^{x₂} ε(x) Y_i(x) dx = 0 .

There are s such equations, and they are homogeneous in the C_i. They can be solved for s − 1 of them as functions of the sth, provided that the determinant of the coefficients of the C_i's vanishes. This condition provides an equation of the sth power of λ, the s roots of which are positive.
For any root λ_i there results a set of coefficients C_i and therefore an approximate solution y_i of the problem, or mode of oscillation.
This method has many advantages which cannot be pointed out here. It can easily be set up for tabular or IBM calculations. When the A_n(x) are random curves, the integrations can be rapidly made by increments on the IBM machines, thus making the method very general.
Its accuracy is very satisfactory. For instance, it is only necessary to make s = 3 in order to obtain the first two modes y₁ and y₂ with an accuracy consistent with the engineering problem.
It can also be said that the preceding integral equations happen also to satisfy the equation of least work for this first class of problems.
On account of its simple algebraic form it is possible, as shown in Sections I-T and I-B of reference 1, to solve in advance the problem for a large family of cases.
Other problems of structures which have the same type of differential equations but where the characteristic number λ has another meaning can be similarly solved.
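A minimal sketch of the kind of computation the method prescribes, using as a stand-in the simplest oscillation equation y″ + λy = 0 with y(0) = y(1) = 0 rather than the structural equations of reference 1; the trial polynomials Y_i(x) = x^i(1 − x) are assumptions chosen only to satisfy the boundary conditions. The conditions ∫ε Y_i dx = 0 become a determinant condition on λ whose roots approximate the modes.

```python
import numpy as np
from numpy.polynomial import Polynomial as P

def inner(p, q):
    """Exact integral over (0, 1) of the product of two polynomials."""
    r = (p * q).integ()
    return r(1.0) - r(0.0)

s = 3                                            # s = 3 suffices for two modes
x = P([0.0, 1.0])
Y = [x**i * (1 - x) for i in range(1, s + 1)]    # trial functions, zero at both ends

# epsilon(x) = sum_j C_j (Y_j'' + lam * Y_j); the conditions ∫ epsilon Y_i dx = 0
# give (K + lam * M) C = 0, whose determinant must vanish at the roots lam.
K = np.array([[inner(Y[i], Y[j].deriv(2)) for j in range(s)] for i in range(s)])
M = np.array([[inner(Y[i], Y[j]) for j in range(s)] for i in range(s)])
lam = np.sort(np.linalg.eigvals(np.linalg.solve(M, -K)).real)
print(lam[:2])                                   # approximately pi**2 and 4*pi**2
```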

THE SECOND CLASS covers two-dimensional problems of indeterminate structures. An important one is the determination of stresses in box structures as in reference 2. The key of this problem is the solution of a set of n linear differential equations such as

    A_m(x) d²y_m/dx² + B_m(x) dy_m/dx + C_{m−1}^m(x) y_{m−1} + C_m^m(x) y_m + C_{m+1}^m(x) y_{m+1} = D_m(x)

with m = 1, 2, ..., n. The A, B, and C are known functions of x, and the D's are known functions of x and of other independent variables.
This problem has 2n boundary conditions, 2 per equation. Moreover, when these conditions are used in the differential equations written at the boundaries x₁ and x₂, 2n secondary boundary conditions are obtained. This makes a total of 4n boundary conditions.
The unknown y_m is approximated by

    y_m = Y_m⁰(x) + Σ_{r=1}^s C_r^m Y_r^m(x)

as in the first class of problems.
The polynomials Y_m⁰(x) are made to satisfy completely the boundary conditions. Their form is

    Y_m⁰(x) = A_m⁰ + B_m⁰ x + C_m⁰ x² + D_m⁰ x³ .

The coefficients A, B, C, and D turn out to be easily computed as linear functions of the independent variables present in the previous D_m(x), and of the boundary values.
The polynomials Y_r^m(x) are set to satisfy the 4n boundary equations obtained when the right-hand terms made of independent variables and boundary values are made equal to zero. Their form is

    A_r^m x^{a_i+p} + B_r^m x^{a_i+2p} + C_r^m x^{a_i+3p} + D_r^m x^{a_i+4p}

where a_i and p are easy to determine, and the A, B, C, and D are found from recurrence formulae obtained from the 4n mentioned equations.
For the box structure problem of reference 2 the functions Y_r^m (r = 0, 1, 2, ..., s) can be found completely determined in this reference.
The coefficients C_r^m in y_m are determined as in the first class of problems. However, the ns integral equations obtained for the determination of the same number of coefficients are not homogeneous, and their direct solution gives these coefficients as linear forms of the independent variables of the problem.
Considerable simplification is obtained by the use of an auxiliary variable z which varies from 0 to 1 between boundaries.
For the problem of reference 2 it was found with sets of six linear differential equations that the solution checked closely the solution based on consideration of the minimum square errors obtained by using the integrals

    ∫_{x₁}^{x₂} ε_m (dε_m/dC_r) dx = 0 ,

although dε_m/dC_r is different from Y_r(x).
Moreover, these solutions check well the solutions obtained by classical methods for sets of 6 linear equations with constant coefficients, although s was taken as 1, which represents the simplest solution.
In general it will be found that these solutions, in addition to their satisfactory accuracy, are several times shorter than the classical solutions, even for the simplest cases of such systems.
The general solution of a given problem can often be carried algebraically up to the integrated form of the C equations, thus giving a compact solution which can be carried out simultaneously on IBM machines for several cases of the same problem in the time required for one case.
Finally, the writer believes that the same method could be successfully extended to linear partial differential equations, although he knows of no such application to have been made to date.

REFERENCES
1. North American Aviation Report NA-5811, "Method of determination of the frequencies, deflection curves and stresses of the first three principal oscillations in torsion and bending of aircraft tapered structures."
2. North American Aviation Report NA-48-310, "Determination of stress distribution and rigidity of multi-cell box structures."
3. Aero. Res. Committee Reports and Memoranda:
   R & M 1799, Approximation to Functions and to the Solutions of Differential Equations.
   R & M 1798, Galerkin's Method in Mechanics and Differential Equations.
   R & M 1848, The Principles of the Galerkin's Method.

Matrix Methods
KAISER S. KUNZ

Case Institute of Technology

WE SHALL START with a set of simultaneous linear equations as being something familiar, and I shall restrict my discussion to three equations in three unknowns x₁, x₂, and x₃. The generalization to larger sets should be clear. We can write the coefficients of these unknowns by using a single letter a with two subscripts; thus

    a₁₁x₁ + a₁₂x₂ + a₁₃x₃ = b₁
    a₂₁x₁ + a₂₂x₂ + a₂₃x₃ = b₂    (1)
    a₃₁x₁ + a₃₂x₂ + a₃₃x₃ = b₃ .

Because of the presence of the constants b₁, b₂, and b₃ (at least one of these is assumed different from zero), this set of equations is said to be nonhomogeneous.
If the determinant of (1)

        | a₁₁  a₁₂  a₁₃ |
    Δ ≡ | a₂₁  a₂₂  a₂₃ | ≠ 0 ,    (2)
        | a₃₁  a₃₂  a₃₃ |

then a solution exists and by Cramer's rule is

    x₁ = (1/Δ)(b₁A₁₁ + b₂A₂₁ + b₃A₃₁)
    x₂ = (1/Δ)(b₁A₁₂ + b₂A₂₂ + b₃A₃₂)    (3)
    x₃ = (1/Δ)(b₁A₁₃ + b₂A₂₃ + b₃A₃₃) .

Here A_ij is the cofactor of a_ij, i.e.,

    A_ij = (−1)^{i+j} Δ_ij    (4)

where Δ_ij, the minor of a_ij, is the determinant obtained from Δ by crossing out the row and the column in which a_ij occurs; thus, for example,

    Δ₁₂ = | a₂₁  a₂₃ |
          | a₃₁  a₃₃ | .

MATRICES AND MATRIX PRODUCTS

Introducing the concept of a matrix and of a matrix product, one can write (1) in the form

    ( a₁₁  a₁₂  a₁₃ ) ( x₁ )   ( b₁ )
    ( a₂₁  a₂₂  a₂₃ ) ( x₂ ) = ( b₂ )    (5)
    ( a₃₁  a₃₂  a₃₃ ) ( x₃ )   ( b₃ )

or simply as

    a x = b ,

where a, x, and b are termed matrices and stand, respectively, for the sets of numbers included in the parentheses of (5). A matrix is conceived of as a complex of all the numbers in the parentheses; thus, if any of these numbers is changed, the matrix is changed.
Two matrices are equal only if they have the same number of rows, the same number of columns, and corresponding elements (numbers) are equal. Matrices of the type of x and b are called column matrices or vectors. In order that (5) be equivalent to (1), the product of the matrix a and the vector x must be the vector

    ( a₁₁x₁ + a₁₂x₂ + a₁₃x₃ )
    ( a₂₁x₁ + a₂₂x₂ + a₂₃x₃ )    (6)
    ( a₃₁x₁ + a₃₂x₂ + a₃₃x₃ ) .

This may be summarized by saying that the elements of the product are

    (a x)_i = Σ_k a_ik x_k .    (7)

In the same manner, the solutions x₁, x₂, x₃ given by (3) can be expressed by the matrix equation

    x = a⁻¹ b ,    (8)

where the square matrix a⁻¹ has elements

    a_ij⁻¹ = A_ji / Δ .    (9)

The matrix a⁻¹ is called the inverse of the matrix a. The numerical determination of the elements of the inverse of a given matrix is one of the important problems of numerical analysis.
Another basic problem is the finding of the product of two matrices when the number of rows and/or columns is large. The product of a matrix A by a matrix B can be taken only if the number of columns of A, say p, is equal to the number of rows of B. If this condition is met, the product is a matrix C having the same number of rows as A, say m, and the same number of columns as B, say n. The elements of C are

    C_ij = Σ_{k=1}^p A_ik B_kj    (10)

where i = 1, 2, ..., m and j = 1, 2, ..., n. Clearly (7) is a special case, n = 1, of this rule.

A matrix having n rows and n columns is termed a square
matrix of degree n. It is easily verified that the product of a
square matrix and a vector involves n 2 multiplications, and
the product of two square matrices of nth degree requires n 3
multiplications.
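Rules (7) and (10) transcribe directly into code; a small sketch, with the multiplication counts of the text noted in comments:

```python
def mat_vec(a, x):
    """(ax)_i = sum_k a_ik x_k, rule (7): n*n multiplications for degree n."""
    return [sum(a_ik * x_k for a_ik, x_k in zip(row, x)) for row in a]

def mat_mat(A, B):
    """C_ij = sum_k A_ik B_kj, rule (10): n**3 multiplications for two square
    matrices of degree n.  Requires columns of A = rows of B."""
    assert all(len(row) == len(B) for row in A)
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]
```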

NUMERICAL SOLUTION OF SIMULTANEOUS LINEAR EQUATIONS AND MATRIX INVERSION

The numerical solution of a set of simultaneous linear equations by means of (3) is usually thoroughly impractical when the number of equations n is of the order of ten or more. This is due to the excessively large number of multiplications required, of the order of (n + 1)!.

[...] in the same way, thereby obtaining a 1 in place of the a′₂₂ and a zero in place of the a′₃₂. It is possible, then, to drop consideration of the second row. It is clear that by continuing this process the matrix can be reduced to the form (12). [...]

The extension of this method to the inverse of a matrix is not very difficult. One simply makes provision for keeping the b's separated, instead of allowing them to add together as one proceeds.
There are a great many methods which are closely related to the elimination method. Although some of these are very good for certain purposes, such as the Crout method for a desk-type calculator, I shall lump them with the elimination method. The methods I shall treat are chosen mainly for their interest.

[...] These equations result from multiplying the matrices S′ and S in (17) and equating corresponding elements on the two sides of that equation.
Having the s_ij, (18) can be solved, very easily, for the k_i and (19) for the x_i, since these equations are in diagonal form. These steps require only about 2n² multiplications. The total number of multiplications for the whole process is approximately (1/6)n³.

This method requires, therefore, only about half as many
operations as the elimination method; however, it is applicable only to symmetric matrices. While the taking of a
square root is not usually as simple as a multiplication or a
division, the process requires only n square roots and n
divisions, which are negligible for sizeable n. The coding of
this for automatic calculators has been tried, I believe. (Some difficulty in this respect arises from the need, at times, to take a square root of a negative number, thus introducing i = √−1 into our computations. The numbers, however, are either real or pure imaginary.)

Iterative Methods
The next method I would like to consider is an iterative method, a method of successive approximation, sometimes called the Gauss-Seidel method. In discussing this
method I should like to treat a particular example. Again,
I shall restrict myself to three equations in three unknowns.
The particular equations I shall consider are:

    25x₁ + 2x₂ + x₃ = 69
    2x₁ + 10x₂ + x₃ = 63    (21)
    x₁ + x₂ + 4x₃ = 43 .
I have used a mathematician's prerogative to deal with
nice round numbers. Seldom will you be called upon to
work with such convenient numbers.
Often new methods are developed by starting with very bold approximations. Here we shall initially assume that all the coefficients that are less than 4 are small enough to be neglected. This makes it possible to write down at once, as a zeroth approximation,

    25x₁^(0) = 69
    10x₂^(0) = 63    (22)
    4x₃^(0) = 43 .
The equations (21) are approximated, thus, by a set of
equations in diagonal form.
This brings up a problem in terminology. Does one start
with a zeroth or a first approximation? Well, I have adopted the following answer. You normally employ the designation first approximation, except when making a wild guess that cannot properly be justified; then it is a zeroth approximation. Surely in this case it is a zeroth approximation, which shall be designated by the matrix

             ( x₁^(0) )   (  2.76 )
    x^(0) =  ( x₂^(0) ) = (  6.3  )    (23)
             ( x₃^(0) )   ( 10.75 )
For purposes of comparison, the correct answer is

        ( 2 )
    x = ( 5 )    (24)
        ( 9 )

Surprisingly enough, the zeroth approximation gives the right order of magnitude for the x's.
Now, having some idea of the size of the individual x's, we can go back to (21) and correct for the off-diagonal terms. Thus, the first approximation is written:

    25x₁^(1) = 69 − 2x₂^(0) − x₃^(0)
    10x₂^(1) = 63 − 2x₁^(0) − x₃^(0)    (25)
    4x₃^(1) = 43 − x₁^(0) − x₂^(0) .

The right-hand sides are known; so x^(1) can be found, which turns out to be

            ( 1.826 )
    x^(1) = ( 4.673 )
            ( 8.485 ) .

This still differs considerably from the answer given in
(24), but progress is being made.
Having a better answer for %, we are in position to make
a still better estimate of the correction terms in (25);
therefore, we can obtain a better approximation %(2). This
process can be repeated as often as desired. Thus, in
general
    25x₁^(i+1) = 69 − 2x₂^(i) − x₃^(i)
    10x₂^(i+1) = 63 − 2x₁^(i) − x₃^(i)    (26)
    4x₃^(i+1) = 43 − x₁^(i) − x₂^(i)

and these equations can be written

    ( x₁^(i+1) )   (  2.76 )   (  0      −0.08   −0.04 ) ( x₁^(i) )
    ( x₂^(i+1) ) = (  6.3  ) + ( −0.2     0      −0.1  ) ( x₂^(i) )    (27)
    ( x₃^(i+1) )   ( 10.75 )   ( −0.25   −0.25    0    ) ( x₃^(i) )

At the sixth approximation, we have

            ( 2.0002 )
    x^(6) = ( 5.0004 )
            ( 9.0006 ) .

These values are close enough so that we do not need to
apologize for them, and, clearly, the iteration can be continued with consequent improvement of the results, as long
as desired. This is a Gauss-Seidel process. Whether the
process converges depends on how large the diagonal terms
are, compared to the off-diagonal terms.
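The iteration (26)-(27) runs directly; a short sketch that reproduces x^(6) above:

```python
x = [2.76, 6.3, 10.75]                     # x(0), the diagonal solution (22)
for _ in range(6):
    x = [(69 - 2 * x[1] - x[2]) / 25,      # 25 x1 = 69 - 2 x2 - x3
         (63 - 2 * x[0] - x[2]) / 10,      # 10 x2 = 63 - 2 x1 - x3
         (43 - x[0] - x[1]) / 4]           # 4 x3 = 43 - x1 - x2
print(x)   # approaches (2, 5, 9); after six steps, about (2.0002, 5.0004, 9.0006)
```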
A slightly different technique may also be employed. Instead of pruning off all the non-diagonal terms, just those
terms above (or below) the diagonal may be removed. This
reduces the equations for the zeroth approximation to triangular form. To solve these, only n² multiplications are needed instead of the (1/3)n³ needed for the original equations.
This method is due to Morris. He showed that if the matrix of the coefficients a is positive semidefinite, which means that the associated quadratic form is either positive or zero, then the process converges. The iteration equations here are

    25x₁^(i+1) = 69 − 2x₂^(i) − x₃^(i)
    2x₁^(i+1) + 10x₂^(i+1) = 63 − x₃^(i)    (28)
    x₁^(i+1) + x₂^(i+1) + 4x₃^(i+1) = 43 .


Having the ith approximation of all three x's, the (i+1)th approximation of x₁ is obtained from the first equation. Then, knowing x₁^(i+1), the second equation can be solved for x₂^(i+1) and thereafter the third equation for x₃^(i+1).
Going back to the question of the convergence of the iteration equations (26), this equation can be written in matrix form as

    ( 25   0   0 ) ( x₁^(i+1) )   ( 69 )   ( 0  2  1 ) ( x₁^(i) )
    (  0  10   0 ) ( x₂^(i+1) ) = ( 63 ) − ( 2  0  1 ) ( x₂^(i) )    (29)
    (  0   0   4 ) ( x₃^(i+1) )   ( 43 )   ( 1  1  0 ) ( x₃^(i) )

This in turn may be written

    E x^(i+1) = b − H x^(i) ,    (30)

where E and H are the square matrices shown. Note that this latter equation could be obtained directly from equations (21), which can be written in the matrix form A x = b, by observing that A = E + H.
If both sides of (30) are multiplied by the inverse of E,

           ( 0.04   0      0    )
    E⁻¹ =  ( 0      0.10   0    ) ,
           ( 0      0      0.25 )

we obtain the equation

    x^(i+1) = E⁻¹ b − E⁻¹ H x^(i) .    (31)

Now, since E⁻¹ b = x^(0), the solution of the diagonal equations E x^(0) = b given in (22), and letting F = −E⁻¹ H, equation (31) can be written

    x^(i+1) = x^(0) + F x^(i) .    (32)

Written out, this is just equation (27).
In particular

    x^(1) = x^(0) + F x^(0) = (1 + F) x^(0)
    x^(2) = x^(0) + F x^(1) = (1 + F + F²) x^(0)    (33)
    ⋯
    x^(n) = x^(0) + F x^(n−1) = (1 + F + F² + ⋯ + Fⁿ) x^(0) ;

thus, the convergence of the process reduces to the question of the convergence of the series

    Σ_{i=0}^∞ Fⁱ x^(0) .    (34)

The latter can be shown to require that the characteristic values of F, in absolute value, be all less than unity. These characteristic values are the possible values of the constant λ in the equation

    F ψ = λ ψ ,    (35)

where

        ( c₁ )
    ψ = ( c₂ )
        ( c₃ )

is any vector chosen so as to satisfy (35). Such a vector is called a characteristic vector.
Generally there will be three ψ's and three corresponding λ's that satisfy this equation.
Thus, in place of (35) we may write

    F ψᵢ = λᵢ ψᵢ ,  i = 1, 2, 3 .    (36)

For n equations the number, of course, will be n instead of 3.

I shall not prove the above requirement for convergence, except in the following plausible way. Let us assume that the characteristic vectors ψ₁, ψ₂, ψ₃ are three linearly independent vectors, so that x^(0) can be expanded in terms of them,

    x^(0) = k₁ ψ₁ + k₂ ψ₂ + k₃ ψ₃ ,    (37)

where the k's are suitable constants. Then

    F x^(0) = k₁ F ψ₁ + k₂ F ψ₂ + k₃ F ψ₃ ,

or by (36)

    F x^(0) = k₁ λ₁ ψ₁ + k₂ λ₂ ψ₂ + k₃ λ₃ ψ₃ .

Repeated application of F to x^(0), therefore, gives

    Fⁱ x^(0) = k₁ λ₁ⁱ ψ₁ + k₂ λ₂ⁱ ψ₂ + k₃ λ₃ⁱ ψ₃ ,

and hence, if

    |λᵢ| < 1 , for i = 1, 2, and 3,
    lim_{i→∞} Fⁱ x^(0) = 0 .

Moreover, if |λ₁| > |λ₂| and |λ₁| > |λ₃| and n is sufficiently large, F^(n+1) x^(0) ≈ k₁ λ₁^(n+1) ψ₁ ≈ λ₁ Fⁿ x^(0). Thus, the series (34), as far as convergence is concerned, acts like a geometric series with a ratio given by λ₁. Since |λ₁| < 1, this series should converge.

Finding Characteristic Values of Matrices
The above discussion of the iteration methods points up the need to find the characteristic values λᵢ for a matrix. This fundamental problem is very interesting. Let me make several observations concerning it.
Let F again be the matrix under consideration, but let it be used to represent any matrix being studied. We require the characteristic values λᵢ of equations (35) and (36). For this purpose, let us introduce the unit matrix,

        ( 1  0  0 )
    I = ( 0  1  0 ) ;    (38)
        ( 0  0  1 )

then (35) can be written

    (F − λI) ψ = 0 .    (39)

In expanded form, (39) is written as follows:

    ( F₁₁−λ   F₁₂     F₁₃   ) ( c₁ )
    ( F₂₁     F₂₂−λ   F₂₃   ) ( c₂ ) = 0 .    (40)
    ( F₃₁     F₃₂     F₃₃−λ ) ( c₃ )

Equation (40) represents three homogeneous linear equations for the components c₁, c₂, and c₃ of ψ.
A solution of these equations, other than the trivial solution c₁ = c₂ = c₃ = 0, is possible only if the determinant of the coefficients is zero, that is

           | F₁₁−λ   F₁₂     F₁₃   |
    D(λ) = | F₂₁     F₂₂−λ   F₂₃   | = 0 .    (41)
           | F₃₁     F₃₂     F₃₃−λ |

Equation (41) leads to a polynomial equation of the nth degree in λ (here n = 3); therefore, there are at most n characteristic values.
One of the methods for finding the largest of the characteristic values, say λ₁, is indicated by our previous discussion. If ψ is any initial vector, which can be expanded in
terms of the characteristic vectors, as %(0) was in (37),
then repeated application of F leads to a vector that is
nearly 1/11 times a constant. \Vhen one reaches this point, any
further multiplication by F multiplies this vector by AI'
essentially. There are several methods based on this fact.
Another method, one with which you may not be familiar, is to solve for the roots of D(λ) directly by the use of the method of false position; that is, we substitute some value of λ in the determinant and evaluate the determinant numerically. We repeat this for several neighboring λ's. This gives us a few points on the plot of D(λ) versus λ. If D(λ) has opposite signs for two values λₐ and λᵦ, then, since D(λ) is a polynomial and hence continuous in λ, it must be zero somewhere between these values. The method of false position estimates this value by assuming a linear variation of D(λ) between λₐ and λᵦ, and is conveniently coded for automatic computation.

Since each evaluation of D(λ) requires evaluating a determinant of the nth order, and this requires (1/3)n³ multiplications, the process is open to serious objections. If it is at all feasible, it is desirable to evaluate the coefficients of powers of λ in D(λ), since once this is done, the task of obtaining D(λ) for some value of λ is reduced to n multiplications. The great advantage of methods of this sort is that all of the characteristic values can be evaluated, at least in principle, and not just the largest.
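A minimal Python sketch of the false-position procedure (ours; numpy's determinant routine stands in for the hand or card evaluation, and the matrix and bracketing values are illustrative):

```python
import numpy as np

F = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

def D(lam):
    # Evaluate the determinant of (F - lambda I) numerically.
    return np.linalg.det(F - lam * np.eye(len(F)))

# Two neighboring lambda's where D changes sign bracket a root.
lam_a, lam_b = 4.0, 5.5
assert D(lam_a) * D(lam_b) < 0

# False position: assume D varies linearly between lam_a and lam_b,
# keeping whichever endpoint preserves the sign change.
for _ in range(60):
    lam_c = lam_a - D(lam_a) * (lam_b - lam_a) / (D(lam_b) - D(lam_a))
    if D(lam_a) * D(lam_c) < 0:
        lam_b = lam_c
    else:
        lam_a = lam_c

print(lam_c)   # a characteristic root of F between 4.0 and 5.5
```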


I shall close by pointing out one of the simple ways in which one may obtain an upper bound for the absolute value of the largest characteristic value λ₁. This can be accomplished by considering equation (40), which must be satisfied. For the moment, assume that |c₁| is the largest of the three numbers |c₁|, |c₂|, and |c₃|, the absolute values of the components of the characteristic vector ψ corresponding to λ₁. Then from the first equation arising from (40), we have

$$\lambda_1 c_1 = F_{11} c_1 + F_{12} c_2 + F_{13} c_3$$

or

$$|\lambda_1| \le |F_{11}| + |F_{12}|\,\frac{|c_2|}{|c_1|} + |F_{13}|\,\frac{|c_3|}{|c_1|} \le |F_{11}| + |F_{12}| + |F_{13}|\,.$$

This is just the sum of the absolute values of the elements of the first row of F.

If |c₂| is the largest of the constants, it can be shown, from the second equation, that |λ₁| is less than the sum of the absolute values of the elements in the second row. Likewise, for |c₃| largest, the absolute values of the elements in the third row are summed. Without making any assumptions as to the relative sizes of the |cᵢ|, the following rule can nevertheless be stated: if the absolute values of the elements of each individual row are summed and the largest sum chosen, no characteristic value can exceed this sum in absolute value. This upper limit is very helpful.
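In modern terms this is the row-sum (Gershgorin-type) bound. A one-line Python check of the rule, using the same illustrative matrix as above (our sketch, not from the seminar):

```python
import numpy as np

F = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# Largest row sum of absolute values: an upper bound on |lambda_1|.
bound = np.abs(F).sum(axis=1).max()
print(bound, max(abs(np.linalg.eigvals(F))))   # bound >= |lambda_1|
```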
There are, of course, still other schemes for determining
an upper bound on the size of the characteristic values.

Inversion of an Alternant Matrix

BONALYN A. LUCKEY

General Electric Company

AN ALTERNANT MATRIX is of the form shown in Figure 1. It is a square matrix of order N in which the elements of row i are the increasing powers, from 0 to N − 1, of Aᵢ. Thus the elements of the first column are all one.

According to Aitken,¹ the reciprocal of such a matrix can be written down. The elements of columns 1, 2, and N are shown in Figures 2, 3, and 4, respectively.

$$\begin{pmatrix}
1 & A_1 & A_1^2 & \cdots & A_1^{N-1}\\
1 & A_2 & A_2^2 & \cdots & A_2^{N-1}\\
1 & A_3 & A_3^2 & \cdots & A_3^{N-1}\\
\vdots & \vdots & \vdots & & \vdots\\
1 & A_N & A_N^2 & \cdots & A_N^{N-1}
\end{pmatrix}$$

FIGURE 1

[Figures 2, 3, and 4, giving the elements of columns 1, 2, and N of the reciprocal matrix, are too garbled in this copy to reproduce in full. Each element of column i carries the common denominator (Aᵢ − A₁)(Aᵢ − A₂) ⋯ (Aᵢ − A_N), the factor (Aᵢ − Aᵢ) being omitted; the numerator of the row-j element is, up to sign, the sum of the products of the A's other than Aᵢ taken N − j at a time.]


Note that all of the denominators in any one column are identical. The numerators are formed by taking various combinations of the values of A_N, A_{N−1}, A_{N−2}, ..., A₁. The forms for denominators and numerators are shown:

Denominator:

$$D_i = \prod_{\substack{r=1 \\ r \neq i}}^{N} (A_i - A_r)\,, \qquad i = \text{column index}, \qquad \prod = \text{product of indicated quantities}.$$

Numerator: terms taken N − 1, N − 2, ..., 3, 2, 1, 0 at a time for rows 1, 2, 3, ..., N, respectively. The algebraic signs of the terms in the even-numbered rows, counting from the bottom up, are negative.

It might be of interest to consider the number of terms in any numerator. This can be done by finding the number of combinations of (N − 1) things taken (N − j) at a time,

$$_{N-1}C_{N-j} = \frac{(N-1)!}{(N-j)!\,\left[(N-1)-(N-j)\right]!}\,, \qquad N = \text{order of matrix}, \quad j = \text{row index}.$$

For example, in a matrix of order 13, the numerator of the sixth row would contain 792 terms.

If ΔA is constant and positive, where ΔA is the difference between successive terms of column 2 of the alternant matrix, the denominators can be simplified. Writing δ = ΔA = Aⱼ − A_{j−1},

$$D_1 = (-1)^{N+i}\,(N-1)!\;0!\;\delta^{N-1}$$
$$D_2 = (-1)^{N+i}\,(N-2)!\;1!\;\delta^{N-1}$$
$$D_3 = (-1)^{N+i}\,(N-3)!\;2!\;\delta^{N-1}$$
$$\cdots$$
$$D_{N-1} = (-1)^{N+i}\,1!\;(N-2)!\;\delta^{N-1}$$
$$D_N = (-1)^{N+i}\,0!\;(N-1)!\;\delta^{N-1}$$

D₁, D₂, D₃, ..., D_N are the denominators of columns 1, 2, 3, ..., N, respectively.

To facilitate the use of IBM punched card machines, the
solution was written in a different form. This eliminates
finding as many products and combinations of products as
in previous forms. The first, second, and Nth columns for
the inverse matrix are shown in Figures 5, 6, and 7.

[Figures 5, 6, and 7, showing the rewritten first, second, and Nth columns of the inverse, are too garbled in this copy to reproduce. In this form the first-row element of column i is P/AᵢDᵢ, and each succeeding row is built up from nested, progressively accumulated sums of the reciprocals 1/A₁, 1/A₂, ..., 1/A_N, so that no further products of the A's themselves need be formed.]

P = A₁A₂A₃ ⋯ A_N. D₁, D₂, ..., D_N are the denominators of columns 1 to N, respectively. For the actual calculation procedure the values of

P;
D₁, D₂, D₃, ..., D_N;
1/A₁, 1/A₂, 1/A₃, ..., 1/A_N;
P/A₁D₁, P/A₂D₂, ..., P/A_N D_N

are calculated first. Note that the last values are the elements of the first row of the reciprocal matrix.

N decks of cards are made up, each containing N cards. The decks are made up from a circular arrangement of the values of 1/A_N, 1/A_{N−1}, ..., 1/A₁. The last card in each deck is 1/Aᵢ replaced by P/AᵢDᵢ. These cards also contain the values offset gang punched on the following cards, as shown below.
[The deck layouts for the 1st, 2nd, 3rd, and Nth columns, together with the products for the first column, are garbled in this copy. Each deck lists the reciprocals in circular order (for the first column: 1/A_N, 1/A_{N−1}, ..., 1/A₂, with P/A₁D₁ as the last card) and, offset by one card, the progressively accumulated sums 1/A_N, 1/A_N + 1/A_{N−1}, 1/A_N + 1/A_{N−1} + 1/A_{N−2}, ....]

The first group of values is used as multiplier, while the second group is used as multiplicand after being progressively accumulated. The last card in the deck contains the element of the second row. For each successive row, the first group of values is reproduced, and the products are offset gang punched on the following cards. This process is done N − 1 times. Note that each time the process is completed, the number of cards in each deck with products other than zero decreases by one, until, for the decks of the last row, the only card containing a product will be the last card, which is the P/A_N D_N card.
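To make the arithmetic concrete, here is a small Python sketch (ours, not the punched card routine itself; the values of A are illustrative) that forms P, the denominators Dᵢ, and the first-row quantities P/AᵢDᵢ for a small alternant matrix, checking them against a directly computed inverse. The factor (−1)^(N−1) accounts for the alternating-sign convention the paper carries separately:

```python
import numpy as np

A = np.array([1.0, 2.0, 3.0, 4.0])       # the values A_1, ..., A_N
N = len(A)
V = np.vander(A, N, increasing=True)     # row i: 1, A_i, A_i^2, ..., A_i^{N-1}

P = np.prod(A)                           # P = A_1 A_2 ... A_N
# D_i = product over r != i of (A_i - A_r), the common column denominator
D = np.array([np.prod([A[i] - A[r] for r in range(N) if r != i])
              for i in range(N)])

first_row = P / (A * D)                  # the quantities P / (A_i D_i)
print(np.allclose((-1) ** (N - 1) * first_row, np.linalg.inv(V)[0]))  # True
```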
REFERENCE

1. A. C. AITKEN, Determinants and Matrices (London: Oliver and Boyd, 1945).

Matrix Multiplication on the IBM Card-Programmed
Electronic Calculator

JOHN P. KELLY

Carbide and Carbon Chemicals Corporation

THE ONLY TYPE of calculation to be considered here is matrix multiplication. Time has not permitted any concrete work to be done on matrix inversion, but I shall have a few comments to offer on this later. In matrix work there are only simple arithmetic operations. This multiplication demands only two steps:

Operation 1: This is a shifting from electronic storage A (FS 1-2) into the electronic counter. For this problem, channel A is permanently connected to FS 1-2 (assigned) and channel B to FS 3-4 (assigned).

Operation 2: This is an 8-digit by 8-digit multiplication with the results in the electronic counter.

The problem to be considered is the following matrix multiplication:

$$\begin{pmatrix} a_{11} & a_{12} & \cdots\\ a_{21} & a_{22} & \cdots\\ \vdots & \vdots & \end{pmatrix}\begin{pmatrix} b_{11} & b_{12} & \cdots\\ b_{21} & b_{22} & \cdots\\ \vdots & \vdots & \end{pmatrix}$$

The general term of the product matrix takes on the following form:¹

$$c_{ij} = \sum_k a_{ik}\, b_{kj}\,;$$

that is, each term is the scalar product of a row and a column vector. There are two practical methods of calculation. One is to perform all the multiplications involving one row of the left-hand matrix; this generates an entire row of the product matrix. The other is to perform all the multiplications involving one column of the right-hand matrix, thus obtaining an entire column of the product matrix. The latter has been chosen for reasons which will become obvious.

Calculations

The machine is loaded with the elements of one column of the right-hand matrix, the instructions for which will be found in Table I. The zero instructions for channels A and B indicate card reading. In Operation 1, as mentioned above, the quantity in FS 1-2 is shifted to the counter. The instruction for channel C is the code number for the storage register or accounting machine counter in which the indicated item will be held. The problem considered here is simple enough so that no shift is required. The elements of rows 1-5 are stored in the accounting machine counters, using an adding code of 7.

The entire bank 1 and all but one register in bank 2 of the mechanical storage (941) have been used, in addition to 5 accounting machine counters. Of the two remaining counters, number 1 is too small and number 2 will be used to accumulate totals. Thus, the method to be used here can be used for multiplication when the right matrix contains 21 or fewer rows. For larger matrices, elements containing fewer digits might be used, with the registers split.

Table II shows the general layout for the deck of cards to be used, and a detailed description of the clearing cards. Table III contains the description of one row of the left-hand matrix A. Each element of this matrix is card read over channel A. The corresponding element from the right-hand matrix B is called from storage on channel B. The operation 2 is multiplication, with the 72 in channel C adding the product in counter group 2.

It should be pointed out that, with the exception of the clearing cards, only one row and one column have to be programmed. All other rows or columns, as the case may be, take on the same form.

The clearing cards can be eliminated by changing the channel B coding of the last row of the A matrix to counter read out and reset (8), instead of just read out (7). The deck to be run through consists of column 1 of matrix B and all of matrix A; column 1 of the product matrix C will be obtained. The row number of matrix A is used as a control field to clear counter 2. The products that make up c₁₁ will be listed; however, all other elements will be tabulated.
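The column-at-a-time scheme is easy to mimic in a short Python sketch (ours, with small illustrative matrices): load one column of B, run all of A past it in row order, and accumulate c_ij, clearing the accumulator at each row-number break just as the control field does on the machine:

```python
import numpy as np

A = np.arange(1.0, 10.0).reshape(3, 3)       # left-hand matrix, read row by row
B = np.arange(9.0, 0.0, -1.0).reshape(3, 3)  # right-hand matrix
n = 3

C = np.zeros((n, n))
for j in range(n):              # one deck per column of B
    storage = B[:, j]           # column j loaded into storage
    for i in range(n):          # matrix A passes in row order
        counter = 0.0           # counter cleared at the row break
        for k in range(n):      # each a_ik card multiplies b_kj
            counter += A[i, k] * storage[k]
        C[i, j] = counter       # tabulated element of column j of C

print(np.allclose(C, A @ B))    # True
```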

THE STANDARD methods of matrix inversion-any of the
elimination methods to a triangle or to a diagonal directly
-involve approximately eight to sixteen thousand intermediate results, depending on whether you carry the unit
matrix in your calculation.

47

48

COMPUTATION

Let's take for granted that we do not carry it along. We
will just carry along the 400 elements of the 20 × 20 matrix in the inversion. Even this involves 8,000 summary punchings, which take in the neighborhood of 1.5 seconds each as compared to a 0.4 second card cycle. The obvious way is to avoid taking intermediate results out as much as possible.
One way of accomplishing this is through the use of the
enlargement method. To review this: one starts with the
inverse of the upper left-hand element, which can be enlarged to the inverse of the upper 2 X 2 matrix through
simple multiplications, additions, and subtractions. It is a
function of the inverse of the single element, the additional
column, row, and diagonal element as shown below:
$$\begin{pmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{pmatrix}^{-1} = f\left(a_{11}^{-1},\, a_{12},\, a_{21},\, a_{22}\right).$$

TABLE II
CLEARING CARDS

Card No.   A
1          81
2          82
3          83
4          84
5          85
6          86
7          87

GENERAL DECK
(Obtains jth column of product matrix)
1. Seven clearing cards
2. jth column of B (20 cards)
3. Matrix A in row order (400 cards)
4.* 2 blank cards

This enlargement then proceeds, increasing the matrix by 1 row and 1 column in each step. Another alternative is the use of second order enlargement, which increases the order by 2 in each step. This general method appears to have several advantages; summary punching and machine card passes are reduced considerably, and the problem of significant digits can be avoided by checking the intermediate inverses and iterating them for greater accuracy if necessary.

*These are necessary to print the last result when an intermediate control break is used.

TABLE I
RIGHT-HAND MATRIX COLUMN CODE

Row   Column   A    Oper.   B    C      A-Entry
1     j        00   1       00   73*    b1,j
2                                74     b2,j
3                                75     b3,j
4                                76     b4,j
5                                77     b5,j
6                                11     b6,j
7                                12     b7,j
8                                13     b8,j
9                                14     b9,j
10                               15     b10,j
11                               16     b11,j
12                               17     b12,j
13                               18     b13,j
14                               21     b14,j
15                               22     b15,j
16                               23     b16,j
17                               24     b17,j
18                               25     b18,j
19                               26     b19,j
20                               27     b20,j

All blank spaces indicate the same entry as that used in the first line.
*This first card channel C clears the 604 electronic counter.

TABLE III
LEFT-HAND MATRIX ROW CODE

Row   Column   A    Oper.   B    C     A-Entry
i     1        00   2       73   72    ai,1
      2                     74         ai,2
      3                     75         ai,3
      4                     76         ai,4
      5                     77         ai,5
      6                     11         ai,6
      7                     12         ai,7
      8                     13         ai,8
      9                     14         ai,9
      10                    15         ai,10
      11                    16         ai,11
      12                    17         ai,12
      13                    18         ai,13
      14                    21         ai,14
      15                    22         ai,15
      16                    23         ai,16
      17                    24         ai,17
      18                    25         ai,18
      19                    26         ai,19
      20                    27         ai,20

All blank spaces indicate the same entry as that used in the first line.

REFERENCE

1. KAISER S. KUNZ, "Matrix Methods," pages 37-42.

Machine Methods for Finding Characteristic
Roots of a Matrix*

FRANZ L. ALT

Computation Laboratory, National Bureau of Standards

THE PURPOSE of this paper is to describe a few expedients which can be applied to computation of characteristic roots of matrices by means of punched card machines. In the course of two problems of this kind, recently handled by the Computation Laboratory of the National Bureau of Standards, some of these methods or variants of methods were actually tried out on cards, and some others were considered and laid out without actually being carried through. In both cases the general type of method used was suggested by the originator of the problem.

*The preparation of this report was sponsored by the Office of Air Research, USAF.

THE FIRST of these examples was of the conventional type: given a matrix A of order n (in the example, n = 14), with elements a_ik representing the approximate distribution of elastic forces in an idealized airplane wing, to find the three characteristic roots with greatest moduli. For finding the dominant root (i.e., the one with greatest modulus) there is the common method of starting with a trial vector y^(0) = (y₁^(0), y₂^(0), ..., y_n^(0)) and multiplying it repeatedly by the given matrix. Thus, y^(k) = A y^(k−1). This method is described, e.g., by Frazer, Duncan and Collar.¹ It is an excellent method for punched card machines, since the multiplication of a vector by a matrix can be carried out very simply. One minor trouble that arises is that after repeated multiplication the numbers fall outside the range of decimal digits which had been allotted to them on the machine. This is prevented by "norming" the vector after each multiplication by the matrix. We accomplished the norming by making the last component of the vector equal to 1 after each step. Thus (with a slight change in notation) we set

$$\bar y^{(k)} = A\, y^{(k-1)}\,, \qquad y^{(k)} = \frac{1}{\bar y_n^{(k)}}\,\bar y^{(k)}\,.$$

This method would fail in case the matrix were such that the last component of the characteristic vector happens to be very small compared to the other components. Obviously, other norming methods could be used which avoid this failure. However, it seems preferable to have a simple method, which fails once in a hundred cases but saves work in the remaining 99 cases. With this norming convention, the factors ȳ_n^(k) converge to the dominant characteristic root, and the vectors y^(k) to a corresponding characteristic vector.

The computations were performed on the 602-A calculator. The 602 or 604 would have been equally suitable, since there is no great amount of number storage. A machine with two card feeds, such as the Aberdeen IBM Relay Calculators, would have been superior, because in this case it would have been possible to feed the matrix cards into one card feed and the vector cards into another. Since we had no such machine available in our laboratory, we proceeded as follows: The matrix elements are punched into a deck of cards, one element to a card. This deck is reproduced as many times as we expect to have iterations. Before starting any one iteration, one of these decks is prefaced by a small deck containing the latest approximation to the characteristic vector (in the case of the first deck, the chosen trial vector y^(0)); the combined deck is sorted by columns, the vector elements are gang punched into the matrix cards, then the deck is followed by a set of summary cards, sorted by rows, and put through the 602-A for performing the matrix multiplication. This operation produces the unnormed new approximation to the characteristic vector, punched into the summary cards. These are then sorted out and put through the 602-A again for the norming process.

To obtain the second characteristic root, the method of "sweeping out" the first characteristic vector is used. That is to say, proceed exactly as for the first root, but after each iteration subtract from the iterative vector y^(k) a certain multiple of the first characteristic vector. The same process can be carried out for subsequent characteristic roots. In these cases it is desirable to punch each component of each iterative vector in several successive summary cards, one for each of the previous characteristic vectors to be swept out.

In the actual example carried out in our case, there were additional computing steps brought about as a result of the
fact that the equations of motion of the airplane wing were
referred to a moving coordinate system. This requires an
adjustment after each iteration; the computation is similar
to the sweeping-out of earlier characteristic vectors.
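A compact Python sketch of the sweeping-out idea (our reconstruction, for an illustrative symmetric matrix, where the characteristic vectors are orthogonal and the swept component is simply a projection): after each multiplication the component along the first characteristic vector is subtracted off, so the iteration converges to the second root.

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])   # symmetric: characteristic vectors orthogonal

def dominant(A, sweep=(), steps=200):
    y = np.ones(len(A))
    for _ in range(steps):
        y = A @ y
        for v in sweep:          # sweep out known characteristic vectors
            y -= (y @ v) * v
        lam = y[-1]              # norming: last component made equal to 1
        y = y / lam
    return lam, y / np.linalg.norm(y)

lam1, v1 = dominant(A)
lam2, v2 = dominant(A, sweep=(v1,))
print(lam1, lam2)   # the two characteristic roots of greatest modulus
```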
The vectors converge reasonably well, except in cases where there are two characteristic roots with equal or almost equal moduli. We did not run into any such cases. Nevertheless, we felt it useful to speed up the convergence of the process. A method which we used for this purpose is the one described by A. C. Aitken² and called by him "the delta-square process." It consists in taking three successive approximations to the desired characteristic root, say, v_{t−1}, v_t, and v_{t+1}, and extrapolating from them to the desired root by using the expression

$$\frac{v_{t+1}\, v_{t-1} - v_t^2}{v_{t+1} - 2 v_t + v_{t-1}}\,.$$

The same method can be applied to find directly a close approximation to the characteristic vector.
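In Python, the delta-square extrapolation of three successive approximations reads as follows (a minimal sketch of the quoted expression; the sample sequence is illustrative and has a purely geometric error, for which the extrapolation is exact):

```python
def aitken(v_prev, v_curr, v_next):
    # Delta-square process: (v_{t+1} v_{t-1} - v_t^2) / (v_{t+1} - 2 v_t + v_{t-1})
    return (v_next * v_prev - v_curr ** 2) / (v_next - 2.0 * v_curr + v_prev)

seq = [5 + 0.5 ** k for k in range(3)]   # 6.0, 5.5, 5.25, converging to 5
print(aitken(*seq))                      # 5.0 exactly
```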
Another method, which we discussed but have not yet
used, consists in subtracting from all terms of the principal
diagonal of the matrix a suitable constant, chosen in such
a way as to increase the ratio between the dominant and
subdominant characteristic root. (The subdominant root
is the one with second-largest modulus.) Suppose, for example, that it is known that all roots are real (as, for instance, in the case of symmetric matrices with real coefficients), and the largest root is estimated to be 10, the second largest 9, and the smallest 1. By subtracting 5 from all
elements of the principal diagonal of the matrix, a matrix
is obtained whose characteristic roots are smaller by 5 than
those of the original matrix; that is, the largest root has
become about 5, the second largest 4 and the smallest -4.
The ratio of largest to second-largest, in absolute value, is
now 5: 4, whereas previously it was 10: 9. Since the speed
of convergence of the iteration process tends to increase
with the size of this ratio, the process is likely to converge
faster for the modified matrix. In general, the constant to
be subtracted, in case all roots are real, is the arithmetic
mean between the root nearest the dominant root and the
one farthest away from it. It is necessary, of course, to have
estimates of these roots in order to apply this method.
ONE VERY OFTEN encounters matrices which might be called "almost triangular." The name "triangular" shall be applied to matrices in which all elements above the principal diagonal are zero. By "almost triangular" is meant a matrix which has only a few nonzero elements above the principal diagonal, and those are all bunched close to the diagonal. To be exact, an nth order matrix A with elements a_ik will be called "almost triangular of degree t" if a_ik = 0 for k − i > t, where t is some integer between 0 and n − 1. There is no restriction on the elements below the principal diagonal. However, some of the statements which will be made toward the end of this paper apply only to "almost diagonal" matrices, which are defined analogously by a_ik = 0 for |k − i| > t; that is to say, both above and below the principal diagonal all elements except those close to the diagonal are zero.

For a completely triangular matrix, that is, t = 0, no computation is required. The characteristic roots are equal to the elements of the principal diagonal.
Now take the case t = 1. Consider the matrix A − λI as the matrix of a system of simultaneous homogeneous equations:

$$\begin{aligned}
(a_{11} - \lambda)\,x_1 + a_{12}\,x_2 &= 0\\
a_{21}\,x_1 + (a_{22} - \lambda)\,x_2 + a_{23}\,x_3 &= 0\\
&\;\vdots\\
a_{n1}\,x_1 + \cdots + (a_{nn} - \lambda)\,x_n &= 0\,.
\end{aligned}$$
Our problem is to find those values of λ for which the system has a solution. Because of the "almost triangular" character of the matrix, the first equation contains only the first two unknowns, the second equation only the first three unknowns, and generally the kth equation only the first k + 1 unknowns. For simplicity of presentation, let us assume first that a_ik ≠ 0 for k − i = 1. Let us choose a particular value of λ and ask ourselves whether the system of linear homogeneous equations has a non-trivial solution for this particular λ, that is, a solution in which not all of the unknowns are equal to 0. It can easily be seen that, because of our assumption that a_{i,i+1} ≠ 0, the value of the first unknown in any non-trivial solution is not zero. And since the system is homogeneous, an arbitrary nonzero value can be assigned to x₁, for example x₁ = 1. Now substitute x₁ = 1 in the first equation and obtain x₂; then substitute x₁ and x₂ in the second equation and obtain x₃, etc., down to the (n − 1)st equation, from which the value of x_n is obtained. If, now, all these values x₁, x₂, ..., x_n are substituted into the nth equation, this equation may or may not be satisfied. The result of the substitution in the left-hand side of the equation is, of course, a function of the particular value of λ chosen initially, and it may be designated by E(λ). If, and only if, E(λ) = 0, λ is one of the characteristic roots of the matrix. Now, the value of E(λ) for a number of different λ's may be computed, until enough values of the function E(λ) are obtained to determine its zeros, either graphically or by inverse interpolation or some other method.
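A minimal Python sketch of this substitution scheme (ours; the small matrix below is an illustrative almost-triangular one of degree 1): x₁ is set to 1, each equation yields the next unknown, and the residual of the last equation is E(λ).

```python
import numpy as np

def E(A, lam):
    # Forward substitution through (A - lam I) x = 0 with x1 = 1;
    # returns the left-hand side of the nth equation.
    n = len(A)
    B = A - lam * np.eye(n)
    x = np.zeros(n, dtype=complex)
    x[0] = 1.0
    for k in range(n - 1):                   # kth equation involves x_1..x_{k+2}
        x[k + 1] = -(B[k, :k + 1] @ x[:k + 1]) / B[k, k + 1]
    return B[n - 1] @ x

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [2.0, 1.0, 4.0]])              # a_ik = 0 for k - i > 1

# E(lam) vanishes exactly at the characteristic roots.
for lam in np.linalg.eigvals(A):
    print(lam, abs(E(A, lam)))               # residuals near zero
```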
This method of obtaining characteristic roots was described by Myklestad³,⁴ and Prohl.⁵ They described, independently of each other, the application to two different engineering problems, but apparently without noticing its general applicability and import. To Dr. A. Gleyzal goes the credit for having noticed this and for having generalized the method to cases of t > 1.
If these substitutions are carried out for a number of different values of λ, let us see how the values of the unknowns x₁, x₂, etc., depend on λ. Of course, x₁ is chosen arbitrarily, and it does not matter how it is chosen, as long as the same x₁ is substituted with each value of λ. For the sake of definiteness, let x₁ = 1. It is seen at once that x₂ is a linear function of λ; specifically, x₂ = −(a₁₁ − λ)/a₁₂. Similarly, it is seen, by substitution, that x₃ is a quadratic function of λ, and generally that x_k is a polynomial in λ of degree k − 1. This property is of very great assistance in checking the computations. It is also evident that E(λ) is a polynomial of degree n and, as such, has n zeros. (These, of course, coincide with the n characteristic roots of our matrix.) Theoretically, therefore, it is sufficient to compute E(λ) for n + 1 values of λ. Thereafter the polynomial can be extrapolated indefinitely by differencing. In fact, however, one might lose too many significant digits in this process. It is, therefore, preferable to compute E(λ) for more than n values of λ, say, something like 2n values, and to distribute these as far as possible equidistantly over an interval which spans all expected values of λ. I might also mention that, in the example with which we were concerned, the question was not to find all characteristic values, nor to find the largest or smallest in absolute value, but to find those characteristic values which are located in a given interval. For this type of problem this method is particularly well suited.
Sometimes one finds a matrix which is not almost triangular as it stands, but which can be brought into almost-triangular form by suitable rearrangement of rows and columns. (It is necessary to apply the same permutation to both rows and columns in order to preserve the characteristic roots.) In the case of any matrix with many zeros, it is worth looking for this possibility.
Let us, now, drop the requirement that all ai, i+1 be different from zero and assume that one of these elements, say
aj, j+1 is equal to 0. In this case we can obtain the characteristic roots of the matrix A by partitioning it into a
matrix Al of order j, consisting of the elements in the upper
left-hand corner of A, and a matrix A2 of order n - j, consisting of the elements in the lower right-hand corner of A.
Each root of A is a root of either Al or A 2 , and vice versa.
The roots of A 1 and A 2 can be found by the method described above. If there are several zeros among the elements
just above the principal diagonal, the matrix A is partitioned into a correspondingly larger number of submatrices.
We now turn to the case t > 1. The essential features of the method can be fully explained for t = 2. Generalization for greater values of t will be self-evident. The basic idea is again to attempt to solve the homogeneous linear system of equations for a number of particular values of λ. The system now has the form

$$\begin{aligned}
(a_{11} - \lambda)\,x_1 + a_{12}\,x_2 + a_{13}\,x_3 &= 0\\
a_{21}\,x_1 + (a_{22} - \lambda)\,x_2 + a_{23}\,x_3 + a_{24}\,x_4 &= 0\\
&\;\vdots\\
a_{n-2,1}\,x_1 + \cdots + a_{n-2,n}\,x_n &= 0\\
a_{n-1,1}\,x_1 + \cdots + a_{n-1,n}\,x_n &= 0\\
a_{n1}\,x_1 + \cdots + (a_{nn} - \lambda)\,x_n &= 0\,.
\end{aligned} \tag{1}$$

As before, consider first the case in which all a_{i,i+2} are different from zero. For any given λ, in the case t = 1 it was necessary to start with an assumed value of x₁. For t = 2, it is necessary to start with assumed values for both x₁ and x₂. For example, start with x₁ = 1, x₂ = 0, and obtain successively from the linear equations the values of x₃, x₄, ..., x_n. By this time the first n − 2 equations have been used. Now all the x's can be substituted into the last two equations. This gives a value e₁(λ) for the left-hand side of the (n − 1)st equation and a value f₁(λ) for the left-hand side of the nth equation.
We now repeat the process with two different initial values for x₁ and x₂, say, x₁ = 0, x₂ = 1. For the left-hand sides of the two last equations two new values e₂(λ) and f₂(λ) are obtained. In general, all four values e₁, e₂, f₁, and f₂ will be different from zero. To find the characteristic values of the matrix, notice the following: any non-trivial solution of the system of equations will start with certain two values x₁ = α₁, x₂ = α₂. (If all a_{i,i+2} ≠ 0, then x₁ and x₂ do not vanish simultaneously.) The values of the remaining unknowns in such a solution are expressible as linear combinations of the unknowns obtained in the two basic solutions before, with weights α₁ and α₂. Substitution of these unknowns in the last two equations gives two values

$$e(\lambda) = \alpha_1 e_1(\lambda) + \alpha_2 e_2(\lambda)\,, \qquad f(\lambda) = \alpha_1 f_1(\lambda) + \alpha_2 f_2(\lambda)\,.$$

For a non-trivial solution, e(λ) and f(λ) must vanish simultaneously. For this to be possible it is necessary (and sufficient) that the determinant

$$D(\lambda) = \begin{vmatrix} e_1(\lambda) & e_2(\lambda)\\ f_1(\lambda) & f_2(\lambda) \end{vmatrix}$$

be equal to 0. This determinant is a function of λ whose zeros coincide with the characteristic roots of the matrix. Just as before, we select a number of values of λ, evaluate the determinant D(λ) for each of them, and then find the roots of D(λ), either graphically or by interpolation.
To summarize, the steps taken are as follows:

Assume:                        x₁⁽¹⁾ = 1,   x₁⁽²⁾ = 0,   x₁ = α₁
Assume:                        x₂⁽¹⁾ = 0,   x₂⁽²⁾ = 1,   x₂ = α₂
Solve eq. 1 of (1):            x₃⁽¹⁾,       x₃⁽²⁾,       x₃ = α₁x₃⁽¹⁾ + α₂x₃⁽²⁾
Solve eq. 2 of (1):            x₄⁽¹⁾,       x₄⁽²⁾,       x₄ = α₁x₄⁽¹⁾ + α₂x₄⁽²⁾
  ⋮
Solve eq. n − 2 of (1):        x_n⁽¹⁾,      x_n⁽²⁾,      x_n = α₁x_n⁽¹⁾ + α₂x_n⁽²⁾
Substitute in eq. n − 1 of (1): e₁,         e₂,          e = α₁e₁ + α₂e₂
Substitute in eq. n of (1):     f₁,         f₂,          f = α₁f₁ + α₂f₂

The terms in the last column above need not be computed; they are listed here only to illustrate the explanation. The values of x_k⁽¹⁾ and x_k⁽²⁾, considered as functions of λ, are polynomials, so that their computation can be checked by differencing. Likewise the determinant D(λ) is a polynomial. It is possible to prove directly that D(λ) is of degree n in λ. This proof is not given here, since it is not needed for the argument.
Cases in which one or more of the coefficients a_{i,i+2} vanish need to be treated specially. It seems most economical not to complicate the machine method of computation by allowing for these degenerate cases, but rather to treat these cases separately when they arise.

In the general case of any value of t, the determinant D(λ) will be of order t. Considered as a function of λ, it is always a polynomial of degree n, regardless of its order. Since the evaluation of the determinants of higher order is laborious, the method given here is recommended primarily for the cases of t = 1, 2, or 3. These are just the cases which are most likely to occur in practice.
The performance of computations on punched card machines is straightforward. The coefficients of the matrix are punched into cards, one coefficient to a card. From this deck, a series of decks is prepared, so that t decks are available for each value of λ to be used. All of these are identical with the original deck, except that the value of λ has been subtracted from the numbers in the principal diagonal. The cards containing the coefficients a_{i,i+t} are characterized, by a special punch, as summary cards. It is expedient, but not necessary, to divide each row by a_{i,i+t}, so as to make the latter coefficient equal to unity. No cards are required for coefficients which are equal to zero. The computations proceed in a number of identical steps, one step for each unknown x_j. We are going to describe one of these steps.

Suppose that the value of x_j has just been computed, by using the cards of row j − t. There is one such x_j for each deck of matrix cards, i.e., for each value of λ and each combination of assumed values x₁, ..., x_t. Each x_j is punched into the corresponding summary card.

Now sort all matrix cards on the column number, selecting each jth column. Automatically, for each deck, the summary card is in front and is followed by the remaining cards of the jth column. Feed the cards into the multiplier (either the 602, 602-A or the 604 could be used), use the value of x_j, as read from the summary card, as a group multiplier, and punch the products a_ij x_j into the card corresponding to a_ij. (Alternatively, it would have been possible to use the reproducer instead of the multiplier and to gang punch the values x_j themselves instead of the product.)
Next, select the row j − t + 1. In this row each card except the summary card has a product a_ik x_k (or, in the alternative procedure, a value x_k) previously punched into it. This is so because the cards of the jth column have been punched in the preceding step, the cards of earlier columns have been punched in earlier steps, the cards in column j + 1 are the summary cards in this row, and cards for columns following j + 1 do not exist in this row, since all corresponding coefficients are 0.

Now feed the cards into the machine and add all the products (in the alternative procedure, the products are formed in this step and added at the same time). When the machine reaches a summary card, it punches the sum of all products. This is the value of x_{j+1}. Then select all these summary cards, place them in front of the matrix decks, discard all other cards of row j − t + 1, and from here on this sequence of operations is repeated.
The polynomials e(λ) and f(λ) are evaluated in the same way. The fact that x_j is a polynomial in λ can be used conveniently for checking the computations by taking differences of sufficiently high order. Finally, if t > 1, the determinants of order t have to be evaluated, either manually or by machine, depending on how many there are. This in turn depends on the order of the matrix and the size of the interval being searched for characteristic roots.
A considerable gain in efficiency over this method can be accomplished in the important special case of almost-diagonal matrices, for moderate sizes of t. In this case each row of the matrix contains at most 2t + 1 nonzero elements. All sorting of cards is eliminated, and the entire computation is performed in a single run through the 602-A multiplier. Where formerly the computation for a particular value of j was carried out in succession for all λ's before going on to the next j, in this case all cards pertaining to one coefficient deck (i.e., to the same λ and to the same choice of x₁, ..., x_t) are kept together, arranged by rows. At each step of the substitution, not more than 2t + 1 different unknowns x_i are needed. These are all stored in the machine, the cards of a row are fed in, the coefficients a_ik read off the cards and multiplied by the corresponding x_i, and the products accumulated and punched into the summary card of the row. Thereafter, the first of the stored x's is discarded, and each subsequent x is moved to the storage location of the preceding one. The last storage location is filled with the x which has just been computed. Now the machine is ready to receive the cards of the next row, and the whole process is carried out without ever stopping the machine. In our work so far, this method has been planned but not yet tried out on cards.

REFERENCES

1. R. A. FRAZER, W. J. DUNCAN, and A. R. COLLAR, Elementary Matrices (Cambridge Univ. Press, 1938), see espec. pp. 134 and 140-141.
2. A. C. AITKEN, "Studies in Practical Mathematics," Proc. Roy. Soc. Edin. 57 (1936-37).
3. N. O. MYKLESTAD, "New Method of Calculating Natural Modes of Uncoupled Bending Vibration of Airplane Wings and Other Types of Beams," Jour. Aeronaut. Soc. 11, no. 2 (Apr., 1944), pp. 153 to 162.
4. N. O. MYKLESTAD, "New Method of Calculating Natural Modes of Coupled Bending-Torsion Vibration of Beams," Trans. Am. Soc. Mech. Engrs. 67, no. 1 (Jan., 1945), pp. 61 to 67.
5. M. A. PROHL, "A General Method for Calculating Critical Speeds of Flexible Rotors," Jour. Appl. Mech. (Sept., 1945), pp. 142 to 148.


DISCUSSION

Mr. Kimball: Dr. Alt mentioned that they had only one
experience in the iterative multiplication. In 1945, using
a complex node matrix of size 14 by 14, we obtained convergence to four digits in 33 steps of iteration, about half
an hour for each step, using the 601 multiplier.
Mr. Bell: Concerning practical problems that arise in
evaluating the formulas that were derived: if you have a
system of simultaneous equations and reduce the first to a
triangular matrix and then by back substitution to a diagonal matrix, you have essentially two procedures, and
this complicates the machine work.
We have found that there is a critical point beyond
which you would consider the back solution and that the
order of that matrix is quite high. I would say something
like the 15th order, at least. The advent of the 604 has
made the straightforward approach much simpler. In
other words, instead of working on the first column and
then eliminating it, leave it in the matrix.
Another point is that if you divide and make your elements 1 immediately, you are dividing by numbers whose
size may be quite small, and that may make the size of
the numbers go outside the limits of your field.
We have found that a method which protects us in that
respect is to leave the numbers as a number. Your equation is then of a form where you subtract from each element a ratio multiplied by a number, and then the correc-

tion tends to be small, if the dividing term is large, which
will keep your numbers within size.
We have done some work with the iterative methods
without a great deal of success. We have found that the
conditions of convergence are more difficult to determine
than just going straight into the problem and trying to get
a solution.
In what we have done practically, in trying an iterative
process, we have set it up and started it running. If we
don't get solutions, if it begins to diverge, we stop, assuming that it is divergent.
It seems to me that essentially those processes are designed where you do not have a machine that is capable
of a grinding operation, such as the IBM machines. So
that we almost always set the problems up for a direct
solution.
One other thing is that in problems of the form where
you have a matrix that is symmetrical on both sides, and
other special matrix forms where there are mathematical
techniques that will give you much fewer operations, it
means that you must have different procedures and different methods for your operators, and that always slows
you down. We have aimed to do as much of our matrix
work by this one simple process as possible; and, although
the number of mathematical operations can be unnecessarily large, the elapsed time is very much reduced, rather
than trying to be elegant at every step.
Chairman Hurd: A very good point.

Solution of Simultaneous Linear Algebraic Equations
Using the IBM Type 604 Electronic Calculating Punch

JOHN LOWE

Douglas Aircraft Company, Incorporated

MANY METHODS exist for solving simultaneous equations with punched card accounting machines. The one presented here takes advantage of the speed and flexibility of the 604 electronic calculator. A 10th order matrix can be inverted in one hour by use of this method, which compares with approximately eight hours through use of relay multipliers. Furthermore, the method is extremely simple.ᵃ

The basic reduction cycle consists of: sort (650 cards per minute), reproduce (100 cards per minute), sort (650 cards per minute), and calculate (100 cards per minute). This cycle must be repeated a number of times equal to the number of equations.

For best accuracy and to insure that all numbers stay within bounds, the elements of M should be close to unity, and, if possible, the principal diagonal elements of A should be larger than the other elements.

A column of check sums (negative sums of each row) appended to M provides an easy and complete check on the work. These check sums can be calculated by machine, but if they are manually calculated and written as a part of M, they provide an excellent check on the key punching. Also, experience has shown that the agreement of the final check sums with X is an index to the accuracy of X.
THEORY

Several variations of the basic elimination method can be used with the machine procedure outlined. The one described requires no back solution and is well suited to machine methods. It is well known and will be described very briefly.

The equations may be expressed in matrix notation as AX = C. C and X may have, of course, any number of columns. If A⁻¹ is desired, C becomes I and X becomes A⁻¹ (see reference 1).

The object of the calculation is to operate on the matrices A and C, considered as equations, so as to reduce A to a unit matrix, thus reducing C to X.

Let M be the augmented matrix composed of A and C. Choose any row, k, of M and form M′ such that

$$m'_{kj} = \frac{m_{kj}}{m_{kk}}\,, \qquad m'_{ij} = m_{ij} - \frac{m_{kj}}{m_{kk}}\,m_{ik}\,, \quad i \neq k\,.$$

The kth column of M′ is zero, excepting the kth row, which is unity. Therefore, no cards are made for the kth column of M′. Form M″ from M′ using the above equations, but a different row for k. If this process is repeated until each row has been used and all the columns of A eliminated, the columns of C will have been reduced to X.

ᵃThe value of the determinant of the matrix of coefficients can be obtained as a by-product of the process. See reference 1.

MACHINE PROCEDURE

Layout

The following card fields are necessary:
A. Row (of M)
B. Column (of M)
C. Order (initially n and reduced by one each cycle until it has become zero)
D. Common 12 punch
E. Pivotal column 11 punch
F. Pivotal row 12 punch
G. Next pivotal row 12 punch
H. Product or quotient
I. Cross-foot or dividend
J. Multiplicand or divisor

Procedure, Using Rows in Order

1. Start with M punched in fields (A), (B), (C) and amounts in (H).
2. Sort to column. Emit 11 in (E) of column 1.
3. Place column 1 in front and sort to row. Emit 12 in (G) of row 1.
4. Reproduce cards. Emit 12 in (D) of all cards. Reproduce (A) to (A), (B) to (B), and (G) to (F). Reproduce (H) to (I) except that (H) of the pivotal column cards is gang punched in (J). Emit 11 in (E) of the first card of each gang punched group (column 2 in this case). It is advisable to pass blanks on the punch side for the 11 in (E) masters. See note (3).


5. Sort the new cards to column [row 1 with 12 in (F) should automatically be the first card of each column].

6. Calculate on the 604. On the 12 in (F) masters, calculate (I/J) = Q, punch in (H) and store. On the following detail cards, calculate I − QJ and punch in (H). Gang punch 12 in (G) of row 2 from (D) by means of digit selectors. Gang punch (n − 1) in (C). See note (3).

7. Sort to row. If check sums are carried, cards may be tabulated controlling on row. All rows should sum to zero except the pivotal row, which should sum to −1. Round-off errors will appear, of course.

8. At this point, these facts exist:
a. The cards are in order by row with column 2 first in each row [column 1 was not reproduced in step (4)].
b. Column 2 has an 11 in (E) which was emitted in (4).
c. Row 2 has a 12 in (G) which was gang punched in (6).

Therefore, the cards may be reproduced again as in (4), sorted as in (5), this time placing row 2 in front, multiplied as in (6), gang punching (n − 2) in (C) and 12 in row 3, and sorted and checked as in (7).

The process then consists of repeating this basic cycle: sort, reproduce, sort, calculate, until the order has been reduced to zero. Then all the columns of A will have disappeared, all the rows will sum to −1, and C will have become X. For a final check, multiply AX and compare with C.
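A condensed Python sketch of the reduction (ours; it mirrors the equations for M′ and the check-sum column rather than the card handling, and, unlike the card procedure, it keeps the eliminated columns, so every row sum stays zero throughout):

```python
import numpy as np

A = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 5.0]])
C = np.eye(3)                    # C = I, so X will come out as the inverse of A

M = np.hstack([A, C])
M = np.hstack([M, -M.sum(axis=1, keepdims=True)])   # check-sum column

n = len(A)
for k in range(n):               # pivot on each row in order
    M[k] = M[k] / M[k, k]        # m'_kj = m_kj / m_kk
    for i in range(n):
        if i != k:
            M[i] = M[i] - M[i, k] * M[k]   # m'_ij = m_ij - (m_kj/m_kk) m_ik

X = M[:, n:2 * n]
print(np.allclose(A @ X, np.eye(n)))   # final check: A X = C
print(M.sum(axis=1))                   # check sums: every row sums to zero
```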

NOTESᵇ

1. The use of digit selectors in (6) can be obviated by placing the next pivotal row behind the pivotal row before (5) and gang punching the 12 from (F) to (G) in (6). It is felt that use of the digit selectors offers less chance for error.

2. In (6), using a standard 40-program 604, the following limits on size of numbers seem to exist:
Divisor: 10 digits
Quotient: 10 digits
Dividend: 12 digits
Multiplicand: 10 digits
Cross-foot: 11 digits

The question then arises as to the best way to apportion these digits between decimals and whole numbers. Eight decimals is a good choice for many problems, and seven would provide for all but the most disadvantageous cases. Since the 521 control panel is the same for any number of decimals, it may be advisable to have two or more calculator control panels.

3. In order to divide by ten digits, the ten-digit divisor is split into its first eight digits, x, and its last two digits, y. Then

$$a(x + y)^{-1} = a x^{-1} - a x^{-2} y + a x^{-3} y^2 - \cdots\,.$$

Only the first two terms of this series are calculated. If eight decimals are carried, y < 10⁻⁶, and eight-decimal accuracy is obtained if x > .1.

The following procedure provides eight-decimal accuracy when x < .1:
a. As terms are calculated on the 604, they are checked, and if > 1., an 11 is punched (not shown on schedule of fields).
b. This 11 punch is gang punched to field (I) in the reproducing operation.
c. In the next calculation, if this 11 punch is absent in the divisor field, the divisor and dividend fields are shifted two places to the left in reading. Thus, y becomes zero, and eight-decimal accuracy is obtained at all times.

After the first reproduction, or if pivotal rows are chosen manually (see note 5), it is necessary to emit this 11 punch in the divisor field if the divisor is > 1.

4. If several sets of equations are being handled simultaneously, time can be saved by not sorting case in step (7) but making case the major sort in step (5).

5. The nature of the equations may be such that rows and columns cannot be pivoted in order as outlined, but must be chosen so as to give the smallest quotients. In this event, the 11 in (E) and 12 in (G) must be emitted prior to the reproduction and their automatic insertion discarded, at a sacrifice in speed.

6. It is usually not economical to check every cycle on the accounting machine. Errors should be rare and will carry forward if they occur. One compromise is to let a given cycle check while the next one is being processed.

ᵇThe writer will be glad to supply copies of the planning charts for the 604 and reproducer control panels used in this procedure. Address John Lowe, Douglas Aircraft Company, Inc., Engineering Department, Santa Monica, California.

REFERENCES

1. WILLIAM EDMUND MILNE, Numerical Calculus (Princeton University Press, 1949), p. 26.
2. FRANK M. VERZUH, "The Solution of Simultaneous Linear Equations with the Aid of the 602 Calculating Punch," Mathematical Tables and Other Aids to Computation III, No. 27 (July 1949), pp. 453-462.


DISCUSSION

Mr. Turner: What do you do if B₂₂ happens to be very small?

Mr. Lowe: In the manner I described, you would actually pick the starting rows and columns in sequence, that is, the first column in the first row and the second column in the second row, and so forth. It isn't necessary to do that. You can pick any one you want. In picking, pick the one that would give you the most advantageous numbers. In particular, we usually try to pick the one that gives the smallest quotients in doing this division.
Mr. Wolanski: We have a method that is similar to this, but we always use the element that is greatest; we cannot say the first row or the first column. In the first column we use an element which is the largest; when we do eliminate and get B₂₁ and B₃₁ equal to zero, we start in on our second column, and we pick the element that is the largest.
Mr. Lowe: Our method for finding out if the numbers
get too big is simply to punch out a few more numbers
than we can use the next time and then sight-check the
cards.
Mr. Bell: In our handling of this problem we try to remove judgment from the operation which the operator performs. We don't want him to have to look at it and evaluate and decide which term to use. So, in handling matrices, usually in groups, we simply start up from the main diagonal. Perhaps ten per cent of the problems will go bad. We take that ten per cent and start down the main diagonal, and maybe ten per cent of those go bad. Well, then we have 1/100th left over of the total working volume, and those we actually evaluate and select proper big terms in order to make it behave. But by doing that the mass of the work is handled in a routine way.

Rational Approximation in High-Speed Computing

CECIL HASTINGS, JR.

The RAND Corporation

THIS is a brief report on a study that is being made at RAND on the use of rational approximation in high-speed computing. The work we report upon was largely stimulated, in the first place, through appearance of the IBM Type 604 Calculating Punch, and our work was given further impetus by the reported development of the IBM Card-Programmed Electronic Calculator.

The opportunity of doing away with the use of card tables thus presented itself to us, and we began to prepare for the day when compact approximate expressions would take their place in the art of digital computing. The subject of rational approximation then became a matter of increasing importance. We note in passing that proper use of the 604 can eliminate the use of tables to a considerable extent. Thus, to give an example, one can evaluate a fifth (or even higher) degree polynomial in single-card computation on a 604. The machine will then read an arbitrary value of x from a detail card and punch out P(x) upon the same card. ...

...tive. Our parameter values are then, of necessity, determined to an excessive number of figures for practical purposes. These may be cut back to obtain "working" approximations. Each primitive approximation will be described by an accurate error curve, the primitive parameter values, location of roots rᵢ, location of extremal values eᵢ, and the common extremal values d. Thus, we approximate the common logarithmic function log x over (1/√10, √10) by a form in

$$\xi = \frac{x - 1}{x + 1}\,.$$

...can be expanded on a program cycle or a series of programs; then, if the inverse function is needed, the series is expanded until it equals the sine, thus giving the angle. If we write F(X₀ + e) − x = 0, expand the expression by Taylor's theorem, and use Newton's method of approximation,

$$e = \frac{x - F(X_0)}{F'(X_0)}$$

is obtained, where X₀ is some approximation to the correct value. Also the following equations are true:

$$dX = f'(x)\,dx\,, \qquad dx = F'(X)\,dX\,, \qquad \frac{dX}{dx} = \frac{1}{F'(X)} = f'(x) \approx \frac{\Delta_1}{w}\,.$$

Now write

$$X_1 = X_0 + e = X_0 + \left[\,x - F(X_0)\,\right]\frac{\Delta_1}{w}\,. \tag{1}$$

This is a first-order approximation, but it is applied only to the residual, which is supposed to be small. If a set of approximate X₀'s can be obtained for the entries to be put into the table, this iterative equation is computed instead of attempting to compute f(x). This is on the assumption that F(X₀) can be computed more readily than f(x). In the case of an arc sine table, for example, this means that one must have, necessarily, a table of sines to use, or a sine series, or something similar. If the table is to be at equal intervals of the argument, presumably the entries are in units of a certain decimal place, so that 1/w simply requires proper placing of the decimal point, and Δ₁ is obtained by differencing the values to be ...

FIGURE 1. The true curve is approximated by a series of chords.

Each value of X₀ is obtained by summary punching with progressive totals, tabulating blank cards, and accumulating a constant first difference from the digit emitter. Some hand computing is necessary to find the points where the constant emitted first difference should be changed, and the number of changes is balanced against the size of the allowable error of X₀.

The division by F′(X₀) has been replaced by a multiplication by Δ₁/w, and equation (1) is iterated repeatedly until the final values are reached. Although this procedure may not appear attractive, it is for the purpose of eliminating the division. It may be applied if a table, say, to four places is available, and a table to eight places is needed. Perhaps some of Mr. Hastings' inverse functions could be generated more easily than the direct functions. It would have an application there.

Next, I would like to indicate the results in two simple cases. If a reciprocal table is to be obtained, then f(x) = X = 1/x, and f′(x) is eliminated, as derived from this expression, instead of substituting Δ₁/w in equation (1). Then

$$X = X_0 + (1 - xX_0)\,X_0\,.$$

It is not necessary to summary punch any differences. An iterative formula is obtained which would have been obtained more easily some other way! In the second case, if f(x) = x^{-1/2}, the same procedure is applied, obtaining

$$X = X_0 + (0.5 - 0.5\,xX_0^2)\,X_0\,.$$

This is a quantity which arises when transforming from rectangular to polar coordinates. It is necessary to divide by the square root of r².
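Both end results are easily verified in Python (a sketch of the two iterations only, not of the card procedure; the starting values are illustrative). Note that neither iteration performs a division:

```python
def reciprocal(x, X0, steps=6):
    # X = X0 + (1 - x*X0)*X0 converges to 1/x, division-free.
    X = X0
    for _ in range(steps):
        X = X + (1.0 - x * X) * X
    return X

def inv_sqrt(x, X0, steps=6):
    # X = X0 + (0.5 - 0.5*x*X0**2)*X0 converges to x**-0.5, division-free.
    X = X0
    for _ in range(steps):
        X = X + (0.5 - 0.5 * x * X * X) * X
    return X

print(reciprocal(4.0, 0.2))   # 0.25
print(inv_sqrt(4.0, 0.4))     # 0.5
```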
Now I would like to say a few things about the construction of a sine table. My remarks apply beautifully to the sine and cosine functions, and contain some ideas which may be extended to other functions. Let Figure 2 represent the quantities in a sine table at equal intervals of the argument.

φ − w    sin(φ − w)          A sin(φ − w)
                     Δ₁                      Δ₃
φ        sin φ               A sin φ          A² sin φ
                     Δ₁                      Δ₃
φ + w    sin(φ + w)          A sin(φ + w)

FIGURE 2

The second difference opposite sin φ is

$$\sin(\varphi + w) - 2\sin\varphi + \sin(\varphi - w) = -2(1 - \cos w)\sin\varphi = A\sin\varphi\,.$$

Similarly, the fourth difference becomes A² sin φ, which is rigorous, and not an approximation. This is a property of the sine function. The above suggests that if interpolation is desired, the best procedure is to use Everett's interpolation formula. This will reduce to

$$\sin(\varphi + nw) = \sin\varphi\left[\,m + \frac{m(m^2 - 1^2)}{3!}A + \frac{m(m^2 - 1^2)(m^2 - 2^2)}{5!}A^2 + \cdots\right] + \sin(\varphi + w)\left[\,n + \frac{n(n^2 - 1^2)}{3!}A + \frac{n(n^2 - 1^2)(n^2 - 2^2)}{5!}A^2 + \cdots\right], \tag{2}$$

where m = 1 − n.

This process of interpolating for a sine between two given values means that each of the values is multiplied by its corresponding square bracket, which is, in general, different from the ordinary concept of interpolation.

It is seen from trigonometry that the square brackets in equation (2), which have been derived by means of working with Everett's interpolation formula, have closed expressions, namely

$$\frac{\sin(1-n)w}{\sin w} \qquad\text{and}\qquad \frac{\sin nw}{\sin w}\,.$$

Here I would like to point out something which is just a curiosity. Suppose φ = 0; then the first line of equation (2) will drop out. Consider the second line. One of the first things the teacher tries to emphasize when this subject is reached in trigonometry is that sin nw is not equal to n sin w, but you see that, with the exception of the higher order terms in the series, it is true. So that is a good way to confuse everybody!
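The closed form of the brackets makes the whole scheme easy to check in Python (our sketch; w, n, and φ are illustrative values). The bracket series is summed term by term and used to interpolate a sine between two tabular values:

```python
import math

def bracket(n, w, terms=6):
    # n + n(n^2-1^2)/3! * A + n(n^2-1^2)(n^2-2^2)/5! * A^2 + ...,
    # with A = -2(1 - cos w); the closed form is sin(n w)/sin(w).
    A = -2.0 * (1.0 - math.cos(w))
    total, coef = 0.0, n
    for k in range(terms):
        total += coef * A ** k / math.factorial(2 * k + 1)
        coef *= n * n - (k + 1) ** 2
    return total

w, n = math.radians(0.01), 0.37        # interval 0.01 degree, as in the text
m = 1.0 - n
phi = math.radians(20.0)

interp = math.sin(phi) * bracket(m, w) + math.sin(phi + w) * bracket(n, w)
print(interp, math.sin(phi + n * w))   # the two values agree
```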
I shall describe briefly how we constructed an eight-place
sine table, in what we considered a most efficient manner of
arriving at a set of punched cards for the table. Ninety
cards were key punched, proof read, and differenced, each
containing the sine of an integral degree. If the argument
is in degrees and if an eight-place sine table is desired,
then the interval must be 0°.01 in order to have linear interpolation. This yields a second difference of three units in the
the eighth place. Since the second difference coefficient is
never greater than one eighth, the error is less than a half
unit in the last place.
If the expression in the square brackets of equation (2)
was computed one hundred times for n = 0.01, 0.02, etc.,
and the results used to form the one hundred interpolates,
the basic idea of the process can be seen. Now this complete
set of square brackets is much like a sine and cosine table
itself, because after going from n = 0.0 to 0.5 for the first
bracket, the second half from 0.5 to 1.00 may be obtained
by reading up the column, as one reads down one column
to get the sine from 0° to 45° and then back up the cosine
column to obtain values from 45° to 90°.
In the present case it means that when one multiplication
has been made of a square bracket times sin cp, it is used
once for a given value of m in one degree and again for the
same value of n in the next degree. Although every single
sine entry is obtained as the sum of two products, there are
only as many products as there are entries in the table, because each one is used twice.
In practice the entries are not formed directly by equation (2); instead, the square brackets are differenced, the products formed, and then the interpolates are built up by progressive totals over each range of one degree. This process
enables a multiplication with eight-figure factors (the normal capacity of a 601) and still protects the end figures
against accumulated roundings. The differences of the
square brackets are of the form

  0.01 + Δ·Δ1E2 + Δ²·Δ1E4 + ... ,

where Δ1E2 means the first difference of the Everett second-difference coefficients. If this expression is evaluated to 12 decimal places, there
will be only 8 significant figures beside the leading 0.01. The multiplication of sin φ by 0.01 is
accomplished by crossfooting, and the rest is multiplied in
the usual way. This allows two decimal places for protection of the end figures, owing to the progressive totals 100
times, and two extra places in the computation to insure
the correct rounding to the closest half unit.
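The progressive-total scheme can be sketched in a few lines of Python (ours; the card fields and 601 plugboard details are beyond it, and for brevity we difference the finished products rather than the brackets, the principle of rebuilding by accumulation being the same). The next integral-degree sine emerges as the automatic check mentioned below:

  import math

  def bracket(t, w):
      return math.sin(t * w) / math.sin(w)

  w = math.radians(1.0)
  phi = math.radians(30.0)

  # the two-product interpolates for n = j/100, j = 0..100
  prods = [math.sin(phi) * bracket(1 - j / 100.0, w)
           + math.sin(phi + w) * bracket(j / 100.0, w) for j in range(101)]

  # difference them, then rebuild the table entries by progressive totals
  diffs = [prods[j + 1] - prods[j] for j in range(100)]
  total = prods[0]                      # = sin(phi), the integral-degree entry
  table = [total]
  for d in diffs:
      total += d
      table.append(total)

  # automatic check: the last progressive total reproduces sin(phi + w)
  assert abs(table[-1] - math.sin(phi + w)) < 1e-12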
Nine thousand multiplications were performed, using 100
group multipliers, and the work was arranged with three
fields to the card. Then the cards were sorted and tabulated; 3,000 summary cards yielded the final values for the
table. There was an automatic check, as each sine of an
integral degree was reproduced at the end of that interval.
The final table was then reproduced, and it was necessary
to punch the first differences of the rounded values in order
to interpolate these. The final check was the differencing
of the first differences, which were inspected. The entire
operation took about twenty hours.
You will perceive readily, from Figure 2, that a sine
table may be constructed simply by multiplying Δ sin (φ + nw)
at each step and accumulating, since the second difference of each entry is Δ times the entry itself.
For the function f(x) = x^(-3/2) the successive derivatives are

  f'(x)    = -(3/2) x^(-5/2)    = -(3/2) f(x)/x ,
  f''(x)   = +(15/4) x^(-7/2)   = +(15/4) f(x)/x² ,
  f'''(x)  = -(105/8) x^(-9/2)  = -(105/8) f(x)/x³ ,
  f''''(x) = +(945/16) x^(-11/2) = +(945/16) f(x)/x⁴ ,

so that the Taylor's series may be written

  f(x) = f(x0 + h) = (1/x0^(3/2)) [ 1 - (3/2)(h/x0){1 + (35/24)(h²/x0²)}
                                      + (15/8)(h²/x0²){1 + (63/48)(h²/x0²)} + ... ] .

The terms h²/x0² in the braces are actually the third and fourth derivative terms, which cannot be included because the interpolation formula is to be only quadratic. However, since these terms are always positive, they shall be used as illustrated in Figure 3.

FIGURE 3 (the error curve, showing the total error E and the part KE recovered by the linear term)

Let E be the total error committed at x = x0 + a if we neglect the third order term completely. Let KE be the fractional part of this error which is taken into account if the cubic term is replaced by a linear term, as shown. Then the remaining neglected error is (h³ - Kh)E. This error has a maximum (shown by the short vertical line) at h = √(K/3). If the error at this point is set equal in magnitude and with opposite sign to the error at the end of the interval, we have

  √(K/3) (K/3 - K) E = (K - 1) E ,  and  K = 3/4 .

If we analyze the quartic term in the same way, we obtain K = 2(√2 - 1). Our interpolation formula becomes

  f(x) = (1/x0^(3/2)) [ 1 - (3/2){1 + (3/4)(35/24)(a²/x0²)}(h/x0)
                          + (15/8){1 + (√2 - 1)(63/24)(a²/x0²)}(h²/x0²) ] .

Since the two braces are so nearly alike, we may use, without sensible error, 1 + 1.09375 a²/x0² for each of them. Now we may expect that the interval which may be used is somewhat more favorable than that which would be determined on the basis of neglecting the third and fourth order terms completely. I shall let Dr. Grosch explain the way in which the intervals are obtained. What is done as a rule of thumb is to write 192ε = w³ f'''(x), where ε is the admissible error, usually one-half unit in the last place. Then the size of the third derivative will control the value of the interval which may be used.

At this stage we have

  f(x) = f0 + h f1 + h² f2 .   (5)

We are still faced with one other problem before we are finished: h is counted from the middle of the interval; so we shall write n - a = h. Then n is counted in the same units as h, but from the beginning of the interval. But if the value of the argument at the beginning of the interval is not zero, but A, then let N - A = n, where N is the number which is actually used as the multiplier in equation (4). Making these substitutions in (5) in order to reduce it to the form of (4), we find that all of the following relations exist. I shall write down only the end results:

  D2 = f2 ,
  D1 = f1 - 2(a + A) D2 ,
  F0 = f0 - (a + A) D1 - (a + A)² D2 .

This is about the simplest way, of which I could think, to present the development from the function and a Taylor's series to the final results entered in the table.

DISCUSSION

Dr. King: I would like to make some much more general remarks. It is a good thing for a lot of people to work on these problems so as to make sure that the best method finally comes out. On the other hand, there is a point where it is inefficient to have too many people, and I would like to ask the speakers whether they think the last word has been more or less said on optimum interval tables, and, if so, I am sure there are some particular little details that could be improved. So I would like to hear from them whether they think the time is ripe for people to get together and have one system of optimum interval tables.

Dr. Grosch: I think, honestly, we can say that the polynomial case for the single variable is just about under control now.
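The reduction to table coefficients can be condensed into a few lines of Python (a sketch of ours, not the speaker's procedure; the interval end points and the use of the common brace factor 1 + 1.09375 a²/x0² are illustrative assumptions):

  def table_line(A, width):
      """Coefficients F0, D1, D2 for one interval of an x**(-3/2) table.
      A is the argument at the beginning of the interval; the quadratic is
      then evaluated as f(N) ~ F0 + D1*N + D2*N**2 for A <= N <= A + width."""
      a = width / 2.0                        # half-width; x0 is the midpoint
      x0 = A + a
      f0 = x0 ** -1.5
      brace = 1.0 + 1.09375 * (a / x0) ** 2  # the common brace factor of the text
      f1 = -1.5 * f0 / x0 * brace
      f2 = 1.875 * f0 / x0 ** 2 * brace      # 15/8 = 1.875
      D2 = f2
      D1 = f1 - 2.0 * (a + A) * D2
      F0 = f0 - (a + A) * D1 - (a + A) ** 2 * D2
      return F0, D1, D2

  F0, D1, D2 = table_line(1.95, 0.10)        # the interval 1.95 <= N <= 2.05
  N = 2.02
  print(F0 + D1 * N + D2 * N * N, N ** -1.5) # quadratic versus true value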
By the time you go to two variables it becomes so complicated that it may not be worth investigating until we have some big bi- or trivariate tables that we simply have to make. It is really a beastly job, even in the linear case. I have made some explorations in that direction and don't feel at all satisfied. In the univariate case, I don't think there is much we can do beyond this inverse matrix business, and the reason I am so sure of it is this: if you pick any interval (and you may pick a wrong interval, because several terms in the Taylor series are contributing; higher order terms, as Dr. Herget shows, are being added with lower order terms and so forth), but if you pick an interval under any assumption whatsoever, Mr. Hastings' comment of yesterday is the key to the whole situation: the error curve for a certain degree of approximation is going to look just about the same. It will change a little bit. He said Tchebyscheff was the zeroth order approximation to that curve. It will change a little, but the position of the extrema is very stable. Therefore, you are going to make an error of E at the position where you think those extrema are going to occur; and, even if the function doesn't behave quite the way its next higher term indicates it should, the extrema aren't going to shift very much. Therefore, the value of the actual error curve obtained when you use the table will not be more than a tenth of a per cent or a hundredth of a per cent greater than the theoretical E, unless you come to a curve so very bad that the error curve doesn't look anything like a Tchebyscheff polynomial; and, of course, we can always invent such curves. But I think they are going to be quite hard to invent. I also expect that the rational function is going to have a very stable error curve, what Professor Tukey referred to as the Tchebyscheff polynomial for rational functions. But I don't have that as yet, and I don't know whether Mr. Hastings has.

Professor Kunz: I think one of the important things in this talk is that Dr. Grosch has reminded us that there are other ways of interpolating. Just as a very trivial suggestion: if you take a hyperbola and pass it through three points, this gives you second order interpolation in a more general sense. Some of you who haven't tried similar things might like to try it. One of the interesting properties is that if you try inverse interpolation with such a function, it is just as easy as direct interpolation. You can obtain second order inverse interpolation very nicely. I have used this in quite a few cases, and it sometimes yields a very nice fit to curves, particularly if they have a singularity somewhere in the region. It is just a suggestion as a way to become initiated to these reciprocal differences, which are a little forbidding and are awfully hard to integrate.

Professor Tukey: I cannot agree with Professor Kunz in the use of the word "interpolation." The essential point about this is that we have given up interpolating, just as we have given up expanding in series. We are trying to get something that is a good fit, and that is a different problem.

Mr. Bell: While we are on the subject of tables, I would like to point out another way of getting a fit to curves of various sorts. It is a widespread opinion among engineers that a problem which involves curves of some sort cannot be done on punched cards. I am talking, of course, about engineers who have hearsay knowledge of IBM equipment. Now, this is not true.
All you have to do is read a lot of points. With the points you can obtain first differences and set up a linear interpolation, which can be done quite quickly. Of course, this is completely non-elegant, but very practical. Many times you have whole families of curves. We, in our organization, are fortunate in having rapid ways of reading such data. We can read, say, a thousand points from families of curves in maybe an hour and be ready to go on a problem without mathematics.

A Description of Several Optimum Interval Tables

STUART L. CROSSMAN

United Aircraft Corporation

THE SEVERAL optimum interval tables included in this paper were constructed for linear interpolation on the IBM Type 602 Calculating Punch in an endeavor to accelerate the process of table look-up. In each case a critical table of thousands of lines was reduced to a table of fewer than two hundred lines. The number of lines per table might have been still further reduced by using a higher order of interpolation, but this was not desirable, since the interpolating time on the type 602 calculating punch is approximately proportional to the order of interpolation. The following tables were constructed:

  Table  Function                       Accuracy   Range and Interval                    Size
  A      e^t                            5 x 10^-5  t = -1.7000 (.0114 - .0102) -0.4000   104 cards
         e^-t                           1 x 10^-4
  B      1 - r^(2/7)                    1 x 10^-5  r = .30000 (.00350 - .00013) .99900   192 cards
         (1 - r^(2/7))^(1/2) / r^(2/7)  1 x 10^-5
  C      Arc cosh x                     1 x 10^-4  x = 1.0002 (.0001 - 1.0500) 27.3600   132 cards

FIGURE 1 (the function f(x) replaced by a straight line on the sub-interval of width w_i running from x_i to x_{i+1})

Tables A and B each consist of two functions with a common argument. This arrangement was convenient in that both functions of a given table were needed, at the same time, in the particular problem for which the table was constructed. Only one sorting operation is necessary to file the table cards with the detail cards in preparation for the interpolation of both functions. Including two functions in a table with a common argument may result in a slightly larger number of lines than would be obtained for either of the functions if each were optimized independently. However, the additional lines are of little consequence, since an entire sorting operation is eliminated.

The tables were constructed using a method developed by Dr. Herbert R. J. Grosch of the Watson Scientific Computing Laboratory. The method consists of dividing the interval (a, b), upon which the required function is to be approximated, into a number of sub-intervals, upon which the function is replaced by straight lines of the form (Figure 1)

  f(x) = b_i + m_i (x - x_i) .   (1)

For the functions included in this paper it is convenient that b_i be referenced to the vertical axis at x_i for each interval, thereby limiting its magnitude. The optimum sub-intervals are determined by the expression

  w = 4 √( ε / |d²f/dx²| ) ,   (2)

where w = the tabular interval, d²f/dx² = the second derivative of the function f(x), and ε = the maximum theoretical error between the approximate value and the true value of the function. The number of lines, N, can be found approximately from the expression

  N(a, b) = ∫ from a to b of dx/w .   (3)

Since each of the tables was constructed in the same manner, a description of Table B will illustrate the details of construction. This table was constructed such that the functions are everywhere represented on the interval to an accuracy of 1 x 10^-5 (total error).
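Equations (2) and (3) are easy to put to work. The following Python sketch (ours; the numerical second derivative and the marching from the left end are conveniences, not Mr. Crossman's actual procedure) generates optimum sub-intervals for the Table C function arc cosh x and counts the lines:

  import math

  def second_derivative(f, x, h=1.0e-5):
      return (f(x - h) - 2.0 * f(x) + f(x + h)) / (h * h)

  def optimum_breakpoints(f, a, b, eps):
      """March equation (2), w = 4*sqrt(eps/|f''|), from a to b."""
      x, points = a, [a]
      while x < b:
          w = 4.0 * math.sqrt(eps / abs(second_derivative(f, x)))
          x = min(x + w, b)
          points.append(x)
      return points

  pts = optimum_breakpoints(math.acosh, 1.0002, 27.36, 1.0e-4)
  print(len(pts) - 1, "lines")   # compare with the 132 cards of Table C

The count so obtained approximates the integral (3): intervals near x = 1, where the second derivative is large, come out near .0001, and near the right end they exceed unity, in agreement with the range and interval column of the table.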
The total error consists of the sum of the theoretical error (ε) between the straight line and the function, the rounding error when interpolating (one-half unit in the last place), and the departure of b_i from theory due to rounding. Hence, for each of the two functions

  f(r) = 1 - r^(2/7) ,   (4)
  F(r) = (1 - r^(2/7))^(1/2) / r^(2/7) ,   (5)

the value of ε is as follows:

  1 x 10^-5 = ε + .5 x 10^-5 + .05 x 10^-5 ,  so that  ε = .45 x 10^-5 .   (6)

It will be noted that the inclusion of the rounding error when interpolating insures that the rounded interpolated value of the function agrees within 1 x 10^-5 with the true unrounded value of the function.

An examination of the second derivatives of the functions discloses the direction of increasing w; i.e., the successive values of w increase as the absolute values of the derivative decrease. For the function f(r), the intervals increase from left to right throughout the range of the function. However, for F(r) the intervals increase from left to right up to the inflection point, at r = .73115, for the left-hand portion of the curve, and from right to left for the right-hand portion of the curve. Hence, for the function F(r) the two sections of the curve are treated independently and the intervals calculated from r = .30000 and from r = .99900 toward the inflection point.

The second derivative of f(r) and the value of ε, when substituted in (2), give

  w = 18.782971 x 10^-3 r^(6/7) .   (7)

Beginning at r = .30000 and substituting in (7), the first interval was computed. The interval is added to the value of r to establish the new r for calculating the next interval. The process is repeated through r = .99900. Each value of w is calculated to seven places, but the last two places are dropped and the unrounded five-place value used to establish the starting point for the next interval.

The second derivative of F(r) and the value of ε, when substituted in (2), give

  w = 59.396970 x 10^-3 r^(8/7) (1 - r^(2/7))^(3/4) / (8r^(4/7) - 27r^(2/7) + 18)^(1/2) .   (8)

The intervals were then computed for each portion of the curve. As the value of r approaches the inflection point, the intervals increase and become infinite at the inflection point. The intervals which cross the inflection point are shortened to end at that point.

A comparison of the intervals for the two functions disclosed that the intervals for F(r) were smaller than those for f(r) over the complete range, and were therefore the controlling factor in establishing the tabular values of r for both functions. The values of the functions for each argument were then calculated with seven-place logarithm tables.

The final step in the preparation of the table involved the calculation of the interpolation coefficients for each interval. This was accomplished on the type 602 calculating punch. The straight lines which approximate the function f(r) have the particular form

  f(r) = b_i - m_i (r - r_i) ,  r_i <= r <= r_{i+1} .
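For Table B the interval marching can be sketched directly from equation (7) (Python, ours; the five-place truncation follows the text):

  import math

  r, count = 0.30000, 0
  while r < 0.99900:
      w = 18.782971e-3 * r ** (6.0 / 7.0)   # equation (7)
      w = math.floor(w * 1.0e5) / 1.0e5     # keep the unrounded five-place value
      r = round(min(r + w, 0.99900), 5)
      count += 1
  print(count, "intervals for f(r) alone")
  # Equation (8) gives smaller intervals for F(r); those controlled the
  # 192-card table, the two functions sharing the same tabular arguments.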
... is not contained in q_{s-1}. Then x1^{a1} x2^{a2} ... x_{t-1}^{a_{t-1}} is not contained in q_{s-1}. This is contrary to the definition of q_{s-1}, which must contain all terms such that the sum of the exponents is s - 1.

The Monte Carlo Method and Its Applications*

M. D. DONSKER and MARK KAC

Cornell University

*This paper (except for the two appendices) was written while the authors were associated with the National Bureau of Standards at the Institute for Numerical Analysis. It appears in the Journal of Research of the National Bureau of Standards under the title "A Sampling Method for Determining the Lowest Eigenvalue and the Principal Eigenfunction of Schrödinger's Equation." The preparation of the paper was sponsored (in part) by the Office of Naval Research.

CERTAIN PROBLEMS leading to complicated partial or integro-differential equations have recently been approached, and some actually solved, by utilizing various probability techniques and sampling methods. Collectively, these methods have become known as the Monte Carlo Method.

The problems to which Monte Carlo techniques have been applied seem to be divided into two types. Typical of the first type is the problem of neutrons diffusing in material media, in which the particles are subjected not only to certain deterministic influences but to random influences as well. In such a problem, the Monte Carlo approach consists in permitting a "particle" to play a game of chance, the rules of the game being such that the actual deterministic and random features of the physical process are, step by step, exactly imitated by the game. By considering very large numbers of particles, one can answer such questions as the distribution of the particles at the end of a certain period of time, the number of particles to escape through a shield of specified thickness, etc. One important characteristic of the preceding approach is that the functional equation describing the diffusion process is by-passed completely, the probability model used being derived from the process itself.

A more sophisticated application of Monte Carlo Methods is to the problem of finding a probability model or game whose solution is related to the solution of a partial differential equation, or, as in the present paper, to determine the least eigenvalue of a differential operator by means of a sampling process. As an example of how the latter problem might be attacked, we quote from a paper of Metropolis and Ulam:(1)

"For example, as suggested by Fermi, the time independent Schrödinger equation

  Δφ(x, y, z) = (λ - V) φ(x, y, z)

could be studied as follows. Reintroduce time dependence by considering

  u(x, y, z, t) = φ(x, y, z) e^(-λt) ;

then u will obey the equation

  ∂u/∂t = Δu - Vu .

This last equation can be interpreted, however, as describing the behavior of a system of particles each of which performs a random walk, i.e., diffuses isotropically and at the same time is subject to multiplication, which is determined by the value of the point function V. If the solution of the latter equation corresponds to a spatial mode multiplying exponentially in time, the examination of the spatial part will give the desired φ(x, y, z) corresponding to the lowest 'eigenvalue' λ."(a)

(a) To the best of our knowledge, this method has not been tried out numerically.

The main purpose of the present paper is to present an alternative method for finding the lowest eigenvalue and corresponding eigenfunction of Schrödinger's equation. The chief difference between the two approaches is that ours involves only a random walk, eliminating entirely the multiplicative process. This alteration in the model seems to simplify the numerical aspects of the problem, especially if punched card equipment is to be used. Apart from the possible numerical simplification, the method is based on a mathematical theory which in itself is of some interest.

The Mathematical Theory

Let X1, X2, X3, ... be independent, identically distributed random variables, each having mean 0 and standard deviation 1, and let S_k = X1 + X2 + ... + X_k. Under certain general assumptions on V(x), the most severe of which is that V(x) be non-negative, it can be shown(2) that the limiting distribution function σ(α; t) of the random variable

  (1/n) Σ over k <= nt of V(S_k / √n)   (1)

is such that
  ∫ from 0 to ∞ ∫ from 0 to ∞ of e^(-α - st) dα σ(α; t) dt = ∫ from -∞ to ∞ of ψ(x) dx ,   (2)

where ψ(x) is the fundamental solution of the differential equation

  (1/2) d²ψ/dx² - [s + V(x)] ψ = 0 ,   (3)

subject to the conditions ψ(x) → 0 and |ψ'(x)| → 0 as x → ±∞.

This latter source of error is especially significant since, as will be apparent shortly, it is impractical from other points of view to take t very large. All of this difficulty may be obviated by considering (7) for two distinct values of t, say t1 and t2; then, if the exponentials after the first are neglected as before, on dividing there results an expression from which λ1 may be computed. Further, let

  E_t+(ξ) = lim as n → ∞ of E{ e^(-(1/n) Σ over k <= nt of V(ξ + S_k/√n)) ; S_nt > 0 } ,   (16)
  E_t-(ξ) = lim as n → ∞ of E{ e^(-(1/n) Σ over k <= nt of V(ξ + S_k/√n)) ; S_nt < 0 } ,   (17)

the mathematical expectations on the right-hand sides being conditional expectations under the conditions S_nt > 0 and S_nt < 0, respectively. Thus

  (1/2) [E_t+(ξ) - E_t-(ξ)] = Σ over j of e^(-λ_j t) ψ_j(ξ) ∫ from -∞ to ∞ of p(x) ψ_j(x) dx ,   (18)

where p(x) = +1 for x > 0 and p(x) = -1 for x < 0. If t1 and t2 are sufficiently large,

  λ2 ≈ (1/(t2 - t1)) log [ (E_t1+(ξ) - E_t1-(ξ)) / (E_t2+(ξ) - E_t2-(ξ)) ] .   (19)

From the discussion of section 2 it should be clear how one applies (19) when the data of Table II are available. One must see to it that ξ is so chosen that ψ2(ξ) ≠ 0; otherwise a higher eigenvalue may have been calculated. From the data of Table II the following is obtained:

  λ2 ≈ 1.1 ,

whereas the exact value is √2 = 1.41. The poor agreement could have been expected in view of the low accuracy in the calculation of λ1. It is, of course, understood that V(x) is assumed to yield a discrete spectrum. In the case when a continuous spectrum is also present, the formula has to be modified, but the calculations of the lowest eigenvalue are, in general, not affected.

In conclusion, we wish to thank Dr. E. C. Yowell of the National Bureau of Standards for wiring the control panels and for excellent supervision of all the punched card work.

APPENDIX I

We give here an intuitive approach to the mathematical theory of section 1. This approach was suggested to us by Dr. G. A. Hunt.

Consider the simple random walk in steps ±Δ (Δ = 1/√n), each step being of duration τ. At each stage the particle has the probability τV(S_k Δ) = (1/n) V(S_k/√n) of being destroyed, where S_k Δ is the displacement of the particle after time kτ (k steps). In the limit as n → ∞ (Δ → 0, τ → 0, Δ²/τ = 1) we are led to a continuous diffusion process with destruction of matter governed by the function V(x) > 0. The probability Q(x, t) dx that the particle will be found between x and x + dx at time t can be found by calculating the Green's function (fundamental solution) of the differential equation

  ∂Q/∂t = (1/2) ∂²Q/∂x² - V(x) Q ,

i.e., that solution of the equation which for t → 0 satisfies Q(x, t) → δ(x). The integral

  ∫ from -∞ to ∞ of Q(x, t) dx

represents the probability that the particle will survive during the time interval (0, t). In terms of the eigenvalues and normalized eigenfunctions of the Schrödinger equation (4) we can express Q(x, t) as follows:

  Q(x, t) = Σ over j of e^(-λ_j t) ψ_j(0) ψ_j(x) .

Finally,

  ∫ from -∞ to ∞ of Q(x, t) dx = Σ over j of e^(-λ_j t) ψ_j(0) ∫ from -∞ to ∞ of ψ_j(x) dx ,

and it remains to verify that

  ∫ from -∞ to ∞ of Q(x, t) dx = ∫ from 0 to ∞ of e^(-α) dα σ(α; t) .

First note that the expectation (average) of

  e^(-(1/n) Σ over k <= nt of V(S_k/√n)) = e^(-τ Σ over k <= nt of V(S_k Δ))

approaches ∫ from 0 to ∞ of e^(-α) dα σ(α; t) as n → ∞ (because the distribution function of (1/n) Σ over k <= nt of V(S_k/√n) approaches σ(α; t) and V(x) > 0). On the other hand, using the approximation

  e^(-τ V(S_k Δ)) ≈ 1 - τ V(S_k Δ) ,

note that the product of the factors 1 - τ V(S_k Δ) is approximately the probability of survival of the particle if its consecutive displacements are S1Δ, S2Δ, ..., S_nt Δ. In taking the expectation we average the probability of survival over all possible choices of successive displacements (all possible paths) and thus obtain the unconditional probability of survival. This unconditional probability of survival approaches, as n → ∞, the integral of Q(x, t); on the other hand, it also approaches the integral of e^(-α) dα σ(α; t), which establishes the equality.

Although it has been assumed that V(x) > 0, all considerations are applicable to potentials which are bounded from below. Although atomic and molecular potentials become negatively infinite, they can be cut off sufficiently low without changing appreciably the eigenvalues and the eigenfunctions.
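The quotient device above lends itself to a small simulation. The following Python sketch is ours: the potential V(x) = x², the walk length, and the sample counts are assumptions, not the values of the original punched card experiment. For this V the lowest eigenvalue of -(1/2) d²/dx² + V is √2/2, approximately 0.707:

  import math, random

  def E_t(t, n, samples):
      """Average over random walks of exp(-(1/n) * sum over k <= nt of V(S_k/sqrt(n)))."""
      V = lambda x: x * x                  # assumed oscillator potential
      steps = int(n * t)
      total = 0.0
      for _ in range(samples):
          s, acc = 0, 0.0
          for _ in range(steps):
              s += random.choice((-1, 1))  # steps of mean 0, deviation 1
              acc += V(s / math.sqrt(n))
          total += math.exp(-acc / n)
      return total / samples

  random.seed(12345)
  n, samples = 100, 20000
  t1, t2 = 2.0, 4.0
  lam1 = math.log(E_t(t1, n, samples) / E_t(t2, n, samples)) / (t2 - t1)
  print(lam1)   # a rough estimate of sqrt(2)/2 ~ 0.707

Since E_t behaves like a sum of decaying exponentials, dividing the two averages cancels the unknown coefficient of the leading term, exactly as in the text; the statistical error decreases only as the square root of the number of sample walks.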
APPENDIX II

In this appendix we sketch a Monte Carlo Method for finding the lowest frequency of a vibrating membrane. In mathematical terms: find the lowest eigenvalue of the equation

  (1/2) Δu + λu = 0 ,

valid in a region Ω and subject to the boundary condition u = 0 on the boundary Γ of Ω. (This corresponds to the case of the clamped membrane. The lowest frequency is the square root of twice the lowest eigenvalue.)

Cover the region Ω with a square net, the side of the square being Δ. Starting from a point (x0, y0) = (mΔ, nΔ) inside Ω, consider a two-dimensional random walk in which a point is allowed to move to each of its four nearest neighbors, the choices being equiprobable (probability 1/4). The duration of each step is τ, related to Δ by the equation Δ²/τ = 1. Consider the boundary Γ of Ω as an absorbing barrier. This means that whenever the particle, performing the random walk, leaves the region Ω, it is destroyed. In the limit as Δ → 0, the probability Q(x0, y0; t) that the particle will survive during the time interval (0, t) can be obtained as follows:

  Q(x0, y0; t) = double integral over Ω of P(x0, y0 | x, y; t) dx dy ,

where P(x0, y0 | x, y; t) is the fundamental solution of the differential equation

  ∂P/∂t = (1/2) ΔP ,

subject to the boundary condition P = 0 on Γ, and the initial condition

  P(x0, y0 | x, y; t) → δ(x - x0) δ(y - y0) ,  as t → 0 .

This fundamental solution can be expressed in terms of the eigenvalues and normalized eigenfunctions of the membrane problem as follows:

  P(x0, y0 | x, y; t) = Σ over j of e^(-λ_j t) ψ_j(x0, y0) ψ_j(x, y) ,

and consequently

  Q(x0, y0; t) = Σ over j of e^(-λ_j t) ψ_j(x0, y0) double integral over Ω of ψ_j(x, y) dx dy ,

so that

  λ1 = lim as t → ∞ of [ - log Q(x0, y0; t) / t ] .

The probability Q(x0, y0; t) can be calculated by a sampling method in the following way. Start N0 independent particles from (x0, y0) and let each of them perform the random walk described above. Watch each particle and count the number Nt of those particles which have not left the region during the time interval (0, t). Then

  λ1 ≈ - (1/t) log (Nt / N0) .

The practicality of this method has not been tested.

REFERENCES

1. N. METROPOLIS and S. ULAM, "The Monte Carlo Method," Journal of the American Statistical Association (September, 1949), pp. 335-341.
2. MARK KAC, "On Distributions of Certain Wiener Functionals," Trans. Am. Math. Soc. 65 (1949), pp. 1-13.

DISCUSSION

Mr. Bisch: I was very much interested in the clear discussion and the problem you chose, which is part of a problem we have to solve. My first question concerns that function, V(x) or V(x, y, z).
Could you have that function determined experimentally, in other words, not expressed algebraically?

Professor Kac: Yes.

Mr. Bisch: The second question is about the boundary: could you in this method leave the boundary as a simple unknown temporarily?

Professor Kac: To which problem are you referring? Do you mean here there is no boundary?

Mr. Bisch: There is no boundary in a problem of the membrane. You made the boundary zero; in other words, your deflection was zero. Could you leave that deflection temporarily unknown, as a quantity like u0?

Professor Kac: I think so. Actually, all I can say with certainty is the following: if you have the other boundary condition, du/dn = 0, the other classical condition, then all you have to do is to make the boundary not absorbing but reflecting. The mixed boundary condition au + b(du/dn) = 0 can again be handled in principle by making the boundary partially reflecting, partially absorbing. When you come to the boundary you play an auxiliary game, which decides whether you are going to throw the particle out or keep it in the game. You see, this is only an eigenvalue problem. Consequently, you cannot just say that the solution must be f(x) on the boundary, because that would not give a characteristic value problem, and this is designed primarily to find eigenvalues. On the other hand, if it comes to Laplace's equation with a prescribed boundary condition, then Dr. Yowell will speak about a random walk method which will do that. In fact, they have some experiments in the case of Laplace's equation in a square.

Dr. King: One should emphasize the point of view of accuracy. I don't believe there is any hope of getting, say, six significant figures out of a Monte Carlo Method.

Professor Kac: Agreed.

Dr. King: But I disagree a bit with your point of view that it is worth doing, even in the very simplest cases, if you are not interested in accuracy. I think for the same amount of work you could work the harmonic oscillator with other methods and get one significant figure or two.

Professor Kac: If I understood you correctly, you are saying that apart from the practical applications it is interesting because of the principle involved. Is that correct?

Dr. King: Yes.

Professor Kac: There I definitely agree.

Dr. King: I think it is still practical, though, apart from being an amusing experimentation; it is a practical method if you are interested in an engineering problem, so that you only need a couple of figures.

Professor Kac: With that I agree. As a matter of fact, it has another advantage, actually, which bore some fruit, not particularly exciting, but this way of looking at it produces even results of some theoretical interest. For instance, I am able, although I won't do it here because it would be a bit too technical and probably too tedious, to justify the so-called WKB method, at least one aspect of it, by diffusion analogies; and there are other new viewpoints. If you can look at something from different points of view, it is certainly helpful, and often practical. On the other hand, for someone who has the tough attitude ("You give me the lithium molecule to ten per cent, or I won't take the method"), of course, one would still have to see what one can do, and, actually, I agree with you. What I am trying to do is to bend over backwards in being cautious about it.

Professor Tukey: It seems to me that there is a point to be made that came out in the discussion at the last conference.
That is, that one point of view for the use of Monte Carlo in a problem is to quit using Monte Carlo after a while. That, I think, was the conclusion that people came to then, and it is the natural evolution and perhaps the desirable thing. After you play Monte Carlo a while, you find out what really goes on in the problem, and then you don't play Monte Carlo on that problem any more. I think the thing to suggest here is that, by the time people have played Monte Carlo on the lithium atom, perhaps, or the lithium molecule, or something more complicated, people will get to the place where they won't be playing this simple Monte Carlo any more; they will be playing Monte Carlo in some peculiar space where you have obtained approximations to the wave functions as your coordinates, and not x, y, and z; and then you will start to get more for a given amount of machine time. This is going to get arbitrarily complicated. When you start to cross-breed Monte Carlo with a standard method, and that isn't too far away in the handling of hard problems, you are going to have to do something like that.

Professor Kac: There is confirmation of that in the rather extensive calculations performed with relatively simple equipment built for that purpose at Cornell, by Dr. Wilson and collaborators, in connection with his study of cosmic ray showers. They found the Monte Carlo Method most valuable because it showed them what goes on. I mean the accuracy was relatively unimportant. The five per cent or the seven per cent accuracy they obtained could be considered low; but all of a sudden they got a certain analytic picture from which various guesses could be formulated, some of them of a purely analytical nature, which later on turned out to verify very well. As a matter of fact, that is certainly one aspect of Monte Carlo that should be kept in mind. I agree that one of the purposes of Monte Carlo is to get some idea of what is going on, and then use bigger and better things.

A Punched Card Application of the Monte Carlo Method*

P. C. JOHNSON and F. C. UFFELMAN

Carbide and Carbon Chemicals Corporation

IN THIS PAPER a description will be given of a punched card technique for applying the Monte Carlo Method to certain neutron problems. It should be kept in mind that this procedure is designed for a machine installation consisting of a standard type 602 or 604 calculating punch, a collator, a reproducer, a sorter, and an accounting machine. Other combinations of machines would require a different approach, and an installation with a card-programmed electronic calculator would use an entirely different technique from the one about to be described. In any event, the problem may be stated as follows:

Assuming a monochromatic point source of neutrons of given energy within an infinite medium of known constituents, hypothetical case histories for a number of these neutrons will be built up as they undergo a series of random collisions with the constituents of the selected medium. These collisions result in either an absorption or an elastic scattering of the neutrons, and the main work of this problem is to follow the selected neutrons through successive collisions until they either are absorbed or fall below a certain energy level. For each collision of each neutron the following will be recorded: the type of collision undergone; the energy loss per collision,
Δu; the total energy loss at the end of the current collision, u; the distance traveled between collisions, p; the z-component of that distance, Δz; the z-component of the total distance traveled, z; the angle of deflection after the collision, ω; and the direction cosine with the z-axis, μ. When these data have been recorded, they may be used for various statistical studies to determine such things as how far neutrons of a given energy may be expected to penetrate a certain substance. An example of how these data may be used will be given subsequently.

To begin any such series of calculations, an arbitrary number of neutrons, say 1,000 numbered from 000 to 999, are selected and permitted, on punched cards, to go off in a random direction, travel a random distance, hit a type of particle determined in a random fashion, and lose an amount of energy determined also in a random way. For each neutron, a collision card is made bearing the neutron number, the collision number, and all the data (distance, direction, etc.) pertaining to that particular collision. Then the neutrons are again permitted to go off in a random direction, travel a random distance, hit some type of particle, and lose energy. These data are recorded on a second set of collision cards, one for each neutron; in addition, summary data for the neutron's history to date are computed, such as the total distance traveled along the z-axis during both collisions, the total energy lost during both collisions, and the direction cosine with the z-axis. In this manner the neutrons are carried from collision to collision, until all of them either are absorbed in undergoing a collision, or drop, through successive collisions, below a stipulated energy level, at which point they are no longer useful for the purposes of the problem.

In the sequence of calculations described here, the energy loss, u, and the direction cosine, μ (both of which are independent of distance), will be calculated first for each collision of all neutrons. The distance, z, which is directly dependent on μ and indirectly upon u, is then calculated for successive collisions of each neutron; z could be calculated at the same time that u and μ are calculated, if the capacity of the machines used so permitted, or u, μ, and z might be calculated in separate, successive operations in that order if machine capacity is limited. It is understood that in actual practice many of the operations which will be described as separate steps should be combined; the method of combination will depend on the machines available, the training of the operator handling the problem and, of course, the particular problem involved (magnitude of numbers, etc.).

For this particular example, one thousand neutrons will be traced in a medium of two constituents. An elastic scattering as the result of a collision with an atom of one kind will be designated a type 1 collision; with the other type of atom, a type 2 collision. Any collision resulting in absorption will simply be called absorption, without regard to the type of atom hit.

Three decks of cards are assembled for the preparation and actual computation of the problem described. They are:

1. Probability cards (card layout 1)
2. Master cards (card layout 2)
3. Collision cards (card layout 3)

*This paper was presented by Edward W. Bailey.
CARD LAYOUT 1
Title: PROBABILITY CARD. Card No. 1. Card Color: Solid Blue. Source: Various Tables and Computations.
Fields: u (+xx.xxx), columns 53-57; A1(u) (.xxxx), columns 68-71; A0(u) (.xxxx), columns 72-75; λ (+.xxx).
A1(u) is the probability of a collision of type 1 or of absorption. A0(u) is the probability of absorption. The probability of a collision of type 2 is 1.0000 - A1(u). All values on the card are key punched from tables.

CARD LAYOUT 2
Title: MASTER CARD. Card No. 2. Card Color: Solid Red.
Fields: K² (+.xxxx); K (+.xxx); ω (±.xxx); √(1 - ω²) (+.xxx); μ (±.xxx); √(1 - μ²) (+.xxx); πr3 (xx.xxx), reduced for the 90° deck; cos πr3 (±x.xxx); ln r4 (-x.xx); Δu (+x.xxx); collision code; and the arguments μ, r2, r3, r4 (.xxx each). The number of the operation in which a field is derived is given in parentheses to the left of the card columns; see the "Master Cards" operations for the step involved.

The probability cards (card layout 1), of which there are some twenty or thirty, are key punched from probability tables calculated for the particular problem at hand. Each card contains, in columns 53-57, a value u which is the lower value of the energy range which the card represents, and in columns 68-71 and 72-75 the probabilities, associated with the particular energy range, of absorption or scattering with the different types of particle involved. Thus, A1(u) is the probability of a type 1 scattering or of absorption, and A0(u) is the probability of absorption. The probability of a type 2 scattering is then 1.000 - A1(u). Probability card 1 also contains the value λ, which is the mean free path, or probability of scattering within a given distance, associated with the energy range.

The deck of master cards, of which there are 2,000 in this example (card layout 2), contains all the tabled values, such as functions of angles and natural logarithms, which must be searched from time to time during the problem, as well as some calculated values which appear over and over again and hence can be intersperse gang punched more readily than calculated each time. The arguments on the master cards are μ, r2, r3, and r4, and on any one card the four arguments are the same: they are put in the four different fields as a convenience to the operator, the fields on the master card being then lined up with those on the collision card. Since all of the arguments are three-digit numbers, our master deck will consist of 1,000 cards representing the 1,000 different choices of a three-digit number (000-999, inclusive) and having a collision type code of 1,
is intersperse gang punched from a sin cos deck; 7 rs is multiplied by 71", reduced to an angle less than 90°, the absolute value of whose cosine equals that of 7I"rs , and the angle is punched in the card while the sign is punched in column 23 ; 8 cos 7I"rs is intersperse gang punched into columns 20-23 from the sin cos deck; 9 In r 4 is intersperse gang punched from an In deck; 10 6. u = -In k 2 is intersperse gang punched from an In deck. This completes the data on the master cards. It should be noted that a number of the steps could be combined in actual computation, such as steps 1, 2, 4, and 7. Also, in the absence of suitable function decks, some of the functions could be calculated instead of intersperse gang punched. 2 39. 40. 41. 42. 43. 44. ~ 46. (1) 47. 48. Title: COLLISION CARDBASIC "RANDOM NUMBER" CARD Card No. 3 Card Color: Plain Manila Source: Computations and Random Number Tables (1) jJ- Card No.3 2. 3. 4. 5. 6. 7. 8. 9: 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 0' (2) ~1: ID- cos 7rra (±x.xxx) 23. 24. 25. 26. 27. 28. 2~: ID- (3) 31. 32. 33. 34. 35. 36. 37. 38. 1nr4 (-x.xx) rl (.xxxx) (to determine collision type) 50. 51. 52. 53. 54. 55. 56. 57. + CARD LAYOUT 3 Serial Number r2 (.xxx) (to determine energy loss) ra (.xxx) (to determine azimuth) r4 (.xxx) (to' determine distance traveled between collisions) 58. 59. The random numbers r 1 , r 2 , r s, and r 4 are independent of each other. The number of the operation in which a field is derived is given in parentheses to the left of the card columns. See "Collision Cards" Operations for step involved. Collision Cards The collision cards (card layout 3) are begun as "random number" cards. An estimate is made, on the basis of how many neutrons are involved and how many collisions are expected, of the number of collision cards which will be needed. The following steps are then taken to obtain the information required concerning the case history of each neutron. These steps are numbered in accordance with the steps listed on the table of operations for the collision cards. Step No. Function 1 The estimated number of collision cards required is made up with card number (3) in column 1 and random digits, which will be the basis for the random choices to be made for each neutron, in columns 6880. The first four of the random digits are called r H the next three r 2 , the next three r s , and the last three r 4 • The source of the random numbers, which may be either taken from tables or calculated, should be recorded to avoid choosing them in the same way in a future problem of the same nature. The cards should be numbered serially to preserve the order. 2 Since cos 1Trs and In r 4 are dependent only on random numbers, they may be put in all of the basic collision, or "random number," cards before calculation is started. The cards are sorted to order of r s, matchmerged with the master deck, and cos 1Trs intersperse gang punched into the random number cards. 3 Similarly, In r 4 is intersperse gang punched from the master deck into the random number cards. The random number, or basic collision, cards then look like card layout 3 and are ready to be developed into com- 85 SEMINAR 4 5 6 7 S 9 10 plete collision cards. They are next sorted back to serial number order to restore the randomness of the random numbers. 
4   The first 1,000 of the random number cards are numbered consecutively from 000 to 999 in columns 2-4; 01 is emitted into columns 5-6; 0's are emitted into field 53-57; and μ = 1 - 2r0, where r0 is chosen from the other random numbers on the card, is calculated and punched in columns 24-26. Each of these cards will then represent one of the thousand neutrons as it undergoes its first collision, having an initial energy loss u of zero and a random direction cosine μ.
5   The thousand cards are sorted to order of μ, match-merged with the master deck, and √(1 - μ²) intersperse gang punched into the collision cards.
6   The cards are then sorted into ascending order of the value of energy loss, u, in columns 53-57 (which will, of course, be zero for all cards for the first collision, but not thereafter) and match-merged with the probability cards on columns 53-57, so that for each energy range there is a probability card followed by all the collision cards in the particular range. Collision cards having an energy level less than the specified limit are given a collision code of 3 and set aside, since they will undergo no more collisions. λ is intersperse gang punched from the probability cards into the remaining collision cards.
7   The merged deck of probability cards and collision cards is run through the collator, reading from each probability card the probability A0(u) of absorption and selecting from the following collision cards all cards whose r1 < A0(u). These cards are coded 0 in column 67 and set aside, since they will undergo no more collisions.
8   The remainder of the collision cards, which are still merged with the probability cards, are recollated, this time comparing r1 with A1(u). Collision cards whose r1 < A1(u) are coded 1 in column 67 to indicate a type 1 collision, while cards whose r1 > A1(u) are coded 2 to indicate a type 2 collision.
9   Collision cards having a code of 1 or 2 are sorted to the order of r2 by collision code (columns 67, 72-74) and match-merged on those columns with the master deck. K², K, ω, √(1 - ω²), and Δu (all of which depend directly on the collision type and r2) are intersperse gang punched from the master cards into the collision cards. The master deck is then put together again while the collision cards are being sorted to the order of neutron number (for convenience in case a particular neutron has to be checked in the future).
10  A random number card (card layout 3) is merged behind each collision card to become the second collision card for that particular neutron. The merged deck is put in the 604, and μ_i, √(1 - μ_i²), ω_i, √(1 - ω_i²), and cos πr3_i are read, where i is the collision number;

  μ_{i+1} = μ_i ω_i - √(1 - μ_i²) √(1 - ω_i²) cos πr3_i

is calculated and punched in the random number card following the collision card. The collision number is also read from the collision card, increased by one, and punched in the following card. The neutron number is carried forward directly from the collision card to the following random number card, whereupon the random number card becomes the new collision card for the neutron. u_i and Δu are read from the ith collision card, added together, and punched as u_{i+1} in the ith card and as u_i in the (i+1)th card; that is, the total energy loss after the ith collision is the same as the total energy loss before the (i+1)th collision.
11  The cards are then sorted on the collision number, the first collision deck set aside temporarily, and the deck of second collision cards taken through steps 5 to 11, thus generating a deck of third collision cards, and so on until all the neutrons have disappeared. There is, then, a collision card for each collision each neutron has undergone. This card contains the total energy loss of the neutron after the collision and its direction cosine. To complete the data, only the total distance traveled after each collision is needed.
12  To get this value, all the collision cards are merged or sorted into order of collision number within neutron number (columns 2-4, 5-6). The first collision card for neutron 000 is read, and p = λ ln r4 and Δz = pμ are calculated. p and Δz are punched in the first collision card; Δz is punched in the z field in the first collision card and stored as well.
13  The second collision card for the same neutron is read, p and Δz calculated and punched, Δz added to the z stored from the previous collision, and the new z value punched in the second collision card; that is, for any given neutron, z_{i+1} = Δz_{i+1} + z_i, where i is the collision number.

When the z values are computed, calculation of the data pertaining to individual collisions of each neutron is completed (card layout 4). These data can then be grouped and combined for whatever statistical studies are desired. Complete case histories for each neutron are readily available and may be used in the solution of problems of various boundary conditions involving the constituents of the specific problem.

Final compilation of the data takes the form of frequency distributions and factorial moments of these distributions. Two examples of such distributions are:

1. Distribution of the total distance traveled from the source before absorption or before reaching a given energy level.
2. Distribution of the energy loss at each collision.

The factorial moments for these distributions are computed for curve fitting and calculation of various parameters of these distributions. Calculation of these moments involves the use of standard accounting machines and presents no problems.

CARD LAYOUT 4
Title: COLLISION CARD. Card No. 3. Card Color: Plain Manila. Source: Card 3 (Random Number Card, see Card Layout 3).
Fields: particle (neutron) number; collision number; serial number; K² (+.xxxx); K (+.xxx); λ (+.xxx); ω (±.xxx); √(1 - ω²) (+.xxx); u_i (+xx.xxx); u_{i+1} (+xx.xxx); cos πr3 (±x.xxx); Δu (+x.xxx); μ (±.xxx); √(1 - μ²) (±.xxx); collision code; r1 (.xxxx); ln r4 (-x.xx); r2 (.xxx); r3 (.xxx); r4 (.xxx); p (x.xx); Δz (±x.xx); z (±xx.xx). The number of the operation in which a field is derived is given in parentheses to the left of the card columns; see the "Collision Cards" operations for the steps involved.
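For readers who want the logic of the case histories without the card handling, here is a compressed sketch in Python (ours; the masses, probabilities, mean free path, and energy cutoff are placeholder constants, not those of the original problem):

  import math, random

  M = {1: 1.0, 2: 12.0}        # assumed masses of the two scatterer types
  A1, A0 = 0.60, 0.05          # P(type 1 or absorption), P(absorption)
  lam = 1.0                    # assumed mean free path, here energy-independent
  U_CUTOFF = 18.0              # retire neutrons whose total loss u exceeds this

  def history():
      mu = 1.0 - 2.0 * random.random()      # step 4: random starting cosine
      u = z = 0.0
      collision = 0
      while u < U_CUTOFF:
          collision += 1
          r1, r2, r3 = (random.random() for _ in range(3))
          r4 = 1.0 - random.random()        # keep r4 in (0, 1] for the logarithm
          z += -lam * math.log(r4) * mu     # flight: p from ln r4, delta-z = p*mu
          if r1 < A0:
              return "absorbed", collision, u, z
          m = M[1 if r1 < A1 else 2]        # steps 7-8: collision type from r1
          K2 = 1.0 - 4.0 * r2 * m / (m + 1.0) ** 2
          K = math.sqrt(K2)
          w = ((m + 1.0) * K - (m - 1.0) / K) / 2.0
          u += -math.log(K2)                # energy loss, delta-u = -ln K^2
          mu = (mu * w - math.sqrt(1.0 - mu * mu) * math.sqrt(1.0 - w * w)
                * math.cos(math.pi * r3))   # step 10: new direction cosine
      return "below cutoff", collision, u, z

  print(history())

Each call plays one neutron's game of chance; tallying many such histories gives exactly the frequency distributions described above.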
DISCUSSION

Mr. Turner: I am not familiar enough with the details of this calculation to know whether it is possible to go back and reconstruct, shall we say, the x and y coordinates. Can that be done?

Mr. Bailey: Not in this particular problem. I am not very familiar with the work which is being done now. It may be that one of those actually doing the work could answer that question.

Miss Johnson: We can go back on this problem and reconstruct the x and y coordinates, but it wouldn't be practical. We are working with a problem now where all the different coordinates, and time, are being calculated as we go, and we are using six constituents, or six different types of atoms, instead of only two.

Chairman Hurd: In this problem there was no interest in the x and y coordinates, Mr. Turner.

Mr. Turner: I was thinking of its use to get the spectrum, that is, the scattering, for the lower energy particles. If you could get the x and y coordinates, you could rotate your space and from the same data get the distribution for lower energy particles.

Mrs. Dismuke: That is the idea, of course, in the problem we are doing now. We started at higher energies. In the particular problem which Mr. Bailey described, we started at such a low energy that our population probably wouldn't be big enough.

Type of Cards: MASTER CARDS

1. Make basic master cards. On the 604 generate the consecutive numbers 000-999, punching the number for each card in columns 24-26, 72-74, 75-77, and 78-80. Emit a "2" into column 1 and a collision code of "1" into column 67. Make a similar deck with a "2" in column 67.

2. Calculate and punch K² = 1 - 4r2·M(M + 1)^-2. Calculate on the 604, controlling on collision code to emit the proper value of M.

3. Punch K in cards. Sort the cards to the order of K², match-merge with a square root deck, and intersperse gang punch K into the master cards.

4. Calculate and punch ω = [(M + 1)K - (M - 1)/K]/2. Calculate on the 604, controlling on collision code to emit the proper value of M.

5. Punch √(1 - ω²) in cards. Sort the cards to the order of ω, match-merge with a sine-cosine deck (matching ω with the sine), and intersperse gang punch √(1 - ω²) (the cosine) into the master cards.

6. Punch √(1 - μ²) in cards. Same operation as above, except match μ instead of ω with the sine.

7. Calculate and punch πr3 so that cos πr3 can be pulled from the 90° deck. Calculate πr3 directly if r3 < .500; calculate 180° - πr3 if r3 > .500, and punch an X in column 23 to indicate a negative value.

8. Punch cos πr3 in cards. Sort to the order of πr3, match-merge with a 90° sine-cosine deck, and intersperse gang punch cos πr3 into the master cards.

9. Punch ln r4 in cards. Sort to the order of r4, match-merge with a logarithm deck, and intersperse gang punch ln r4 into the master cards.

10. Punch Δu in cards: Δu = -ln K². Sort to the order of K², match-merge with a logarithm deck, and intersperse gang punch ln K² into the master cards, omitting the sign, since the logarithm is negative and we want -ln K².

Type of Cards: COLLISION CARDS

1. Make basic collision, or "random number," cards. Determine approximately how many collision cards will be needed and reproduce, or calculate, random numbers into columns 68-80 of that many blank cards. At the same time, emit a "3" into column 1 of these cards and number the cards serially in columns 45-49. Call columns 68-71 r1, 72-74 r2, 75-77 r3, and 78-80 r4.

2. Punch in each "random number" card the cosine of a random angle of deflection, cos πr3. Sort the "random number" cards to the order of r3 in columns 75-77, match-merge with the master deck on columns 75-77, and intersperse gang punch cos πr3 into the random number cards. (Use only half of the master deck, the collision code 1 or the collision code 2 cards, since cos πr3 is independent of collision type.)

3. Punch on each "random number" card a ln r4 picked at random. Sort the "random number" cards to the order of r4 in columns 78-80, match-merge with half of the master deck (say, the collision code 1 half) on columns 78-80, and intersperse gang punch ln r4 into the random number cards. Sort the random number cards to the order of columns 45-49 to restore randomness.

4. Pick 1,000 neutrons to undergo their first collision with no previous energy loss, and start them in a random direction: collision no. = 01, u = 00.000, μ = 1 - 2r0 for the first collision only. On the 604 generate the consecutive numbers from 000-999, punching the number for each card in columns 2-4. Emit "01" into columns 5-6 and "00.000" into columns 53-57. Choose r0 from r1, r2, r3, and r4 (say, the third digit of each number) and calculate μ = 1 - 2r0. (μ will be chosen in this fashion only on the first collision cards.)

5. Punch √(1 - μ²) in collision cards. Sort the collision cards to the order of μ, match-merge them with the master deck, and intersperse gang punch √(1 - μ²) into the collision cards.

6. Punch λ, the mean free path, in the cards. λ is the probability of scattering in a given range and is dependent on the energy range into which the neutron falls. Sort the collision cards to the order of u (which will be 00.000 on the collision cards for the first collision, but not thereafter), and merge them with the probability cards on columns 53-57 so that, in the merged deck, there will be a probability card for a certain energy range, and behind it all the collision cards with energy u_i within that range, then the next probability card and the collision cards within its range, etc. Check the sequence of the merged deck and then intersperse gang punch λ from the probability cards to the collision cards. Do not sort the probability cards and collision cards apart.

7. Determine whether each neutron (a) fails to undergo another collision because of low energy, (b) is absorbed, (c) undergoes a collision of type 1, or (d) undergoes a collision of type 2. If not (a), then (b), (c), or (d) is a random choice: (a) if the energy loss, u, of the neutron exceeds a certain value, the neutron undergoes no more collisions; (b) if r1 < A0 the neutron is absorbed; (c) if A1 > r1 > A0 the neutron undergoes a type 1 collision; (d) if r1 > A1 the neutron undergoes a type 2 collision. Cards of group (a) are removed from the deck by hand and "3" punched in them as the collision code. Operations (b), (c), and (d) are all done on the collator. Cards falling in group (b) are given a collision type code of "0"; cards falling in group (c) are coded "1"; and cards falling in group (d) are coded "2." Collision cards having a code of "3" or "0" are removed from the deck, since the neutrons these cards represent will undergo no more collisions, while cards having a code of "1" or "2" are carried on through the collision.

8. Pick a random energy loss for each neutron and, associated with this, a value of ω and √(1 - ω²). Use r2 and the collision code for the random choice of K² = 1 - 4r2·M(M + 1)^-2, ω = [(M + 1)K - (M - 1)/K]/2, and Δu = -ln K². Sort collision cards to order of r2 by collision code (columns 67, 72-74), match-merge with the master deck on columns 67, 72-74, and intersperse gang punch K², K, ω, √(1 - ω²), and Δu from the master cards to the collision cards. Sort the collision cards to order of neutron number (for future convenience) while putting the master deck back together.

9. Pick a set of random numbers to be used in calculating the next collision cards for each neutron.
Merge a "random number" card (see card layout 3) behind each collision card, the random numbers on the card to be used in calculating data for the next collision. 10. a. Calculate p,i + I for the next collision. h. Punch the neutron number and new collision number in the new card. P,i+l + Read p" (0, v' 1-(02, v' 1- p,2, cos 7Tr 3 from the old collision cards, calculate P,i+1 and punch it in the new collision cards. p,i(Oi v'1 - p,~ v' 1 - (07 cos 7T r 3 = ColI. No. i +1= ColI. No. i +1 Intersperse gang punch the neutron number from the old collision card to the new collision card. Read the collision number from the old card, increase it by one, and punch it in the new collision card. Read Ui and 6.. u from the old collision card, calculate and punch it in columns 58-62 of the old collision card and in columns 53-57 of the new collision card. c. Calculate Ui + I and punch it in both the old and the new collision cards for each neutron. Ui+1 11. Sort collision cards on the collision number and repeat steps 5-11 until all neutrons have disappeared. 12. Pick a random distance p for each neutron to travel and calculate the distance 6.z traveled along the z-axis. 13. Repeat operation 12 for all collisions of all neutrons starting each neutron at z = o. p = A In r 4 6.z = Zi+1 pp, = 6.z + Z'i Sort or merge cards to order of collision number by neutron number (cols. 2-4, 5-6). Read first collision card for first neutron; calculate and punch p, 6..z, and z, storing z. Read second collision card for first neutron; calculate and punch p, 6..z, and z, etc. A Monte Carlo Method of Solving Laplacls Equation EVERETT C. YOWELL National Bureau of Standards D URI N G the first meeting of this seminar we discussed the solution of Laplace's equation in two dimensions by smoothing techniques. As was pointed out at that time, the smoothing process is only one approach to the problem. It was an approach that was being tested at the Institute for Numerical Analysis because it used a simple, iterative routine that a calculating machine could easily be instructed to follow. A second experimental calculation that we performed, in seeking a simple method of solving Laplace's equation, was a test of a method suggested by Dr. Feller. The basis for this method is completely different from the other methods we mentioned. Dr. Feller was seeking a probability approach to the problem. That is, some sort of a random process is set up such that the probability distribution of the answer given by the process obeys the same differential equation as that of the physical problem. It is to be hoped that the random process offers a simpler computing scheme than any of the direct approaches to the solution of the differential equation, thus making the computational task simpler and less time-consuming. In the case of Laplace's equation in two dimensions, this random process is merely a two-dimensional random walk. In such a walk, uniform steps are made along one or the other of the coordinate directions, the choice being purely random, and in either a positive or a negative direction, the choice once again being random. If such a walk is started at a point inside the boundary of a region, it will eventually end up at the boundary. The number of steps will vary for two different walks starting at the same point, but the average number of steps can be computed. Suppose, now, a large number of walks from a single interior point is made. 
A Monte Carlo Method of Solving Laplace's Equation

EVERETT C. YOWELL

National Bureau of Standards

DURING the first meeting of this seminar we discussed the solution of Laplace's equation in two dimensions by smoothing techniques. As was pointed out at that time, the smoothing process is only one approach to the problem. It was an approach that was being tested at the Institute for Numerical Analysis because it used a simple, iterative routine that a calculating machine could easily be instructed to follow. A second experimental calculation that we performed, in seeking a simple method of solving Laplace's equation, was a test of a method suggested by Dr. Feller. The basis for this method is completely different from the other methods we mentioned. Dr. Feller was seeking a probability approach to the problem. That is, some sort of random process is set up such that the probability distribution of the answer given by the process obeys the same differential equation as that of the physical problem. It is to be hoped that the random process offers a simpler computing scheme than any of the direct approaches to the solution of the differential equation, thus making the computational task simpler and less time-consuming.

In the case of Laplace's equation in two dimensions, this random process is merely a two-dimensional random walk. In such a walk, uniform steps are made along one or the other of the coordinate directions, the choice being purely random, and in either a positive or a negative direction, the choice once again being random. If such a walk is started at a point inside the boundary of a region, it will eventually end up at the boundary. The number of steps will vary for two different walks starting at the same point, but the average number of steps can be computed.

Suppose, now, a large number of walks from a single interior point is made. Each time a boundary is reached, the process is stopped, the value of the function on the boundary is recorded, and the process is repeated from the same starting point. If, after a large number of walks, an average of all the boundary values is noted, that average will approximate the value of the function at the starting point and will converge to that value as the number of walks increases to infinity.

This process was tested on the IBM Type 604 Electronic Calculating Punch. A region bounded by the four lines x = y = 0 and x = y = 10 was selected, and a unit step was made in the random walk. The boundary values were f(10,y) = f(x,10) = 0, f(0,y) = 100 − 10y, f(x,0) = (10 − x)². The random variables were introduced into the problem according to the evenness or oddness of the digits in a table of random digits prepared and furnished us by the RAND Corporation.
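For a reader without a 604 at hand, the whole experiment is easy to restate in a few lines of modern Python. This is only a sketch of the procedure just described, with a library generator standing in for the RAND random digit deck:

    import random

    def boundary_value(x, y):
        # Boundary values of the test problem on the square 0 <= x, y <= 10.
        if x == 10 or y == 10:
            return 0.0
        if x == 0:
            return 100.0 - 10.0 * y
        if y == 0:
            return (10.0 - x) ** 2
        raise ValueError("not a boundary point")

    def walk_estimate(x0, y0, n_walks=1000):
        """Average the boundary values reached by n_walks random walks
        started at the interior point (x0, y0)."""
        total = 0.0
        for _ in range(n_walks):
            x, y = x0, y0
            while 0 < x < 10 and 0 < y < 10:
                if random.random() < 0.5:                  # first choice: which coordinate
                    x += 1 if random.random() < 0.5 else -1
                else:                                      # second choice: which direction
                    y += 1 if random.random() < 0.5 else -1
            total += boundary_value(x, y)
        return total / n_walks   # approximates f at the starting point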
The wiring of the 521 control panel was very simple. The machine was to make two choices, depending on the evenness or oddness of two random digits. Hence, two columns of the random digit cards were wired through two digit selectors, and the outputs of all the even digits of each selector were wired together. These two even outputs were then used to transfer two calculate selectors. The initial value of the coordinates of the starting point was introduced by a special leader card. This carried a control punch in column 79 and the original coordinates in columns 1-4. These values were read directly to factor storage units 1 and 3 and also to general storage units 1 and 3, and these units were controlled to read only on an X punch in column 79. The final value of the answer was punched on a trailer card, which carried a control punch in column 80. The sum of the boundary values, and a tally of the number of times the boundary was reached, were read out of general storage 2 and 4 and punched in the trailer under control of the X in column 80.

The wiring of the 604 control panel was more complicated. The analysis chart is given below. The x and y coordinates are held in factor storage 1 and 3 and in general storage 1 and 3; the sum of the boundary values and the tally are held in general storage 2 and 4. In the original chart a suppress column marks each program step N or P according as it is suppressed on a negative or on a positive balance test; steps 6 through 11 are marked N, and steps 12 through 28 are marked P.

1. Counter RI: GS 1 read out if calculate selector 1 energized, GS 3 read out if not.
2. Emit "1," RI 2nd, counter add if calculate selector 2 energized, subtract if not.
3. Counter RO: GS 1 RI if calculate selector 1 energized, GS 3 RI if not.
4. Emit "1," RI 3rd, counter subtract.
5. Balance test.
6. Counter read out and reset.
7. GS 2 read out.
8. Emit "1," counter add.
9. Counter read out and reset, GS 2 read in.
10. FS 1 read out, GS 1 read in.
11. FS 3 read out, GS 3 read in.
12. Emit "1," RI 3rd, counter add.
13. Emit "1," counter subtract, balance test.
14. Counter read out and reset.
15. Emit "1," counter add, RI 3rd.
16. Counter subtract, RO GS 3 if calculate selector 1 energized, RO GS 1 if not.
17. Read out and reset counter, MQ RI.
18. Emit "1," RI 3rd, counter add only if calculate selector 1 energized.
19. GS 3 RI, counter RO and RS if calculate selector 1 energized; GS 3 RI, MQ RO if not.
20. Multiply, GS 3 RO.
21. Half adjust, RI 2nd.
22. RO GS 4, RI 3rd, counter add.
23. RO and RS counter, GS 4 RI, RO 3rd.
24. GS 2 RO, counter add.
25. Emit "1," counter add.
26. Counter read out and reset, GS 2 RI.
27. FS 1 read out, GS 1 read in.
28. FS 3 read out, GS 3 read in.
29. Counter read out and reset.

The first three steps compute the coordinate of the next point in the random walk. Either the x or the y coordinate is adjusted, according to one of the two random digits, and the adjustment is either positive or negative, according to the other random digit. Steps 4 and 5 test to see if the upper bound in x or y has been reached. If a negative balance test occurs, then the point is still under the upper bound, and steps 6 through 11, which correct the tally and reset the coordinate units to their original values, must be suppressed. Steps 12 and 13, which test for the lower boundary, are permitted to take place, and the new balance test supersedes the old one. Now a negative test is a sign that the boundary has been reached; so steps 14 through 28, which compute the value of the function at the lower bound, correct the tally, and reset the x and y storage units to their original values, are suppressed on a plus balance test.

Three possible conditions can arise from the balance tests. If the first test is positive, the upper boundary has been reached. Then steps 6 through 11 occur, and the remaining steps are suppressed. If the first test is negative, the upper boundary has not been reached, and a second test must be performed to see if the lower boundary has been reached. So steps 6 through 11 are suppressed, and steps 12 and 13 occur. If the balance test on step 13 is positive, the lower boundary has not been reached, and steps 14 through 28 are suppressed. If it tests negatively, the lower boundary has been reached, and steps 14 through 28 occur. In all cases, step 29 is taken to reset the counter at the end of each computation.

The operating procedure was as follows. From the computation of the mean length of the random walk, an estimate was made of the number of steps needed to give a specified number of walks. This many random digit cards were selected; a leader card was put in with the coordinates of the starting point, and a trailer card in which the answer was to be punched. This deck was then run through the 604, and the sum of the boundary values and the tally of the number of times the boundary was reached were punched in the final card. The quotient of these two numbers gave the value of the function at the starting point.

The tests indicate that the method will give the correct answers, but the speed of convergence is very slow. The smoothing method converges as 1/n, where n is the number of times the field is smoothed; in this case, we mean that the difference between the approximate solution and the true solution is proportional to 1/n. But in the Monte Carlo Method the convergence is proportional to 1/√n, and the convergence here is a statistical convergence; that is, the probable error is proportional to 1/√n, where n is the number of random walks. With statistical convergence, one can never guarantee that the solution is correct, but one can state that the probability is high that the solution is correct.

It is obvious that the control panels described will not be applicable if the boundary values of the function are at all complicated functions of the coordinates. The method can still be used if the following changes are made: A trailer card is inserted behind each random digit card, and the coordinates of the point are punched in the trailer card. Or, similarly, a few columns of the random digit deck are reproduced into work cards, and these are run through the 604, punching the coordinates of the point in each card.
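The 1/√n statement above can be made concrete. If s is the standard deviation of the individual boundary values, the probable error of their mean after n walks is about 0.6745 s/√n, so that one additional decimal place of accuracy costs roughly a hundredfold increase in the number of walks. A minimal sketch of such an error estimate (the function name is ours, and the values would be the boundary values collected from the walks):

    import math

    def mean_and_probable_error(values):
        """Running estimate of a Monte Carlo mean and its probable error,
        which falls off as 1/sqrt(n)."""
        n = len(values)
        m = sum(values) / n
        var = sum((v - m) ** 2 for v in values) / (n - 1)
        return m, 0.6745 * math.sqrt(var / n)   # probable error of the mean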
The boundary values can be internally tested as long as the boundary is a rectangle, for two sides can always be made zero by a suitable translation of axes, and the other two constants can be stored in the 604. Hence, the x and y storage units can be reset to their initial values whenever the boundary is reached. If the boundary is not rectangular, then the mean length of the random walk and the dispersion of the mean length can be computed, and enough random digit cards assigned to each random walk so that any preassigned percentage of fixed-length walks will cross the boundary. A sufficient number of walks are made so that a reasonable number of boundary crossings are available. The cards from the 604 in either of these methods will contain the coordinates of the points of the walk. In the case of rectangular boundaries, the boundary point can be found by a sorting operation. In the case of the non-rectangular boundary, the sorting operation must be followed by a selection of the first point of each walk that crosses the boundary. In any case, the functional value at each boundary point can then be gang punched in the proper boundary cards and an average made.

The great drawback of this statistical method is its slow speed of convergence. This should not cause the method to be discarded, for the ideas of Monte Carlo procedures are still new, and ways may still be found to speed up the convergence of the solution. It is also true that the speed of convergence of the Monte Carlo Method is not affected by the dimensionality of the problem, and this may prove to be a very great advantage in problems involving three or more dimensions. Finally, Monte Carlo procedures may be very important in problems where a single number is required as an answer (such as the solution at a single point, or a definite integral involving the solution of the differential equation); in these cases the entire solution would otherwise have to be found by smoothing techniques. With these limitations and advantages recognized, the simplicity of the procedure certainly makes the Monte Carlo Method worth considering for the solution of a partial differential equation.

DISCUSSION

Mr. Turner: When punching the position of each successive point in a random walk from a given starting point (in order to find the value of the function at the given point), can one consider the particular boundary value as one term of the partial sums for each of the points that was crossed in the random walk? That is, would it be a safe process, with a large number of walks, to consider each one of the points which was crossed as a starting point for one of the other points in the problem?

Dr. Yowell: This question is now being investigated by Dr. Wasow and Dr. Acton at the Institute for Numerical Analysis. Their preliminary findings, about which I hesitate to say very much, indicate that this can be done, but it raises the standard deviation by a great deal, although it gives the same mean.

Mr. Turner: Another suggestion is that instead of starting from the interior, start from the boundary; then the problem of when the boundary is reached does not arise. In other words, use the symmetry of the functions. Start from the boundary and walk into the interior.

Professor Kac: That defeats the purpose. The result is the harmonic function at the starting point. So if one starts at the boundary, then one gets what he already knows, because the boundary value is given.
Mr. Turner: What I meant was: start at the boundary and then generate a sequence of points that are associated randomly with that particular boundary point (a sequence of interior points). In the final operation sort the cards on the point coordinates, run them through the accounting machine, and sum the values that are on the boundary cards; in other words, gang punch each boundary value into each one of the cards that was generated by the random walk.

Dr. Yowell: The question that occurs to me is: if one starts at any two boundary points, is the relative frequency of reaching a particular interior point the same as the relative probability of reaching the two boundary points by random walks originating at the particular interior point?

Professor Tukey: It seems to me that Mr. Turner has a very interesting suggestion, and that is to start at the boundary and walk n steps inside, until the boundary is reached. One then has two n pairs of numbers, each pair consisting of a boundary point and an interior point, and I understand the suggestion is to take a lot of walks of this type and cross-tabulate the number of times a pair (consisting of a certain boundary point and a certain interior point) occurs. It is an interesting speculation.

Professor Kunz: I would like to remark that we have taken a lot of liberties here. These probabilities around the edges (the Green's functions, which we are calculating here) must be evaluated for every interior point, n² in number; they must be multiplied in each case by every boundary value, 4n, and that is too much work already.

Mr. Bell: Consider doing the problem on a type 604. The standard 604, of course, won't program repeat, but it has been pointed out that the internal wiring can be altered to cause the machine to repeat the program cycles indefinitely. The 604 has a pulse rate of 50,000 pulses a second. It takes about 25 pulses to do a particular operation, except multiplication and division, and they take a few more. Suppose we assume 100 cycles, which means 2,500 pulses, so that 20 of these testing operations a second can be performed. We can thus make about 1,200 such tests in a minute. Thus, my question is: would it be reasonable to take a 604 and, with quite minor modifications, load it up, and let it run these random walks, where all the computing is being done down to reaching a boundary condition, at which point a new card feeds and the computing continues? Here is an opportunity to get the speed that is really available with electronic computing. I am certain that the circuits that would have to be changed are minor, a matter of a few hours' work, and a standard 604 could do this.

Mr. Sheldon: If this is to be done, it will be necessary to make the 604 compute its own random numbers, since the random numbers must get into the machine in some way.

Mr. Bell: I wonder if that is a limitation. It is being done in the 602-A. It would depend, of course, on the functions; but a great deal can be done with the 604, and people who have had little or no experience with it are often amazed at how much can be done in spite of the limited storage available.

Dr. Yowell: We have actually tried generating random digits on the 604. We have 100,000 of them generated on a 5 by 8 representative multiplication, much of it of the type that has been used on other machines: squaring a ten-digit number and extracting the central ten digits.
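The scheme Dr. Yowell describes, squaring a number and keeping its middle digits, is easy to state in code. A minimal sketch (the ten-digit width follows the text; the seed is an arbitrary assumption), which also makes the "lumpiness" he mentions easy to observe, since poorly chosen seeds degenerate rapidly:

    def middle_square(seed, count):
        """Generate pseudo-random ten-digit numbers by squaring a
        ten-digit number and extracting the central ten digits."""
        x = seed
        out = []
        for _ in range(count):
            sq = str(x * x).zfill(20)   # the 20-digit square, zero-padded
            x = int(sq[5:15])           # the central ten digits
            out.append(x)
        return out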
These have been subjected to the usual tests, and all we can say so far is that the digits are extremely lumpy; the question of how homogenized the digits have to be, in order to give good results, is something that should certainly be considered before generating random digits with a small multiplier or multiplicand.

Chairman Hurd: I think the idea sounds promising.

Further Remarks on Stochastic Methods in Quantum Mechanics

GILBERT W. KING

Arthur D. Little, Incorporated

I HAVE REMARKED that people working with computing machines frequently rediscover old mathematics. My philosophy is that with a computing machine one can forget classical mathematics. It is true that if one knows a little analysis, it can be worked in, but it is not necessary. However, Professor Kac¹ indicates that it looks as though I am going to have to learn some new mathematics after all, about Wiener functionals, to really understand what I am trying to do with the Monte-Carlo Method.

Mr. Bell has introduced the idea of a "DUZ" board, which is a control panel by which all sorts of calculations can be done without having to worry about wiring for each problem. The Monte-Carlo Method is really a "DUZ" method; it can do many kinds of problems without one having to think up new schemes of attacking each one. In particular, I like to feel that one contribution the Monte-Carlo Method has made (and the Monte-Carlo Method is probably only the first new method we are going to have for computing machines) is the by-passing of specialized mathematical formulation, such as differential equations. From the viewpoint of a physicist or a chemist there really is no differential equation connected with the problem. That is just an abstraction that allows the problem to be solved by means of the last few hundred years of mathematics. With a computing machine one cannot use analysis directly. Here is an opportunity (for the physicist and the chemist) of putting the real problem right on the machine without having to go through differential equations.

Although Professor Kac¹ and Dr. Yowell² have illustrated this much more thoroughly, I would like to discuss again the very simplest kind of problem, namely, diffusion. We can discuss diffusion in one dimension, although I will indicate there is no fundamental difference in approach for many dimensions (in contrast with some analytical methods). Consider a source of material such as a dye in a capillary in which the dye can diffuse either way. We wish to find out the distribution of these molecules after time t. We are accustomed to say that the concentration varies with time according to its gradient in concentration along the x axis, with a proportionality constant D, the diffusion coefficient:

    ∂u/∂t = D ∂²u/∂x² .

This differential equation is a simple one, and can be integrated. However, the physical problem need not be solved this way. The physical process may be considered directly. A molecule at the origin is subject to Brownian motion. (This is what makes diffusion work, and is implied in the differential equation.) That is, in time Δt it is going to move Δx either in one direction or the other. In another interval of time Δt it is again going to move either forward or backward. The molecule describes a random path of Δx in each interval of time Δt, and after n steps it will arrive at some point x. Another particle also follows a random path and arrives at a different place. If this process is carried out for a thousand particles, the distribution of particles at each value of x is a step function which, in the limit, will be exactly the solution of the differential equation.
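A sketch of this process in modern Python (the particle and step counts are arbitrary choices):

    import random
    from collections import Counter

    def diffuse(n_particles=1000, n_steps=100):
        """Random-walk model of one-dimensional diffusion: each particle
        moves one step left or right in each interval of time."""
        positions = Counter()
        for _ in range(n_particles):
            x = 0
            for _ in range(n_steps):
                x += 1 if random.random() < 0.5 else -1
            positions[x] += 1
        return positions   # a step-function approximation to the normal curve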
In the one-dimensional case the particles distribute themselves according to a normal distribution, which spreads out and becomes flatter with time. These calculations can be carried out very easily on a computing machine, because when a particle moves Δx in Δt, it is only necessary to add Δx to its x coordinate, with a sign determined by a random number which can be fed into the machine. So, fundamentally, this is a very straightforward method. It might be crude in complicated problems, but one does not have to think very hard to set up the problem.

More difficult problems, which also have a physical basis, can be put directly on the machines, by-passing the differential equations. For instance, in quantum mechanics we are interested in the behavior of chemical systems. By a system is meant here an assembly of particles whose x, y, z coordinates completely describe the system. Thus, if the stationary distribution of these coordinates is known, the stationary states of the system are known; and if their variation with time is known, chemical reactions can be described. If the x, y, z coordinates of each particle are used to map out a multi-dimensional space, the system can be described uniquely by a point in this space. A change of the system with time corresponds to a motion of this point in the multi-dimensional space. From the point of view of the Monte-Carlo Method, the point is a particle. It is an abstract particle and does not refer to the electrons or nuclei in the system. In quantum mechanics these abstract particles do not have to diffuse in the classical sense; they can jump around in the configuration space. These jumps are called transitions. In this way a very characteristic feature of quantum mechanics can be introduced into the formulation of the problem. However, to simplify presentation and tie the Monte-Carlo Method to what has already been discussed at this seminar, we shall take the point of view that these transitions are over a small range and are governed by a partial differential equation. This is the so-called time-dependent Hamiltonian equation,

    iħ ∂ψ(x,t)/∂t = Hψ(x,t) ,

with solutions ψ(x)e^(−iλt/ħ). This can be looked upon as a diffusion equation in a space with complex coordinates. It has periodic solutions. However, by making a transformation to a new time (it/ħ) it reduces to an ordinary differential equation with real variables of the type that Professor Kac¹ described:

    ∂u/∂t = ∇²u − V(x)u ,  u(x,t) = ψ(x)e^(−λt) ,

with solutions e^(−λt). The first two terms correspond to the ordinary diffusion equation which I have discussed. The third term is additional. The diffusion process corresponds to the same kind of random walk, only now when a particle reaches a point x, its weight is multiplied by a factor e^(−V(x)Δt). Thus, if the potential is positive, the particle has less weight when it arrives at the region; when V(x) is negative it has more weight. Thus, the particles, besides diffusing, actually increase or decrease in weight. The diffusion phenomenon governed by this type of differential equation, that is, by Schrödinger's equation, corresponds to the particles diffusing from the origin and distributing themselves, at first, in a curve similar to the normal one obtained for ordinary diffusion. The effect of the potential energy is such that as the particles diffuse far out they decrease in weight very quickly, whereas the particles diffusing near the origin do not decrease in weight very fast; so the quantum mechanical distribution function gets cut off at the extremes and dies out uniformly throughout the whole region. The distribution is, in fact, the actual wave function in terms of the coordinates, dying out with a logarithmic decay equal to the eigenvalue.
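The weighting is as easy to mimic as the walking. In the sketch below, each particle carries a weight which is multiplied by e^(−V(x)Δt) at every step, and the decay rate of the total weight estimates the lowest eigenvalue, in the spirit of the text. The particular potential and all numerical parameters are our illustrative assumptions, not the 602-A setup:

    import math, random

    def weighted_walk(V, n_particles=1000, n_steps=200, dx=0.1):
        """Diffusion with weights: particles random-walk while their
        weights are multiplied by exp(-V(x) * dt) at each step."""
        dt = dx * dx / 2.0                        # the diffusion relation 2*dt = dx^2
        particles = [(0.0, 1.0)] * n_particles    # (position, weight), all at the origin
        totals = []
        for _ in range(n_steps):
            new = []
            for x, w in particles:
                x += dx if random.random() < 0.5 else -dx
                w *= math.exp(-V(x) * dt)         # weight factor e^(-V(x) dt)
                new.append((x, w))
            particles = new
            totals.append(sum(w for _, w in particles))
        return particles, totals

    # Example: a harmonic-oscillator-like potential (an assumption for illustration).
    # totals[m] decays roughly like e^(-lambda * m * dt), so the logarithmic decay
    # rate of the total weight approximates the lowest eigenvalue.
    particles, totals = weighted_walk(lambda x: x * x)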
The exponential decay of the wave function causes some difficulty in accuracy on computing machines. To avoid this difficulty a modified potential function can be used,

    V′(x) = V(x) − λ₀ ,

where λ₀ is an approximation to the lowest eigenvalue. With this modified potential function in the exponential, the total number of particles does not decrease. Thus, the particles diffuse out to form an approximation to the wave function whose area remains constant and which rapidly settles down to the proper value.

Employing the Monte-Carlo Method in this way, we have been able to set up a simple problem, namely that of the harmonic oscillator, on the 602-A Calculating Punch, which is not an elaborate machine. We have actually been able to insert only one card and have computing carry on for a whole day. When enough random walks have been made, a button is pressed, and the sum of the weight factors divided by the number of walks is punched. This is the eigenvalue. It is rather interesting to see that, using random numbers during the day's calculations, a very definite number has been obtained, namely, the eigenvalue of the harmonic oscillator, which is correct at least to a certain number of significant figures. If a real source of random numbers were available, the process could be repeated on another day, in which an entirely different set of numbers would have passed through the machine, and the same eigenvalue, within statistical fluctuations, would be obtained. Thus, the computations are carried out entirely with random numbers. Even in one calculation, the numbers at the beginning and the end are entirely independent, statistically. There happens to be a sort of Markoff process, so that the numbers over a period of two or three minutes are related to each other, although the numbers over a period of several hours are quite independent.

I want, now, to make a connection of the Monte-Carlo Method to the other methods of solving differential equations discussed at this seminar. To do this we can discuss merely the ordinary diffusion equation and leave out the potential term which is characteristic of quantum mechanics. The diffusion equation must be written as a difference equation,

    u(n,m) = ½{u(n−1, m−1) + u(n+1, m−1)} ,  x = nΔx ,  t = mΔt ,  2Δt = Δx² ,

which states that the number of particles at x, after one step, is one half the number that were to the left plus one half the number that were to the right. The set of difference equations, one for each value of x, forms a matrix equation, where the distribution along x is represented as a vector:

    u(m) = T u(m−1) ,

where T is the matrix with ½ in the two diagonals adjacent to the principal diagonal and zeros elsewhere. The matrix T operates on the original distribution u(0), which will be assumed to correspond to particles only at the origin and, therefore, has an element = 1 at x = 0, and 0 elsewhere. Multiplication of the vector by the matrix T corresponds to the diffusion in a short time Δt; the operation of the square of this matrix on the original vector, to taking walks of two steps. The nth power of the matrix corresponds to taking n steps. Each term in each element of the resulting vector corresponds to a single walk, and the grand total of all the terms in all the elements of the vector corresponds to all possible walks.
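The matrix statement can be verified directly. A minimal sketch (the grid size is an arbitrary choice; the crude boundary treatment simply lets probability leak off the ends):

    def apply_T(u):
        """One diffusion step: u(m) = T u(m-1), where T has one-half
        on the two diagonals adjacent to the principal diagonal."""
        n = len(u)
        return [0.5 * ((u[i - 1] if i > 0 else 0.0) +
                       (u[i + 1] if i < n - 1 else 0.0))
                for i in range(n)]

    u = [0.0] * 21
    u[10] = 1.0               # all particles at the origin
    for _ in range(8):
        u = apply_T(u)        # the nth power of T gives the n-step walks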
The method can be generalized by allowing the particles not only to move Δx in Δt, but to move kΔx in Δt, where the chances of moving 2Δx, 3Δx, 4Δx, etc., are governed by some distribution law. It is clear that this distribution law is exactly that required to express the second derivative in terms of the function to the left and to the right:

    d²u/dx² = Σ aₖ u(x+k) ,  k = −K to +K .

Physically, this corresponds to particles having, instead of a mean free path, a distribution of free paths, and the idea is introduced that a particle can make a transition from one state to another and not merely diffuse over a single interval Δx. In this case, the matrix T has 2K diagonals, but although the expressions for the elements of Tu are more complicated, it is still true that all possible random walks correspond to the nth power of this matrix, so that the situation is exactly the same as described in the elementary diffusion problem. The characteristic vectors of this matrix can be found in the usual way of iterating the matrix. Thus, it is seen that the Monte-Carlo Method of solving a differential equation, when carried to the limit of all possible random walks, becomes the recommended method of finding the characteristic vectors of a matrix. It is interesting to see the Monte-Carlo Method as a "DUZ" method, in the sense that it works as well with a 2 by 2 as with a 10,000 by 10,000 matrix. The advantage of the Monte-Carlo Method is that, instead of computing all terms in all elements of the nth power of a matrix, only a sample of, say, 1,000 need be taken to obtain results to two or three significant figures. It is, therefore, quite clear that if all possible random walks were taken, a distribution would be obtained which would be exactly defined by the iterative method of solving difference equations.

DISCUSSION

Dr. Brillouin: There is a problem in connection with all these applications of the Monte Carlo Method that has been in my mind for some time, and I would like to ask a question. One of the difficulties in using the machine is that you have to repeat the computation a number of times, at least twice, to be sure that the machine doesn't make mistakes. How can you repeat, twice, the same random walk? How do you make the checks on the Monte Carlo Method on the machine?

Dr. King: As I pointed out, that is fundamentally impossible if you have a real source of random numbers. However, the Monte-Carlo Method could be carried out very conveniently if there were a hub on every IBM control panel that said "random number" on it, supplying a random number on demand. To check the results one would merely repeat the whole problem, using entirely different random numbers. The eigenvalues obtained should, of course, be the same as the first time through, within a statistical fluctuation depending on the number of random walks taken. However, this method does not allow for faulty wiring or machine failures. To make sure that no mistakes of this type have been made, we have adopted the procedure of recording the random numbers used with every step. We then repeat the whole procedure, using the random numbers in reverse order. In other words, we allow the particles to walk in the opposite direction from the first case.
This usually means that the functions of the types of random numbers are interchanged, so that a fairly reliable check of the machine methods is obtained.

REFERENCES

1. MARK KAC, "The Monte Carlo Method and Its Applications," pp. 74-81.
2. EVERETT YOWELL, "A Monte Carlo Method of Solving Laplace's Equation," pp. 89-91.

Standard Methods of Analyzing Data

JOHN W. TUKEY

Princeton University

I SHOULD LIKE to be able to make you statisticians in one brief exposure, but it seems unwise to try. We are going to go over some methods that form a sort of central core of the statistical techniques that are used today, trying to do it in such a way that when someone comes to you, wanting this or that computed, you may have a better understanding of why these particular things were chosen. By and large, we are not going to discuss the formulas that would actually be used in the computation (although we shall occasionally refer to those used in hand computation). I will leave that to Dr. Monroe¹ and Dr. Brandt.² I am going to discuss these methods in terms of how it is easiest to think about them. My purpose, then, is to supply background: statistical, algebraic, and perhaps intuitive.

Simple Models

When a physicist or engineer thinks of reduction of data, his first thought is of a set of points y, one or more for each of a set of values x, and a process of fitting a straight line. We are going to start two steps below this and work up slowly. Let us suppose that we have a number of measurements of what is supposed to be the same quantity, perhaps the velocity of light, perhaps the conversion efficiency of a catalytic cracking plant, perhaps the response of guayule to a new fertilizer. These measurements will not all be the same (if they are, we are not carrying enough significant figures!). We must take account of their variability! The simplest way to do this is to suppose that

1. they are composed of an invariable part and one that is fluctuating,
2. these parts are united by a simple plus sign, and
3. the fluctuating part behaves like the mathematician's ideal coin, or ideal set of dice, with the contribution to each measurement wholly unrelated to that in the last, or the next, or any others (nearby or not).

In terms of this situation, where these parts are things we shall never know, we want to get as good a hold on the underlying facts as we can, from three values, from thirty, or from three hundred. These three suppositions suppose a lot, and much experimental technique and much experimenter's knowledge of the subject matter go into making them good approximations by doing the experiment appropriately. We shall not go into these problems of design, as opposed to analysis, of experiment. Of course, not all experiments can be conducted to fit such a simple model, and we shall also meet more complex ones. These will usually involve more parts, and when these parts are connected by plus signs, we shall usually use methods which grow naturally out of those used for the single sample.

Interpreting Data

We shall have more to do with models than you might expect, quantitative models for what might have happened. This will seem strange at first glance, for most of us usually keep this aspect of interpreting numbers in our subconscious. But the whole of modern statistics, philosophy and methods alike, is based on the principle of interpreting what did happen in terms of what might have happened. When you think the situation over, I think that you will agree. There are few problems, indeed, where it is sufficient and satisfactory to say, "Well, here are the numbers, and this is a sort of summary of them.
Now, somebody ought to know enough to do something with this!" But many of my friends will try to do this, astronomers in war work or sociologists studying public opinion; they try to stop too soon. A reasonable quantitative model would take them much further.

In discussing new machines, it has been said that "statistics is counting." Many think so. But the sort of statistical procedures that I am going to discuss are basically procedures for analyzing measurements, not counts. They can also be used for counts which behave like measurements, and I believe Dr. Brandt will discuss this. There are other ways to bring counts into the picture, but we shall not go into them here.

Notation

We shall deal mainly with averages and variances. If we have, or we think about, N numbers w₁, w₂, …, w_N, then

    w· = average (w) = (w₁ + w₂ + ⋯ + w_N)/N ,
    variance (w) = [w₁² + w₂² + ⋯ + w_N² − (1/N)(w₁ + w₂ + ⋯ + w_N)²]/(N − 1) .

The variance, as defined here, differs from the mean-square-error of the physicists by just this denominator of N−1 instead of N. There is still a deep schism on whether you divide by N or N−1. I am far on the left, and divide everything of this kind by N−1 under all possible circumstances. It makes the formulas simpler; I think that is a good reason.

When you have a set of numbers x₁, x₂, …, xₙ which we think of as a sample from a larger collection, as for example in our model just described, we calculate the same things but use different words and notation:

    x· = mean (x) = (x₁ + x₂ + ⋯ + xₙ)/n ,
    MSDₓ = mean square deviation (x) = [x₁² + x₂² + ⋯ + xₙ² − (1/n)(x₁ + x₂ + ⋯ + xₙ)²]/(n − 1) .

We shall steadily use these three conventions: A · in place of a subscript means the mean (= average over the sample). A * in place of a subscript means the average (taken over the population). A + in place of a subscript means the sum over the sample. Thus, for example, if yᵢⱼ has been observed for each combination of i = 1, 2, …, c with j = 1, 2, …, r, then

    y·₃ = (1/c){y₁₃ + y₂₃ + ⋯ + y_c₃} = (1/c) y₊₃ ,
    y₂· = (1/r){y₂₁ + y₂₂ + ⋯ + y₂ᵣ} = (1/r) y₂₊ .

This choice of notation and terminology is compact and convenient, and we shall use it systematically, but you are warned that it is not universal, not general, or, in some aspects, not even widespread. It will allow us to state models and indicate possible computations in a relatively compact and clear way.

A SINGLE SAMPLE

The Model

We are now prepared to take the model for a single sample that we have already discussed and express it algebraically. It is

    yᵢ = η + εᵢ ,  i = 1, 2, …, n ,  η fixed ,  (1)
    εᵢ a sample from a population of size N, average ε* and variance σ² ;

that is, we think of the fluctuating parts as a random selection of n values from N values. These N values will have an average, which we have already agreed to call ε*, and a variance, which we now agree to call σ². We may think of N as large as we like, and can easily think of N = ∞ as a limiting case. We may confine our analysis to finite N, and always take the infinite case in the limiting sense. Any practically meaningful new features which might appear in a mathematical model for the infinite case would be statistically extraneous, and we should have to seek for ways to eliminate them by changing the model (this we might do by changing the mathematician).
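In code, the conventions of this section and the sampling in model (1) look as follows. This is a minimal sketch with names of our own choosing; the synthetic population, the value of η, and the sample size are all illustrative assumptions:

    import random

    def mean(x):
        return sum(x) / len(x)

    def msd(x):
        # mean square deviation, with the n - 1 divisor of the text
        n = len(x)
        s = sum(x)
        return (sum(v * v for v in x) - s * s / n) / (n - 1)

    # Model (1): y_i = eta + eps_i, the eps_i a random selection of
    # n = 200 values from a population of N = 50,000 fluctuations.
    population = [random.gauss(0.0, 1.0) for _ in range(50_000)]   # assumed population
    eta = 68.0                                                     # assumed fixed part
    y = [eta + e for e in random.sample(population, 200)]
    print(mean(y), msd(y))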
Illustrations

Let us imagine ourselves in control of an army camp containing 50,000 men. We may be interested in their average height. Suppose that η is the average height for the whole army of 10,000,000 men, which we do not know; then we can define ε as the difference between an individual soldier's height and η. There will be 50,000 such values of ε (if we admit, for the sake of simplicity, that each soldier has a height which can be reliably measured). We may select 200 men at random (which is an operation requiring care) from the personnel files, and have their heights measured. We shall have available

    y· = η + ε· = (1/200){y₁ + y₂ + ⋯ + y₂₀₀} ,

and we are interested in

    y* = η + ε* = average of all 50,000 heights.

Here N = 50,000, n = 200, and the Greek letters, as usual, refer to quantities which we do not know, even after the experiment. We are, however, interested in learning as much as we can about some of these Greek quantities, or about some combinations of them. In this case, we wish to infer about η + ε*.

As another case, let us take the measurement of the velocity of light in a vacuum. Here η would naturally be the "true" velocity of light in a vacuum, if such exists, while the ε's are defined by difference, as the "errors" of single determinations. We have no obvious limit to the size of the population of errors; so we take N = ∞. The average velocity measured by this observer, under these conditions, with this apparatus is

    y* = η + ε* .

This will not be the "true" velocity, because of systematic errors in theory, in instrument design, in instrument manufacture, and because of the personal equation of the observer, to name only a few reasons. These systematic effects are reflected in ε*. The statistical analysis can tell us nothing (directly) about ε*, since we cannot measure any ε, or anything related to an ε, which does not involve η. We can learn about an individual η + ε, since this combination is an observable value, and hence we can learn, statistically, about η + ε*. The allowance for ε* is a matter for the physicist, although the statistician may help a little. We should hasten to say that in the models we use here there is much flexibility. Another person might define η to be the average value obtainable by this observer under these conditions, with this apparatus. By doing so, he would define ε* = 0. Since this would not change the experiment, it is very well that we shall find that it would not change our formulas or our conclusions.

The Identities - Simplest Case

We can certainly write that

    yᵢ = y· + (yᵢ − y·) ,

and many college freshmen would like to infer from this that

    yᵢ² = y·² + (yᵢ − y·)² ,

which you know to be wrong unless yᵢ = y· (or y· = 0)! However, the sum of the deviations (yᵢ − y·) for all i is zero, so that

    Σyᵢ² ≡ Σy·² + Σ(yᵢ − y·)²
         ≡ ny·² + Σ(yᵢ − y·)²
         ≡ ny·² + (n − 1)s²   (defines s²)   (1′)
         ≡ y₊²/n + (Σyᵢ² − y₊²/n) .   (working form)

The last line indicates how the two terms are ordinarily calculated by hand. It is written in terms of sums instead of averages. The divisions are postponed to the last. In general, people who calculate this expression on hand machines make a regular practice of such postponements. It saves miscellaneous rounding errors from arising, and makes it clear that certain decimal places are not needed after a division. Notice that there is one standard process.
We shall see again and again that we have summed a certain number of entries, squared that sum, and then divided the square by the number of entries. This y₊²/n is a standard sort of thing that arises again and again.

All this has been done as if we really expected y to be nearly zero. What if we had expected it to be nearly Y? There is an entirely analogous set of identities, namely,

    Σ(yᵢ − Y)² ≡ n(y· − Y)² + Σ(yᵢ − y·)²   (1″)
              ≡ n(y· − Y)² + (n − 1)s²   (the same s²!)
              ≡ (y₊ − nY)²/n + (Σyᵢ² − y₊²/n) .

Notice that the term Σ(yᵢ − y·)², which appears here, is exactly the same term which appeared in the previous identity. Thus, it equals (n − 1)s² as before, and can be calculated in the same simple way as before. When the results are placed in a table, the standard form is that of Table IA, where the mystic letters along the top of the table refer to "degrees of freedom," "sums of squares," and "mean squares." The entries in Table IA show how the actual entries, the numbers found by computation which would be entered in such a table, are related to the actual observations.

TABLE IA. TABULAR ARRANGEMENT FOR MODEL (1)

    Item      DF     SS          MS
    mean      1      n(y·−Y)²    n(y·−Y)²
    residue   n−1    (n−1)s²     s²

Average Values under the Model

The model under which we are working, specified precisely in (1), states that the (N choose n) kinds of samples of size n are equally probable. The various quantities we have been discussing vary from sample to sample. But we can determine their average values, averaged over the (N choose n) kinds of samples of size n, in terms of η, ε*, and σ². Together with some variances, these are given in Table IB.

The essential things to notice are the average values in the two bottom rows of Table IB: the average values of the two mean squares in Table IA. The average value of the "mean square for residue" is just σ², the variance of the fluctuating contribution. The average value of the "mean square for the mean" consists of two terms: (1 − n/N)σ², which is nearly σ² when the population is large (and is otherwise smaller), and n(y* − Y)², which is zero when the contemplated value Y is equal to the population average y* = η + ε*.

The first essential point to be gained from these average values of mean squares (which from now on we shall simply call "average mean square" and abbreviate by "AMS") is this. These average values really say that:

1. All the systematic contribution has been siphoned off into the mean square for the mean.
2. As much of the fluctuating contribution as possible has been siphoned into the mean square for residue.
3. We know how much, on the average, of the effect of the fluctuating contribution remains in the mean square for the mean; we know this in terms of σ², and so can judge its size from the mean square for residue.

This is the qualitative picture, one you need to understand!

Interpretation of Anova Table

Table IA is called an analysis of variance table (which I often like to abbreviate to "anova table"). How do we interpret specific tables containing numbers? We know that we must face sampling fluctuations from sample to sample, but if we neglect them momentarily, we can learn much. The average value of the residue mean square is bare and unadorned; it tells us about σ². The average value of the mean square for the mean is complex; it has two terms. We can avoid this complexity by forming

    (1/n){(mean square for the mean) − (1 − n/N)(mean square for the residue)} ,

whose average value is

    (1/n){n(Y − y*)² + (1 − n/N)σ² − (1 − n/N)σ²} = (Y − y*)² .

On the average this component of mean square for the mean tells us about (Y − y*)². Thus, an observed large value suggests that Y − y* is not zero, while an observed value which is small, zero, or negative indicates that Y − y* could be zero (for all this sample tells us) and that Y − y* is surely small. By analogy, we define the CMS (component of mean square) for the residue by

    CMS residue = MS residue ,

since its average value is just σ², and notice that we may write

    CMS mean = (1/n) MS mean − (1/n − 1/N) MS residue .
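The identity (1″) and Table IA amount to a few lines of arithmetic. A sketch for the usual case N = ∞ (Y is the contemplated value; the function name is ours):

    def single_sample_anova(y, Y):
        n = len(y)
        ybar = sum(y) / n
        ss_mean = n * (ybar - Y) ** 2              # SS for the mean, 1 DF
        ss_res = sum((v - ybar) ** 2 for v in y)   # SS for residue, n - 1 DF
        ms_mean, ms_res = ss_mean, ss_res / (n - 1)
        cms_mean = (ms_mean - ms_res) / n          # estimates (Y - y*)^2 when N is infinite
        return ms_mean, ms_res, cms_mean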
On the average this component of mean square for the mean tells us about (Y _y.. ) 2. Thus, an observed large value suggests that Y -y.. is not zero, while an observed value which is small, zero, or negative indicates that Y -y... could be zero (for all this sample tells us) and that Y -y.. is surely small. By analogy, we define the CMS (component of mean square) for the residue by CM S residue = M S residue, 98 COMPUTATION TABLE IB AVERAGE VALUES AND VARIANCES FOR MODEL (1) Quantity Average Value Variance 'YJ 'YJ 0 £i £. y 'YJ+ £. y. 'YJ+ £. y.- y ('YJ + since its average value is just u 2 , and notice that we may write ~ MS mean - (~- ~ ) MS residue. N ow a very usual case is N = 00. Let us suppose that this is so, and that we find MS mean = 1000 , MS residue = 10 . We must conclude, if there were at least three observations, that the CMS mean, which came out to be 900 n ' is quite sure not to be zero on the average. Thus, Y -y. is not zero, and we do not believe that Y can equal y•. On the other hand, if MS mean = 9, lOor 11 MS residue = 10 , we have CMSmean = -1.. 01.., n' 'n and we must conclude that its average value might be zero. Thus, Y -y. might be zero, and Y might be the unknown y•. Moreover, Y -y. is probably small. Thus, in extreme cases a look at the mean squares may tell us whether Y is quite sure not to be equal to (or near to) y. or whether it may equal y. (and is quite surely near it). In intermediate cases we should have to think hard, or use a carefully worked out procedure (such as we mention shortly). £. - Y) 2 2 (l_~)U2 (~- ~)u2 (~- ~) u2 'YJ+£.-y (y. _ y)2 CMS mean = (1-~ )u + (~- ~) u2 Notice that the average values of the mean squares and, at lea.st by implication, the components of mean square play an essential role in drawing such conclusions. Test Ratios and Confidence Intervals Now you may wish to make a critical test of the null hypothesis that some Y you have selected may be equal to y •. This is ordinarily done with one of the two ratios indicated in Table Ic. TABLE Ic TEST RATIOS FOR MODEL (1) {Y. n(y._Y)2 (y._Y)2 Y}2= t 2 1 ( )2 1" v's2/n --~ - s~ n-l n t= y.-Y __ v's2/n If the critical value of t = ta, then confidence limits for y. = 'YJ £. are given by F = Yi-Y. + F is used frequently for what is called a variance ratio. In this case, as usual, it is just the ratio of the two entries in the mean square column. In this particular case (as always when the numerator has only one degree of freedom) it is exactly the square of the very simple ratio denoted by t. (This is Student's t.) 99 SEMINAR If a given value of Y gives rise to a value of t (or F) near zero, we must, so far as this one set of observations is concerned, admit that y. (which we don't know) might equal this given (or contemplated) value Y. If some Y is far enough from the observed mean y., then t (and also F = t 2 ) will be large, and we shall be more or less sure that this Y differs from y.. Those values of Y, which might reasonably be the unknown value of y. form a confidence interval for y •. The last line of Table Ic shows how easy it is to compute such an interval. The interpretation of the interval is merely "y. is likely to lie in here," with the strength of the conclusion depending on ta, and getting stronger as the interval increases in length. The Diagram Apparent Partition of Variation ?,<4 "'0 ~ CIS .... The model becomes Yi = f3>... £i, i = 1, 2, ... , n, and we see that we have a special case of model (1) where '7 = f3>.... It is a special case only because we assumed £.1. 
= 0 + TABI.lC IIA AN AI. YSIS OF VARIAN CE T ABI.E FOR MODEl. (2) Item DF slope 1 residue n-l SS MS (b-B)2~'xr (b-B)2~~ (n-l)s2 S2 residue (B-f3)2 ~xi + (1 - N~'x~) (}"2 (b~B)2 - ( 1- Ni'x1 ~ ~'xf %2 S2 ACMS AMS slope CMS (}"2 (B~f3)2 (}"2 101 SEMINAR TABLE IlIA ANALYSIS OF VARIANCE TABLE FOR MODEL SS MS n(ye-Ye)2 n(ye-Ye)2 DF Item 1 1 mean slope residue (b-B)2~(xi-xe)2 n(YA-Ye)2 - residue (*- ~) S2 S21~(xi-xe)2~-1 S2 ACMS (1 - ; }T2 (B-f3)2~(Xi-Xe)2 slope (ye-Ye)2 - S2 AMS mean CMS ( b--B)2~(xi-Xe)2 (b-B)2 - (n-2)s2 n-2 (3) + (YA-Ye)2 (B-f3)2 (}'2 (}'2 (}'2 in model (2). But we can always force (A = 0 in the first model by defining TJ properly. Thus, we see that fitting a mean is a special case of fitting a slope. This may surprise us a little, but when we think a while it becomes reasonable. We shall see more and more of this sort of thing. There are other procedures for more complicated cases, and for those you may start with references 3 to 11. The identities are, briefly, ~yr ==~ye+ ~~b(xi-%e) p ~1 (Yi-ye) - b(Xi-Xe) r2 The Linear Case ==ny! - b 2 where We are now going to branch out boldly, and let the line go anywhere-no longer must it go through the origin. The model is Yi = ~ f3.ri (i, i = 1,2, ... , n 1Xi ( known without error ( 3) 1(i r a sample from a population of size N, average and variance (}'2. This is appropriate when we wish to fit a line to observations of % and y where 1. the errors and fluctuations in X are negligible compared to those in y, and 2. the size of the errors in y do not depend on whether % or y is large or small. + + !(Xi-Xe)2 + (n-2)s2 b _ ~(Xi-Xe) (Yi-ye) !(Xi- Xe )2 ' and if Y i = A + ~(Yi-Yi)2 = + BXi is the contemplated line n(ye-Ye)2 + (b-B)2 !(%i-xe)2 + (n-2)s2. YA = ~ + f3%e + (. to be just the average y for the fixed set of x-values that we have observed. TABLE IlIB = {ye-Ye}2= t 2 , S2 F -_ (b-B)2~(Xi-Xe)2 { . S2 y s2jn b-B ys2/~(%i-xe)2 (3) (for mean). }2 -_ t2 , (for sIope.) If the critical value of t = ta" then confidence limits for YA are s y ± ta yn and for 13 are (3") The tabular arrangement is given in Table IlIA, and the test ratios and confidence limits are those of Table IIB where YA is defined by TEST RATIOS AND CONFIDENCE LIMITS FOR MODEL F = n(ye-Ye)2 (3') ( defines S2) 102 COMPUTATION Apparent Partition of Variation It would be natural for you to ask why we didn't take out a piece for the intercept ex and a piece for the slope f3. The answer would be "this would be inconvenient, because the natural estimates of ex and f3 are not orthogonal." And so you would wonder-what is orthogonality? Let us try to answer this question. First, the change toy. = ex f3x. and f3 really corresponds to writing + ex + f3Xi = (ex-f3x.) 1 + f3(Xi-X.) . It corresponds to using 1 and (Xi-X.) as the quantities whose coefficients are to be found. The orthogonality of these quantities is easy to express algebraically. The condition is t 1 (Xi-X.) = 0 , which we know to be true. But why should we be interested in these quantities? Let us write y. and b in terms of b, f3 and the £i. W e h~l.Ve Degrees of Freedom + f3x.+ = y. + + f3Xi + £i - (ex+f3 x .+£.) ~ (Xi-X.) =f3t(Xi- X.) + t(£i-£.) (Xi-X.) , y.=ex FIGURE 3. £. £. - £. bt(Xi-X.)2=t~ex ANALYSIS OF VARIANCE DIAGRAM FOR MODEL 2 (3) =f3t(Xi-X.)2 - t(Xi-X.)£i so that We have broken up the sum of squares in a way entirely analogous to the one we used in the simple cases. The diagram is shown in Figure 3. Just as before, the test ratios (Table IIIB, page 101) are ratios of slopes of traces in this diagram. 
The shaded portions show how much allowance might carelessly (triangles) be made and is actually (blocks) made for the contribution of fluctuations to the sum of squares for the mean and for slope. Just as before, these allowances are correct on the average. The actual numerical values in the diagram correspond to a B which seems likely Bx. which is quite clearly not to = f3 and to a Y. = A not = y. = ex + f3x •. It may be interesting to write down the working forms of the three sums of squares. They are y.-Y. = b-f3 = ~( ~ - (nA+Bx+) n p £. 1 = -n1 t _ Xi-..t. 1(£i-£.) )2 t(Xi-X.)£i 1 )2 t(Xi- X.) (£i-£.) Xi-X. Weare going to inquire about the tendency of these quantities to fluctuate together. To do this, we begin with ~( ~ average + ~y+ £. - 1(£i-£.)2~ = (1 - ~) (12 , average ~ (£i--£.) (£j-£.) ~ = - ~ (12 hence average i (y.-y.) (b-f3) ~ t( 1 )21(12t1(Xi-x.) O~ n Xi-X. which vanishes because t 1 (xi- x .) does. In general, the condition that average ~tai(£i-£.) tbj(£j-£.) ~ 0 is that t aibi 0 . Thus, one meaning of orthogonality is "no tendency to fluctuate together" (as measured by this particular sort of average). Since it is clearly convenient, but not essential, to work with quantities which do not tend to fluctuate together, this meaning will perhaps content you now. Another meaning is discussed on page 104. + == == Orthogonality Without explanation, the last model was taken apart into 1. a piece for the mean ex f3x., and 2. a piece for the slope f3. + , SEMINAR 103 Multiple Regression N ow we can look at the machinery for what is called multiple regression-a very general procedure. The model looks as follows: Yi = f31 X li + f32 X 2i f3mXmi + fi i = 1,2, ... , n i Xji ~ is known without error (4) i fi ~ is a randomly arranged sanlple from a population of size N, average zero and variance (]"2. The independent variables, or carriers of regression, Xli X:.!, ••• , Xm can be anything. They can be related or unrelated, constant or not. Thus, the case + ... + the results of simple operations on them. We write (j,k] = ~i XjiXki, 1 1)2 [3 ·12,3 ·12] (b3.12-B3.l2) 2 [1,1] (b 1-B1)2 [2·1,2·1] (b 2 • 1 -B 2 • 1 )2 [3·12,3·12] (b3.12-B3.12)2 DF residue n-m AMS CMS Xl ;?i,1])[1~21] (b 1-B1)2 - ( 1 - S2 [2·1,2·1] X 2•1 (b 2 • 1 -B 2 • 1 )2 X3·12 (b 3.12 -B 3.12 )2 - [3·12,3·12] - [1,1] (P1- B 1)2 +(1 - [2·1,2·1] (P2.1- B 2.1)2 N~tl])(T2 + (T2 S2 residue Xl X2.1 X3· 12 residue S2 (T2 ACMS (P1- Bl)2 (P2.1- B 2.1)2 (P3.12- B 3.12) 2 (T2 The corresponding diagram follows along the usual pattern and is shown in Figure 4. This figure is drawn for the special case m = 4: Xl == 1, X2 == X, X 3 X2, X4 X3 . Thus, the various components, those for 1, X -X., X2 . . . , X3 -- ... , are called "mean," "slope," "curvature," and "twist." == == Geometric Interpretation All these mysterious notations and identities have a nice geometric interpretation. Just make a vector out of each of the m variables-m vectors in an n-dimensional space. The initial set of coordinates in the space is like the statistician's sample space; the coordinates of %1 in this system are the n observed values of Xl for the n observations, and so on Xl = ~X1i} %2 = ~X2i} .t: = y ~Xmd' = ~Yi} . Now, experimenters and quantities of interest being as they are, the m vectors :tv :t2 , ••• mthat correspond to the m variables are unlikely to be at right angles to each other. But the aim of our fitting process comes out to be just finding the component of y in the m-dimensional space determined by x\, Z2' ... , ?m. This would be easy if Zv ... 
, xm were at right angles. So we set out to force them to right angles and calculate the projection all at once. ,x 105 SEMINAR Apparent Partition of Variation The vectors are shown in the original coordinate system in Figure S. The result of passing to '12 • 1 and Y.l is to replace (2,4) by (-1, +l)and (1, -2) by( 11, -1~). This is done through '1 _ 201 = _ '12 yo1 = 3'11 1_ - Y- , 2%1 N ow we notice that and hence -y -- - .! 2 ."~1 Recalling that the line 1, %1 l2 -":;t = 4~1 - l-:;t2 2-" - 2·1 %2 = we see that we have fitted %, Y=4- Degrees of Freedom FIGURE 4. ANALYSIS OF VARIANCE DIAGRAM FOR MODEL (4) SPECIAL CASE OF POLYNOMIAL FITTING It is easy to replace %2' ... , %m, Y by their components perpendicular to '11 ' vVe have only to calculate a few dot products and proceed as follows: _ _ ('11 %2) -" 3 Z% to the points (2,1) and (4,-2). It is easy to see that this is a correct fit. The geometric interpretation is the· same for any regression problem. It merely requires more than two dimensions for the picture. If we had not orthogonalized, then the problem of finding them-dimensional plane of XH :t2 , ••• , the projection of '1m leads at once to m equations in m unknowns. As practical computers, how would we have solved these equations? By some method of elimination-whether we talk of Doolittle, Crout, or Dwyer's square root method. Geometrically yon 0 %201 = %2 - (_ ) %1 , %1 • %1 I Second Observation 4 _ % mol - Yo1 ('tm°'11 ) _ _ = % m - (% % ) .r _ Cy %1)_ = Y - (_ _) %1 %1 %1 1 0 1 1', , 1 0 0 2 0 Then we can shift from '1a H • %m.1' Y.1 to their components perpendicular to %2 1. These new vectors will still be perpendicular to '11 • And so on. By comparison with the operations on brackets specified above, you can see that this orthogonalization procedure is just what we have been doing. From the vectors '11 , '12 , ••• , '1m, we have constructed new vectors 'tH 't2 • H %a.12 , ... ,at right angles to each other, and we have found the components b1 x;., b2 • 1 %2.]) ... , of y along the new vectors and the residual vector Y.l2 ... m which is the component of y perpendicular to all the %' s. All this still may seem complicated. So let us go back and fit a line to two points. We take n = m = 2, %1 == 1, %2 == % and assume the observations (% 2, y = 1) (% = 4, Y = -2) o 0 0 , // "- ,, /// "- , , / "- ,, / , ',1,/ / / / / / / / / / o ~ j{ -2 '\ 0 -4 /1 Vf First Observation '\~~~ -2 f'\ ,// / '" , o ,, 2 FIGURE 5 ,, ,, ,, 4 106 COMPUTATION this means by orthogonalization, by something close to our actual procedure. What difference might there be! Really the difference is only this: we have not abbreviated the method, we have put down all the steps. This leaves us with a chance to change our mind-if [7.123456,7.123456] is very small, so small that the errors of measurement we have been neglecting account for most of it, we can drop variable 7 without loss of work and go on to 8.123456, and then to 9.1234568. We have written or typed or magnetized or punched or mercuried some extra numbers. With these we have bought insight at intermediate points and flexibility. I like to do it this way; you may like to do it another. References N ow there are more complex problems in regression, and more complex ways to do simple problems. These we can only cover by reference. For problems where the observations are of varying accuracy or where the coefficients appear nonlinearly, the classical methods are set out in reference 3. 
For problems involving polynomials at equally spaced intervals, much time and trouble is saved by the use of already prepared orthogonal polynomials. These are available as follows: To 5th degree, to 52 points, in reference 4. To 5th degree, to 75 points, in reference 5. To 5th degree, 53 to 104 points, in reference 6. To 9th degree, to 52 or more points, in reference 7. To 5th degree, to 21 points, in reference 8. For problems involving error in more than one variable, the user should read references 9 and 10, and for further study the references given there. A very condensed summary of many theoretical results may be found in reference 11. Application of regression ideas to more general problems can be found in reference 12, and computational suggestions can be found in some of the texts in references 14-18. SINGLE CLASSIFICATION Several Groups-A Single Classification Regression, in the sense that we have used it-curvefitting, general mopping up with sums of terms, and the like-accounts for many physical and chemical applications. But, in many fields the type of analysis that we now enter is the standard. Particularly in agriculture, only slightly less in engineering and applied sciences, and to some extent everywhere, the comparative experiment is king. Simple before-and-after experiments, or comparisons of two brands, two processes, two finishes or two raw materials, are easy; by taking differences you return to the type of case we have discussed. We need to consider comparisons of several categories. The simplest experiment is one in which you have ni observations in category i, where ·i runs from 1 to c. That is, n 1 observations on the first brand, the product of the first process, material covered with the first finish or units made from the first raw material; n 2 observations on the second brand, the product of the second process, material covered with the second finish or units made from the second raw material; and so made on all categories. The model will exhibit each observation as the sum of a contribution depending on the category and a fluctuating contribution. Here you can use quite complicated models for the fluctuating component with good sense and good results. For many of these models the part of the analysis which we are going to describe is the same. So I am not going to say just what I assume here. If you assume, for example, that all the fluctuations for all the categories come randomly out of one big population of fluctuations, then you will have a model that will fit a lot of circumstances. Everything that is going to be said here will apply to that model. If you want more ideas about possible models, read pages 69-75 of reference 13. We need some sort of a model, however, so that we can describe average values and variances of things. We specify a simple one, namely, Yij = A + "Ii. f.ij, 1 < j < ni, A fixed (5) 1"Ii ~ a sample from a population (of categories) of size M, average "I. and variance O'~. 1f.ij ~ a sample from a population (of fluctuations) of size N, average f.u, and variance 0'2. This is a good standard model, but not the most general for which our analysis is suitable. The categories in such a comparative experiment may be anything. In a study of screws, they might be different automatic screw machines, where you had taken a handful of screws from each for measurement. Or they might be the different times at which you had taken a handful from one machine. Or they might be different months in which you had sampled the whole factory's production. 
In agriculture they may be different varieties receiving different fertilizers. The categories may be individual operators of a chemical unit process, or they may be different fatigue states (as measured by time on shift) of a single operator. You have a choice of a lot of things here.

Identities

Our identities follow the standard pattern; here there are three pieces, as we see in (5') and (5''):

$\sum_{ij} y_{ij}^2 \equiv n\bar y_{..}^2 + \sum_i n_i(\bar y_{i.} - \bar y_{..})^2 + \sum_{ij}(y_{ij} - \bar y_{i.})^2 \equiv n\bar y_{..}^2 + \sum_i n_i(\bar y_{i.} - \bar y_{..})^2 + (n-c)s^2$  (5') (defines $s^2$)

$\equiv \frac{1}{n}y_{++}^2 + \left\{\sum_i \frac{1}{n_i}y_{i+}^2 - \frac{1}{n}y_{++}^2\right\} + \left\{\sum_{ij}y_{ij}^2 - \sum_i \frac{1}{n_i}y_{i+}^2\right\}$

$\sum_{ij}(y_{ij} - Y_i)^2 \equiv \sum_i n_i(\bar y_{i.} - Y_i)^2 + (n-c)s^2$  (5'')

$\equiv n(\bar y_{..} - Y_.)^2 + \sum_i n_i(\bar y_{i.} - \bar y_{..} - Y_i + Y_.)^2 + (n-c)s^2$

where the dottings in place of the i's and j's mean weighted averaging; that is,

$\bar y_{..} = \frac{\sum_{ij} y_{ij}}{n}, \qquad Y_. = \frac{\sum_i n_i Y_i}{n}, \qquad n = \sum_i n_i,$

and n is defined as in these denominators. In terms of the second line of (5') the three pieces stand out clearly:

1. A piece depending on $\bar y_{..}$ which expresses the fact that the sample grand mean is not zero.
2. A piece comparing the category means among themselves.
3. A piece expressing fluctuations within a category.

Just as in our first case, we have siphoned into the last term all that we possibly could of each of the fluctuations without getting category or grand mean effects. Likewise, we have siphoned as much as we can of the category-to-category differences into the second piece without getting grand mean effects. Our purification is only partial, but it is the best that we can do. It is the old method, applied at two levels instead of one, in two stages instead of one. We have isolated the fluctuations within categories from the category means as well as possible. Then we have isolated the differences in category means from the grand mean as well as we are able. On two levels at once, we use the same process which you use unconsciously when you take an average.
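The computing form in the second line of (5') goes over directly into a few lines of modern code. The following Python sketch (NumPy assumed; the data are invented for illustration) forms the three pieces from the category totals $y_{i+}$ and the grand total $y_{++}$ exactly as (5') prescribes:

import numpy as np

# Invented data: c = 3 categories with unequal numbers of observations.
groups = [np.array([3.1, 2.9, 3.4]),
          np.array([2.2, 2.6]),
          np.array([3.8, 3.5, 3.9, 3.6])]
n_i = np.array([len(g) for g in groups])
n, c = n_i.sum(), len(groups)
y_iplus = np.array([g.sum() for g in groups])     # category totals y_{i+}
y_pp = y_iplus.sum()                              # grand total y_{++}
sum_sq = sum(float((g**2).sum()) for g in groups) # sum of y_ij^2

piece_mean = y_pp**2 / n                                # (1/n) y_{++}^2
piece_categories = (y_iplus**2 / n_i).sum() - piece_mean
piece_within = sum_sq - (y_iplus**2 / n_i).sum()        # equals (n - c) s^2

assert np.isclose(piece_mean + piece_categories + piece_within, sum_sq)
s2 = piece_within / (n - c)
print(piece_mean, piece_categories, piece_within, s2)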
Tables

The elementary question that is going to be asked is: "Are these categories different?" This is only the first question, and those who stop with it are probably not getting what they should out of the observations. From the standpoint of the computing group, it doesn't make much difference, because if they have answered this they have, ready at hand, the other numbers which might be needed. Whether asked for or not, always send the category means back upstairs with the analysis of variance table. Don't let the statisticians forget the means for the sake of significance tests!

The form of the analysis of variance table is shown in Table VA. We have shown the one degree of freedom for the mean which many leave out. It has nothing to do with the comparison of categories, and since that is what such analyses are usually for, it is often omitted. But there are analyses that come in exactly this form where this line contains key information.

TABLE VA. ANALYSIS OF VARIANCE TABLE FOR MODEL (5)

Item         DF     SS                                         MS                                                      CMS
mean         1      $n\bar y_{..}^2$                           $n\bar y_{..}^2$                                        (*)
categories   c-1    $\sum_i n_i(\bar y_{i.}-\bar y_{..})^2$    $\frac{1}{c-1}\sum_i n_i(\bar y_{i.}-\bar y_{..})^2$    (*)
within       n-c    $(n-c)s^2$                                 $s^2$                                                   $s^2$

(*) Best computed from the numerical values of the coefficients in the AMS column.

If you attacked the problem which was mentioned previously at this Seminar, getting the average height of all the men in the United States, it would not be very practical to try to draw a random sample directly of a thousand men out of all the inhabitants of the United States. No one has a convenient card file that you can enter with random numbers and pull out names. You would want at the very least to break up the United States into pieces, and select randomly and measure two or three men in each of several randomly selected pieces. If you did this you would have a situation that comes under this model; because, if you broke up the United States into pieces in any reasonable way, the average heights of the men in the different pieces would be different, and these differences from piece to piece might be crucial in fixing the accuracy of your over-all mean. There are approximately 3,000 counties in the United States. Some of them, like Manhattan, are a little large and inhomogeneous. Let us think in terms of 10,000 categories. These are to be geographical regions, each with about the same number of men. (What is a man, anyway?) If we selected 100 of these at random, and then selected three men for measurement randomly within each of the hundred, the grand average tells us a lot about the average height of U. S. men. The grand average is going to fluctuate for two reasons. One reason is that if you repeat the process you would not have the same three men in a given category. The other is likely to be more important; if you repeated the process you would have a different set of 100 categories.

You have here a situation where it makes sense to write, for any individual,

height = U. S. average + (category average - U. S. average) + (individual height - category average).

Now, our grand average is the sum of the grand averages of the three contributions for each individual. If you redo the whole process, and use a new sample of 100 categories, then the average of the 100 category averages will be different; the grand average of the second contributions will be different. We must allow for this as well as for the fluctuations in the grand average of the third contribution.

Before we go on, we notice that Table VA is a little complicated, and conjecture that this is due to the possibility of having different numbers of observations in the different categories. So we treat the case (Table VIA) like (5) except that

$n_i \equiv r$ for $i = 1, 2, \ldots, c$.  (6)

Here things are quite simple in every line, except that for the mean.

TABLE VIA. ANALYSIS OF VARIANCE TABLE FOR MODEL (6)

Item         DF     SS                                      MS                                                  AMS                                                                          CMS
mean         1      $n\bar y_{..}^2$                        $n\bar y_{..}^2$                                    $nY_.^2 + (1-\frac{c}{M})r\sigma_\eta^2 + (1-\frac{n}{N})\sigma^2$           (*)
categories   c-1    $r\sum_i(\bar y_{i.}-\bar y_{..})^2$    $\frac{r}{c-1}\sum_i(\bar y_{i.}-\bar y_{..})^2$    $\sigma^2 + r\sigma_\eta^2$                                                  $\frac{1}{c-1}\sum_i(\bar y_{i.}-\bar y_{..})^2 - \frac{1}{r}s^2$
within       n-c    $(n-c)s^2$                              $s^2$                                               $\sigma^2$                                                                   $s^2$

(*) CMS mean $= \frac{1}{n}(\text{MS mean}) - \frac{1}{n}(1-\frac{c}{M})(\text{MS categories}) - (\frac{c}{Mn} - \frac{1}{N})(\text{MS within})$; best computed from the numerical values of the coefficients in the AMS column.

Diagram

Having the tables, we can now set forth the diagrams, which we do in Figure 6 for model (6). If we examine this diagram we see that it is much like the other diagrams. We have the traces, one for each line of the table. Clearly, one degree of freedom goes into the grand mean, and that is lost from among the c categories. So there must be c-1 degrees of freedom for categories. This disposes of a total of c degrees of freedom; there were n observations. Take c from n and you have n-c, the number of degrees of freedom within categories.

FIGURE 6. ANALYSIS OF VARIANCE DIAGRAM FOR MODEL (6) (apparent partition of variation plotted against degrees of freedom)

We should certainly call the differences between categories "apparently negligible" if the traces for "within" and categories have the same slope (lay along the same line).
For this would mean that the component of mean square for categories was zero (it is zero on the average only when the $\eta_i$ are the same), and this is a precise form of the rough statement that the categories are just alike. So we take the usual shaded triangle away from the triangle for categories. When $N = M = \infty$, and the traces for categories and the mean fall in the same line, then the component of mean square for the mean is zero, and we conclude that the grand mean might be zero. Thus, we take both shaded and dotted triangles away from the triangle for the mean. When M is large, and many categories to be considered are not represented in the experiment, we compare the mean square for the mean with the mean square for categories. Another extreme is $N = \infty$, $M = c$, when the traces for the mean and for "within" must lie on the same line for the component of mean square for the mean to vanish. Here only the shaded triangle is taken away from the triangle for the mean. When all categories to be considered were represented in the experiment, we compare the mean square for the mean with the mean square "within." And some situations fall in between, as the diagram illustrates.

The great difference in testing the mean (the great dependence on whether sampled categories = considered categories, or sampled categories are far fewer than considered categories) shows up in more complex designs with avidity, subtlety and frequency. There it affects comparisons and is worthy of the analyst's best attention.

Test Ratios and Confidence Limits

We can again look for appropriate test ratios and confidence limits, with the results shown in Table VIB.

TABLE VIB. TEST RATIOS AND CONFIDENCE LIMITS FOR MODEL (6)

Are there differences between categories?
$F = \frac{\text{MS categories}}{\text{MS within}} = \frac{\frac{r}{c-1}\sum_i(\bar y_{i.}-\bar y_{..})^2}{s^2}$

Might the mean equal Y? (M large)
$F = t^2 = \frac{\text{MS mean}}{\text{MS categories}} = \frac{n(\bar y_{..}-Y)^2}{\frac{r}{c-1}\sum_i(\bar y_{i.}-\bar y_{..})^2}$

Might the mean equal Y? (M = c)
$F = t^2 = \frac{\text{MS mean}}{\text{MS within}} = \frac{n(\bar y_{..}-Y)^2}{s^2}$

Confidence limits for $\bar y_{..}$? (M large)
$\bar y_{..} \pm t\sqrt{\frac{r}{n(c-1)}\sum_i(\bar y_{i.}-\bar y_{..})^2}$

Confidence limits for $\bar y_{..}$? (M = c)
$\bar y_{..} \pm t\,\frac{s}{\sqrt{n}}$

(For $c < M < \infty$, combine MS within and MS categories as suggested by the AMS's of Table VIA.)
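As a modern illustration of Table VIB, the following Python sketch (the data and the hypothesized mean Y0 are invented) computes the mean squares for model (6) and the three test ratios:

import numpy as np

# c categories, r observations each (model (6)); each row is a category.
y = np.array([[12.1, 11.8, 12.6],
              [11.2, 11.0, 11.5],
              [12.9, 13.1, 12.7],
              [11.9, 12.2, 12.0]])
c, r = y.shape
n = c * r
ybar_i = y.mean(axis=1)          # category means
ybar = y.mean()                  # grand mean

ms_categories = r * ((ybar_i - ybar)**2).sum() / (c - 1)
ms_within = ((y - ybar_i[:, None])**2).sum() / (n - c)   # s^2

Y0 = 12.0                                                # hypothesized mean
F_categories = ms_categories / ms_within                 # "Are the categories different?"
F_mean_M_large = n * (ybar - Y0)**2 / ms_categories      # mean vs. categories
F_mean_M_eq_c = n * (ybar - Y0)**2 / ms_within           # mean vs. within
print(F_categories, F_mean_M_large, F_mean_M_eq_c)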
DOUBLE CLASSIFICATION

Basis

Fifty years ago it was claimed that the way to run an experiment was to vary one thing at a time. If the nature of the subject is such that the results are not going to make sense, this is still the way to run an experiment. But, if in your subject the results make some kind of sense, it is usually much better to vary two things at once, or three things at once, or more. One of my friends is faced with an engineering problem where he is planning to vary 22 things at once. I don't advise you to start with that many, but he will learn more per dollar than if he varied one at a time.

If it makes sense to set up a model like this,

$y_{ij} = \lambda + \eta_i + \phi_j + \epsilon_{ij}$

where i refers to the level or nature of one thing and j to the level or nature of the other, where the plus signs are really plus signs, and the $\epsilon_{ij}$ are really random fluctuations, then it is much more efficient and useful to vary both things in a single experiment. If there is no semblance of a plus sign (if, for example, y increases when i increases for one value of j, but decreases when i increases for another value of j) then there may be little profit in varying two at once. There is not likely to be loss in varying only two at once, but more complex experiments (such as Latin squares, which we will not discuss) may burn the hand that planned them.

But, fortunately, life is reasonably simple in most subjects. The plus sign will be a good approximation often enough for the use of such experiments to pay. It may not be gratis; you may have to work for it. For example, the y's that you finally analyze may not be those with which you started. If you happened to be working on blood pressure, you may have to use the logarithm of the measured blood pressure; it is unlikely to be satisfactory to use the raw data in millimeters. For reasons that make sense when you think about them, factors that affect blood pressure tend to multiply together rather than add together in their effects, and then the logarithms are additive.

Again the statistician ought to think hard about such matters. He ought to see the need for transformations. But sometimes the computing people may see something going on that will make clear to them that there ought to be a transformation. If the plot of the effect of one variable for different values of another looks like Figure 7, for example, if the effect seems faster at higher levels, then we are a long way from a plus sign. The cure for this particular sort of deviation is to squeeze things closer together at the higher values than at the lower ones. You can do this by changing to the square root of the observed values, or to their logarithms; one or the other may work. Things of this sort need to be kept in mind. The honesty of the plus sign controls the extent to which the observations are adequately squeezed dry by this procedure.

FIGURE 7

Model, Identity, Table and Diagram

We shall treat this case briefly. A reasonable model for many uses is

$y_{ij} = \lambda + \eta_i + \phi_j + \epsilon_{ij}$  (7)

with $\lambda$ fixed; $\eta_i$ a sample from a population of size $N_\eta$, average $\eta_A$ and variance $\sigma_\eta^2$; $\phi_j$ a sample from a population of size $N_\phi$, average $\phi_A$ and variance $\sigma_\phi^2$; and $\epsilon_{ij}$ a sample from a population of size N, average $\epsilon_{AA}$ and variance $\sigma^2$. This is a case where the number of observations with $i = i_0$, $j = j_0$ is $g(i_0)h(j_0)$, in fact $g(i) \equiv 1 \equiv h(j)$, so that the i-classification and the j-classification are orthogonal. This simplifies matters considerably.

One identity is

$\sum_{ij} y_{ij}^2 \equiv rc\,\bar y_{..}^2 + r\sum_i(\bar y_{i.}-\bar y_{..})^2 + c\sum_j(\bar y_{.j}-\bar y_{..})^2 + \sum_{ij}(y_{ij}-\bar y_{i.}-\bar y_{.j}+\bar y_{..})^2$  (6')

where the last piece is $(c-1)(r-1)s^2$ (defining $s^2$), or, in terms of totals,

$\equiv \frac{1}{rc}y_{++}^2 + \left\{\frac{1}{r}\sum_i y_{i+}^2 - \frac{1}{rc}y_{++}^2\right\} + \left\{\frac{1}{c}\sum_j y_{+j}^2 - \frac{1}{rc}y_{++}^2\right\} + \left\{\sum_{ij}y_{ij}^2 - \frac{1}{c}\sum_j y_{+j}^2 - \frac{1}{r}\sum_i y_{i+}^2 + \frac{1}{rc}y_{++}^2\right\}$

The analysis of variance table is given as Table VIIA, and the diagram as Figure 8. The two classifications are conveniently referred to as "columns" and "rows." The details of this model and those of many more complicated ones we must leave to the reader's thought and study. No two books will give him the same account, but a few of interest are given in references 14-18.

TABLE VIIA. ANALYSIS OF VARIANCE TABLE FOR MODEL (7)

Item      DF            SS                                      MS                                                  AMS                                                                                                                    CMS
mean      1             $rc(\bar y_{..}-Y)^2$                   $rc(\bar y_{..}-Y)^2$                               $rc(\bar Y-Y)^2 + (1-\frac{c}{N_\eta})r\sigma_\eta^2 + (1-\frac{r}{N_\phi})c\sigma_\phi^2 + (1-\frac{rc}{N})\sigma^2$    (*)
columns   c-1           $r\sum_i(\bar y_{i.}-\bar y_{..})^2$    $\frac{r}{c-1}\sum_i(\bar y_{i.}-\bar y_{..})^2$    $\sigma^2 + r\sigma_\eta^2$                                                                                            $\frac{1}{c-1}\sum_i(\bar y_{i.}-\bar y_{..})^2 - \frac{1}{r}s^2 = \frac{1}{r}(\text{MS columns} - \text{MS residue})$
rows      r-1           $c\sum_j(\bar y_{.j}-\bar y_{..})^2$    $\frac{c}{r-1}\sum_j(\bar y_{.j}-\bar y_{..})^2$    $\sigma^2 + c\sigma_\phi^2$                                                                                            $\frac{1}{r-1}\sum_j(\bar y_{.j}-\bar y_{..})^2 - \frac{1}{c}s^2 = \frac{1}{c}(\text{MS rows} - \text{MS residue})$
residue   (c-1)(r-1)    $(c-1)(r-1)s^2$                         $s^2$                                               $\sigma^2$                                                                                                             $s^2$

(*) Most easily found numerically from AMS mean, CMS columns, CMS rows, and CMS residue.
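The identity (6') also translates directly into a few lines. The following Python sketch (invented data; NumPy assumed) forms the four pieces for model (7) and checks that they add back to the total sum of squares:

import numpy as np

y = np.array([[5.1, 4.8, 5.6],     # y[i, j]; i = 1..c "columns", j = 1..r "rows"
              [6.0, 5.7, 6.3],
              [4.4, 4.1, 4.9],
              [5.5, 5.0, 5.9]])
c, r = y.shape
ybar = y.mean()
ybar_i = y.mean(axis=1)            # column-class means
ybar_j = y.mean(axis=0)            # row-class means

ss_mean = r * c * ybar**2
ss_cols = r * ((ybar_i - ybar)**2).sum()
ss_rows = c * ((ybar_j - ybar)**2).sum()
resid = y - ybar_i[:, None] - ybar_j[None, :] + ybar
ss_resid = (resid**2).sum()        # (c-1)(r-1) s^2

assert np.isclose(ss_mean + ss_cols + ss_rows + ss_resid, (y**2).sum())
s2 = ss_resid / ((c - 1) * (r - 1))
print(ss_cols / (c - 1) / s2, ss_rows / (r - 1) / s2)   # F ratios for columns, rows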
FIGURE 8. ANALYSIS OF VARIANCE DIAGRAM FOR MODEL (7) (apparent partition of variation plotted against degrees of freedom)

REFERENCES

1. R. J. MONROE, "The Applications of Machine Methods to Analysis of Variance and Multiple Regression," pp. 113-116.
2. A. E. BRANDT, "Forms of Analysis for Either Measurement or Enumeration Data Amenable to Machine Methods," pp. 149-153.
3. W. E. DEMING, Statistical Adjustment of Data (John Wiley, New York, 1943).
4. R. A. FISHER and F. YATES, Statistical Tables for Biological, Agricultural and Medical Research (Oliver and Boyd, Edinburgh; 1st ed. 1938; 2nd ed. 1943).
5. Ibid., 3rd ed. (1948).
6. R. L. ANDERSON and E. E. HOUSEMAN, "Tables of Orthogonal Polynomial Values Extended to N = 104," Research Bulletin 297 (Agr. Expt. Sta., Ames, Iowa, 1942).
7. D. VAN DER REYDEN, "Curve Fitting by the Orthogonal Polynomials of Least Squares," Onderstepoort Journal of Veterinary Science and Animal Industry, 18 (1943), pp. 355-404.
8. W. E. MILNE, Numerical Calculus (Princeton University Press, 1949), Table VI.
9. C. P. WINSOR, "Which Regression?", Biometrics (Bulletin) 2 (1946), pp. 101-109.
10. J. BERKSON, "Are There Two Regressions?", Jour. Amer. Stat. Assn. 45 (1950), pp. 164-180.
11. D. V. LINDLEY, "Regression Lines and Linear Functional Relationship," Supp. Jour. Roy. Stat. Soc. 9 (1947), pp. 218-244.
12. S. S. WILKS, Mathematical Statistics (Princeton University Press, 1943, 1946).
13. JOHN W. TUKEY, "Dyadic Anova, an Analysis of Variance for Vectors," Human Biology, 21 (1949), pp. 65-110.
14. R. A. FISHER, Statistical Methods for Research Workers (Oliver and Boyd, Edinburgh, 11th ed.).
15. G. W. SNEDECOR, Statistical Methods (Collegiate Press, Ames, Iowa, 4th ed.).
16. L. H. C. TIPPETT, The Methods of Statistics (Williams and Norgate, London, 1937, 2nd ed.).
17. A. M. MOOD, Introduction to the Theory of Statistics (McGraw-Hill, 1950).
18. A. HALD, Statistics (John Wiley, New York; in the process of publication).

DISCUSSION

Mr. Keller: In our steam turbine testing work we have found, by running duplicate tests, or tests on two units that are duplicates, that we obtain from one-half to one per cent unexplained variation. It is quite important, in our design work, that we take advantage of differences that may change the performance to the extent of one or two-tenths of a per cent. In our talks with statisticians they usually point out that, if we use Latin squares and change several variables at once, we could obtain the required design information at smaller cost and with fewer tests. I should like to ask whether the question of how you should plan your experiment is affected by the expected difference that a change in design would cause, relative to the unexplained difference between two tests on the same unit. In other words, the standard deviation is one per cent, and you want to test for three or four items, each of which might amount to two-tenths of a per cent; do the standard methods still apply?

Professor Tukey: Yes. This is entirely typical of agriculture, where these methods were first developed, because that is where they first realized they were in serious trouble. An alteration of one per cent in the yield of barley in Ireland means a considerable number of pounds, shillings, and pence to the Irish. It is awfully hard to get any one field experiment to have a fluctuation as low as 10 per cent, and these methods were developed just to get at that sort of situation. If you are interested in only ten per cent, and your experimental error is one per cent, it doesn't matter how you do it; you will find out.
But if you have to work for it, things of this sort are indicated. Whether you want to use Latin squares or not is another matter. You have to know a lot about the situation; and I don't know about steam turbines. So I can't tell you whether you are to use a Latin square or not. But I think that you would find some design, more complicated than the one you are probably using, is likely to help.

Mr. Keast: In the example that you have shown, where your function $y_{ij}$ is affected by i and j, and the diagram underneath, which shows greater variation at the right-hand side than at the left-hand side: what is the essential point of the diagram?

Professor Tukey: The essential point which I tried to make in this diagram is that there is greater variation at higher levels, rather than that it happens to be at the side. As the level of the original y increased, the differences became larger. If that is the case, you have some hope of controlling it by going to the square root of y or the logarithm of y.

Mr. Keast: My problem is, if you don't know the variation in the first place, is it not that you are considering the variation with each factor to be linear over the range with which you are working? That is, in planning an experiment where you had everything vary at once, are you assuming a linearity there?

Professor Tukey: No, very definitely not, because the remark I made would apply perfectly well if the situation went as follows: in the regions where y is low, the differences are small; and in the region where y is large the differences are large. You are talking about a plus sign in the way that different things interact; but you let $\eta_i$ be any function of i it chooses, and you have allowed $\phi_j$ to be any function of j it chooses. The problem is to make the interaction behave, and you can let the individual variables run as they choose to make things go.

Dr. Lotkin: I would like to ask two questions pertaining to some of the work we are doing at the moment, dealing with measurements of angles such as you come across when you contend with theodolite data. In smoothing such data we have a choice of selecting successive groups of data. The question arises: how large should you take such groups in order to obtain feasible fits? Because we have found that, depending on the size of the groups you take, you get slight variations in the fit. Second, in doing this smoothing by means of orthogonal polynomials, the degree of the polynomial will vary with your significant answer. In planning this for the machine, we have a choice, then, of either varying your degree of the polynomial, which can become quite involved, or adhering to a certain fixed prescribed degree. Now, we are aware of the fact that, if we take a fixed degree for this polynomial, we might run into some danger of over-smoothing the data. What I would like to know is if this danger is not, possibly, over-emphasized.

Professor Tukey: I don't want to try to answer this question in detail, due to time limitations, but I can say some things about it. Basically, you have a problem where you are getting data out of a process. You have some theodolites, and you hope they run about the same from day to day; and what you are going to do with this ought to depend on a whole backlog of experience, and not on what you obtained on this particular run, generally speaking, unless you have evidence that this run is out of line in some way.
What is needed here, then, is to find out the essential characteristics of the situation, and make up your mind what smoothing you want to do on the basis of that, not just to apply some sort of test to this strip of data and work with it accordingly. You are raising the question, really, of how should this kind of data be analyzed. How should one analyze the data on soybeans compared to data on potatoes? That requires going back and looking at the essentials of the data. I think that trying to get at the power spectrum is the way to find out what you want to do in this case.

The Applications of Machine Methods to Analysis of Variance and Multiple Regression

ROBERT J. MONROE
Institute of Statistics, North Carolina State College

THE generally recognized machine methods which have been adapted to statistical calculations were first outlined by A. E. Brandt¹ in 1935, although some of the methods are known to have been in use before then. Dr. Brandt described the use of progressive digiting methods to obtain sums of squares and sums of products which were required in the statistical analyses. Since that time little has been added to the methods, save some improvements in details as a result of the steadily increasing efficiency of the newer models of machines. The following is essentially a description of the applications of the progressive digiting methods.

The methods of analysis of variance and multiple regression are a part of what have been called "standard methods of analyzing data." The two methods are closely related mathematically; i.e., the analysis of variance can be regarded as a special case of multiple regression in which the independent variables are arbitrary or dummy variates. It is usual, however, to think of the analysis of variance as concerning itself with a single variable, while the purpose of multiple regression is to relate one or more dependent variables with two or more independent (or "causative") variables. In either case it is conceptually easy to regard either problem simply as a breakdown of the "total variation" in a single variable into several component parts, regardless of how this breakdown is accomplished.

The example chosen for this paper came from a problem where both of the above-mentioned techniques were found useful. In a continuing project at the North Carolina Agricultural Experiment Station the attempts to improve on the present varieties of peanuts involve experiments embracing large numbers of measurements on individual plants. Consider, for example, an experiment made up of four different crosses from which were selected seven different strains. From each of the 28 seed stocks plantings were made to allow the measurement of ten plants, and each seed stock was planted in five different replications (locations). This kind of an experimental design is called, in statistical parlance, a "randomized block."

The model for this design may be written as

$y_{ijkl} = \mu + \rho_i + \gamma_j + \delta_{jk} + (\rho\gamma)_{ij} + (\rho\delta)_{ijk} + \epsilon_{ijkl}$

$\mu$ = unspecified mean parameter
$\rho_i$ = effect of ith replication (i = 1, ..., 5)
$\gamma_j$ = effect of jth cross (j = 1, ..., 4)
$\delta_{jk}$ = effect of kth strain in the jth cross, k = (1, ..., 7) for each j
$(\rho\gamma)_{ij}$ = effect of interaction of jth cross with ith replication
$(\rho\delta)_{ijk}$ = effect of interaction of kth strain with ith replication for each of the j crosses
$\epsilon_{ijkl}$ = a random error, NID(0, $\sigma^2$), associated with each plant (l = 1, ..., 10).
The analysis of variance, with associated degrees of freedom, is derived from the model, each line in the analysis being associated with the indicated parameters of the model.

ANALYSIS OF VARIANCE

Source of Variation                             Degrees of Freedom    Sum of Squares
general mean                                    1                     C.F.
replications                                    4                     SS(R)
crosses                                         3                     SS(C)
replications X crosses                          12                    SS(RC)
strains within crosses                          24                    SS(SC)
replications X strains within crosses           96                    SS(RSC)
individual plants within plots                  1260                  SS(IP)
Total                                           1400                  SS(T)

The sums of squares for each of the above effects may then be segregated:

1. General mean: $\frac{1}{1400}\left(\sum_{ijkl} y\right)^2 = C.F.$

2. Replications: $\frac{1}{280}\sum_i\left(\sum_{jkl} y\right)^2 - C.F. = SS(R)$

3. Crosses: $\frac{1}{350}\sum_j\left(\sum_{ikl} y\right)^2 - C.F. = SS(C)$

4. R X C: $\frac{1}{70}\sum_{ij}\left(\sum_{kl} y\right)^2 - C.F. - SS(R) - SS(C) = SS(RC)$

5. Strains in crosses: $\frac{1}{50}\sum_{jk}\left(\sum_{il} y\right)^2 - \frac{1}{350}\sum_j\left(\sum_{ikl} y\right)^2 = SS(SC)$

6. Replications X strains in crosses: $\frac{1}{10}\sum_{ijk}\left(\sum_l y\right)^2 - C.F. - SS(R) - SS(C) - SS(RC) - SS(SC) = SS(RSC)$

7. Individual plants within plots: $\sum_{ijkl} y^2 - \frac{1}{10}\sum_{ijk}\left(\sum_l y\right)^2 = SS(IP)$
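On a modern machine the segregation above reduces to sums over the margins of a four-way array. The following Python sketch (the measurements are randomly invented; only the layout follows the experiment) computes each sum of squares from totals, as in formulas 1 through 7, and verifies that the pieces add to the total:

import numpy as np
rng = np.random.default_rng(1)

# 5 replications (i), 4 crosses (j), 7 strains per cross (k), 10 plants (l).
R, C, S, P = 5, 4, 7, 10
y = rng.normal(50.0, 5.0, size=(R, C, S, P))
n = y.size                                   # 1400

cf = y.sum()**2 / n                                              # 1. general mean
ss_r = (y.sum(axis=(1, 2, 3))**2).sum() / (C*S*P) - cf           # 2. replications
ss_c = (y.sum(axis=(0, 2, 3))**2).sum() / (R*S*P) - cf           # 3. crosses
ss_rc = (y.sum(axis=(2, 3))**2).sum() / (S*P) - cf - ss_r - ss_c # 4. R x C
ss_sc = (y.sum(axis=(0, 3))**2).sum() / (R*P) \
        - (y.sum(axis=(0, 2, 3))**2).sum() / (R*S*P)             # 5. strains in crosses
among_plots = (y.sum(axis=3)**2).sum() / P - cf
ss_rsc = among_plots - ss_r - ss_c - ss_rc - ss_sc               # 6. R x strains in crosses
ss_ip = (y**2).sum() - (y.sum(axis=3)**2).sum() / P              # 7. plants within plots

assert np.isclose(cf + ss_r + ss_c + ss_rc + ss_sc + ss_rsc + ss_ip, (y**2).sum())
print(ss_r, ss_c, ss_rc, ss_sc, ss_rsc, ss_ip)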
. . . complete item, transferring at the end of each control break enough items so that, when you are finished, you will have on your minor breaks a summation of the terms themselves, and on the major, a summation of their squares.

. . . $\sum X_j X_k = \sum X_k X_j$, the two operations being accomplished independently. The matrix inversion required in the regression analysis is a more difficult thing to accomplish without special equipment. It is sufficient here to mention that this can be done with the IBM Type 602 Calculating Punch, but the process is quite involved. This job is, perhaps, one best done on the newer models which have been demonstrated at this seminar.

The foregoing discussion was intended as a brief summary of the application of IBM equipment to statistical analysis. Only a few of the basic operations were described. To anyone familiar with both the analytical and computational problems many short cuts and improvements will suggest themselves, especially in particular problems.²

REFERENCES

1. A. E. BRANDT, The Punched Card Method in Colleges and Universities (edited by G. W. Baehne, pp. 423-436; Columbia University Press, 1935).
2. P. G. HOMEYER, MARY A. CLEMM, and W. T. FEDERER, Research Bulletin 347 (April, 1947), Iowa State College Agricultural Experiment Station.

DISCUSSION

Chairman Hurd: I might make a note here of something that Mr. Bailey told me. He indicated that he had prepared a control panel for the 604 which would handle all analysis of variance procedures up to a reasonable three classifications, with one pass of the machine.

Mr. Bailey: The technique that we generally use is to prepare one card for each item in the table. We prepare one card for each of those ten items through all the replications. Ordinarily, we subtract a base from the items and code them to reduce the number to a reasonable size. We have it set up so that, on one card, we can handle numbers with as many as four or five digits, subtracting a base from each of those numbers. In one pass through the 604, the base is subtracted from each of the five-digit numbers, then the differences and the squares are summed and punched in the detail card. We sort the detail cards on our first classification, summary punch, repeat the process for the second classification and the third classification, then sort on two classifications at a time, summary punch, and finally summary punch with all controls off. Next, the summary cards and the detail cards are put together and fed into the 604, using the same control panel with a switch, where each of the sums is divided by the number of items represented on the summary card to obtain the means. We multiply the mean by the sum again to obtain the correction factors. All these correction factors and sums of squares are listed both for summary cards and detail cards, and it is a simple matter, just by a process of subtraction, to obtain all the sums of squares in the analysis of variance table.

Mr. Clark: There is a method of obtaining sums of squares and cross products, called digiting without sorting, in which the multiples are represented as combinations of 1, 3, 5, the same way that you would use a binary code 1, 2, 4, 8. In other words, 9 is 5 + 3 + 1, 8 is 5 + 3, and so on. If there is enough capacity you can, at the same time, digit for 10, 30, 50, and then 100, 300, 500. Really, there are four methods: the multiplication, the summary punching, the card total transfers, and the digiting without sorting. And one should always weigh the economics of which way to do it. However, you can make a general statement to the effect that, if you have a large number of variables and a large amount of data per dollar of computing, your best method is summary punching with the method that you outlined.

Mr. Belzer: The first method you described requires three counters for each bank?

Mr. Clark: That is right. It is an extravagant method. The beauty of it is that when you have an enormous number of cards and a small number of variables, you don't want to sort.
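Mr. Clark's 1-3-5 representation can be sketched in a few lines of Python (the decomposition table, card layout, and function name are our own illustration of the idea, not a description of actual control-panel wiring):

# Each decimal digit of the multiplier is expressed as a sum drawn from
# {1, 3, 5}: 9 = 5+3+1, 8 = 5+3, 7 = 5+1+1, and so on.  A counter that adds
# the multiplicand once, three times, or five times per digit pulse then
# accumulates the full product in a single pass, with no sorting.
DIGIT_135 = {0: (), 1: (1,), 2: (1, 1), 3: (3,), 4: (3, 1), 5: (5,),
             6: (5, 1), 7: (5, 1, 1), 8: (5, 3), 9: (5, 3, 1)}

def sum_of_products(cards):
    """cards: list of (multiplier, multiplicand) pairs; returns sum(x*y)."""
    total = 0
    for x, y in cards:
        place = 1                       # 1, 10, 100, ... ("digit for 10, 30, 50")
        while x:
            for part in DIGIT_135[x % 10]:   # each entry mimics a 1-, 3- or 5-impulse
                total += part * place * y
            x //= 10
            place *= 10
    return total

cards = [(9, 12), (84, 7), (305, 11)]
assert sum_of_products(cards) == sum(x * y for x, y in cards)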
Examples of Enumeration Statistics

W. WAYNE COULTER
International Chiropractors Association

MATHEMATICIANS are working with numerical constants. Once a specific problem is solved, the same formula will apply to similar problems. This is not true when multiple variables are introduced from problem to problem. We, in the healing arts, deal with such a great number of variables within the human body, as well as from individual to individual, that it has not been possible to apply mathematics to the human race as a mass; trends, indications or approximations are the best we can hope to obtain.

Our IBM installation consists of a Type 16 Motor Drive Duplicating Punch. The problems we have encountered have been merely to standardize methods and procedures of collecting data, and of proper coding and punching, so that the IBM Service Bureau can compute averages or percentages on several pertinent items.

Our field research program is now in its third year and is a cumulative study. With each passing year, the study will become more useful as the number of cases in each diagnosis increases. The field research data booklet contains the information concerning the cases compiled. The first year, 1947-1948, this program was in operation we studied 700 cases with 16 diagnosed conditions. By longhand methods it took us 800 man hours to calculate our data. During the year 1948-1949, we processed 3,400 cases on 38 diagnosed conditions in 400 man hours by IBM methods. This is exclusive of the two hours required by the Service Bureau to tabulate the data. What this amounts to, roughly, is 4½ times the work load in one-half the time required by longhand methods. Since this program was started previous to the switch-over to IBM cards, and since it was a cumulative study, it became necessary to have our codes made up into rubber-stamp form in order to bring our previous cases up to date.

The case history of each patient studied in the research program is recorded on a form such as Figure 1. This wealth of information may be placed on one IBM card, as indicated in the right-hand side of the figure. Each case is coded as to:

1. Industry. There are 13 classifications which indicate the field of work in which the patient is engaged. Thus, percentages in each different type of industry may be determined. (At a later date, if we should require data on specific types of industry, special studies may be conducted.)
2. Occupation. The type of activity the patient pursues is indicated by 11 categories. The occupation code enables the determination of percentagewise distribution.
3. Injury. Ten classifications indicate the nature of the injury, while 16 other classifications give the way in which the injury was incurred.
4. Chiropractic Analysis. The analysis of the patient's condition after spinal analysis is coded into one of 16 categories.
5. Diagnosis. The coding of diagnosis of the patient's condition consists of merely assigning 1 to anemia, 2 to angina pectoris, 3 to arthritis, 4 to asthma, etc.
6. Patient's Condition. This is coded for before chiropractic care, with chiropractic care, and after chiropractic care.
7. Insurance. Information as to whether claims were paid, the compensation involved, and the type of insurance policy.

Other necessary information for research analysis, such as case number, age, sex, days under medical care, number of office visits, number of X-rays, and number of (working) days lost before chiropractic care and while under it, is in actual numbers.

The Service Bureau sorts like numbers together according to diagnosis, and from there on it is a simple matter of tabulating like data in the same columns, with totals, so that we can obtain various data and averages or percentages such as:

1. Average age.
2. % females, % males.
3. Average number of days under chiropractic care.
4. Patients' condition at end of chiropractic care.
5. % well, % much improved, % slightly improved, % same, % worse.
6. Average number of years the diagnosed condition had existed previous to chiropractic service.

FIGURE 1. INDUSTRIAL RESEARCH FORM IR4, WITH THE CORRESPONDING IBM CARD COLUMNS INDICATED AT THE RIGHT (form not reproduced)
Transforming Theodolite Data*

HENRY SCHUTZBERGER
Sandia Corporation

*This paper was presented by Kenneth C. Rich.

A THEODOLITE is a precise instrument for measuring the azimuth and elevation angles of an object. It is desired to reduce these angular measurements, arising from two or more theodolites, to the rectangular Cartesian coordinates of the object. The quantity of computation for several hundred observations from more than two theodolites becomes overwhelming when these calculations are performed on hand calculating machines. However, the computation for these same observations becomes quite feasible with automatic computing machines. The method discussed here has been used with as many as five theodolites, with much more satisfactory results than previously used two-theodolite procedures.

The instruments most generally used to obtain the observations are the Askania theodolites. They have proved to be the most valuable of the cine-theodolites available. Captured from the Germans after the war, they are used extensively on test ranges in this country.
Attempts have been made to duplicate them, but with little success up to the present time.

Theodolite Instrumental Errors

The Askania cine-theodolite, like any other high-precision instrument, is subject to many errors. Frequently these errors arise, not from any inherent defects in the instrument, but from the fact that the instrument can be read more accurately than the adjustments of the instrument can practically be made. The errors to which the Askania is subject are:

1. Tracking Errors. These are not really errors in the ordinary sense but arise from the fact that the operators are not able to keep a moving object exactly on the center of each frame of the film. Thus, it is necessary to correct for this displacement on each frame.
2. Orientation Error. This error occurs because the instrument is not oriented to read the proper predetermined elevation and azimuth angles when set on a given target.
3. Leveling Error. This error occurs when the base plate (azimuth circle) of the instrument is not exactly level. The base plate error consists of two parts: the angle of inclination of the base plate with the true horizontal plane, and the azimuth direction of the intersection of the base plate and this horizontal plane.
4. Collimation Error. This error occurs when the line of sight down the instrument telescope is not exactly perpendicular to the horizontal axis of the instrument.
5. Standard Error. This error occurs when the horizontal axis of the instrument does not lie exactly parallel to the base plate of the instrument.
6. Tilt Correction. Because the local zenith at the instrument is not parallel to the zenith at a removed origin, owing to the curvature of the earth, this correction must be applied.
7. Refraction Error. This error is due to the bending of light rays when passing through media of changing density.
8. Scale Error. The Askania cine-theodolite has an extremely precise scale, but the optical system of several prisms, transmitting the scale to the film, may be out of adjustment and so introduce an error in the scale reading.
9. Bearing Error. As the instrument is rotated through the azimuth, its weight is supported by a main thrust bearing. Any irregularities in this bearing, or in the ways on which it rides, introduce an error in the elevation angle.

For the accuracy of measurements desired, each of these corrections must be taken into account. At present these corrections are made by hand computations, as it was not considered efficient to perform them on mechanical equipment. However, a device built by the Telecomputing Corporation, known as an Askania Reader, has been ordered. This machine, which is connected to an IBM Type 517 Summary Punch, enables an operator to make the necessary measurements on the film, and records these measurements and the instrumental constants automatically on an IBM card. With these data on cards, it will be possible on the IBM Card-Programmed Electronic Calculator to make all necessary corrections.

A Solution Used in the Past to the Two-theodolite Problem

Let S = azimuth angle measured from the positive X direction
H = elevation angle
O-XYZ = right-handed reference frame in which Z is vertical
X, Y, Z = space coordinates of object
x, y, z = space coordinates of observation point
Subscripts 1 and 2 = quantities pertaining to theodolite 1 and theodolite 2, respectively.
The usual relations yielding X, Y, and Z from a pair of theodolite observations are set down for reference:

$X - x_1 = \frac{(y_2 - y_1) - (x_2 - x_1)\tan S_2}{\tan S_1 - \tan S_2}$  (1)

$Y - y_1 = (X - x_1)\tan S_1$  (2)

$Z = z_1 + (Y - y_1)\frac{\tan H_1}{\sin S_1}$  (3)

or

$Z = z_1 + (X - x_1)\frac{\tan H_1}{\cos S_1}$  (4)

The order in which the preceding relations are given is convenient for computations. Under certain conditions, it is necessary to change the form of these relations,ᵃ but the significance of the method remains unchanged. Relations (3) and (4) give the same value of Z only when the lines of sight make a true intersection. It will be noted that a redundancy exists, in that four quantities ($S_1$, $H_1$, $S_2$, $H_2$) are given from which three quantities (X, Y, Z) are to be determined. Except when the lines of sight make a true intersection, this problem has no proper solution.

ᵃShould $S_1$ approach 90 degrees, i.e., $X - x_1$ be very small, it is better to compute $Y - y_1$ from a relation similar to Equation 1 but involving cotangents of the angles, and then to compute $X - x_1$ from a relation similar to Equation 2.
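A minimal Python sketch of this two-theodolite reduction may be useful for checking (the stations, angles, and function names are our own; equations (1) and (2) follow the reconstruction above):

from math import atan2, cos, degrees, hypot, radians, sin, tan

def two_theodolite_fix(x1, y1, z1, S1, H1, x2, y2, z2, S2, H2):
    """X, Y from the horizontal intersection (equations 1-2), then Z from
    theodolite 1's elevation by equations (3) and (4).  Angles in degrees;
    S measured from the positive X direction.  H2 is not used: that is
    the redundancy noted in the text."""
    S1, H1, S2 = radians(S1), radians(H1), radians(S2)
    X = x1 + ((y2 - y1) - (x2 - x1) * tan(S2)) / (tan(S1) - tan(S2))
    Y = y1 + (X - x1) * tan(S1)
    Z3 = z1 + (Y - y1) * tan(H1) / sin(S1)    # equation (3)
    Z4 = z1 + (X - x1) * tan(H1) / cos(S1)    # equation (4)
    return X, Y, Z3, Z4    # Z3 and Z4 agree only for a true intersection

# Check with exact rays toward an object at (100, 200, 50):
def angles_to(xs, ys, zs, X, Y, Z):
    S = degrees(atan2(Y - ys, X - xs))
    H = degrees(atan2(Z - zs, hypot(X - xs, Y - ys)))
    return S, H

S1, H1 = angles_to(0, 0, 0, 100, 200, 50)
S2, H2 = angles_to(300, 0, 0, 100, 200, 50)
print(two_theodolite_fix(0, 0, 0, S1, H1, 300, 0, 0, S2, H2))  # (100, 200, 50, 50)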
Derivation of a Least-squares Method of Data Reduction

The system and angles are defined in exactly the same manner as before. The direction cosines of the line of sight of each theodolite may be determined from the H and S angles and are:

$l_i = \cos H_i \cos S_i, \qquad m_i = \cos H_i \sin S_i, \qquad n_i = \sin H_i$

where the subscript i denotes the number of each theodolite. For convenience in notation, the space coordinates of the ith theodolite shall be denoted as $(x_i, y_i, z_i)$.

FIGURE 1. DIAGRAM ILLUSTRATING GEOMETRY FOR DERIVATION OF DIRECTION COSINES

The coordinates of any point lying on the ith line of sight may be expressed as

$X_i = x_i + l_i s_i, \qquad Y_i = y_i + m_i s_i, \qquad Z_i = z_i + n_i s_i$

where $s_i$ is the distance along the line of sight from this point to the theodolite at $(x_i, y_i, z_i)$.

FIGURE 2. DIAGRAM SHOWING LINES OF SIGHT AND LOCATION OF DESIRED POINT

Now, if the several theodolites are pointing at a fast-moving object in space at a considerable distance, the lines of sight in general will be skew with respect to each other. Let the coordinates, to be determined, of the object be denoted by $P_0(X_0, Y_0, Z_0)$. From the point $P_0$, consider the construction of lines perpendicular to each line of sight. Denote the intersection of each line of sight with its perpendicular by $P_i(X_i, Y_i, Z_i)$. By this notation, then, the distance from the point $P_0$ to each line of sight may be expressed by

$d_i^2 = (X_0 - x_i - l_i s_i)^2 + (Y_0 - y_i - m_i s_i)^2 + (Z_0 - z_i - n_i s_i)^2$  (5)

For the determination of the point $P_0$ which best fits the lines of sight, a least-squares approach is believed to give the closest representation of the actual condition; i.e., the sum of the squares of the distances from this point to each of the lines of sight is to be minimized. In equation (5) the values of $(x_i, y_i, z_i)$, the ith theodolite coordinates, and $l_i, m_i, n_i$, the direction cosines of the line of sight from the ith theodolite, are determined by the position of the theodolite and the direction in which it is pointing; the values of $(X_0, Y_0, Z_0)$ and $s_i$ are the variables which may be changed to minimize the sum of the squares of the distances from point $P_0$ to the lines of sight. Thus, $d_i^2 = f(X_0, Y_0, Z_0, s_i)$.

A set of direction numbers for the ith line of sight is $l_i, m_i, n_i$, and a set of direction numbers of the line joining the point in space $(X_0, Y_0, Z_0)$ to any point $(X_i, Y_i, Z_i)$ on the ith line of sight is $X_0 - X_i$, $Y_0 - Y_i$, $Z_0 - Z_i$, or $X_0 - x_i - l_i s_i$, $Y_0 - y_i - m_i s_i$, $Z_0 - z_i - n_i s_i$. A necessary and sufficient condition that these lines be perpendicular to each other is that the sum of the products of corresponding direction numbers be zero; i.e.,

$l_i(X_0 - x_i - l_i s_i) + m_i(Y_0 - y_i - m_i s_i) + n_i(Z_0 - z_i - n_i s_i) = 0$  (6)

Solving for $s_i$ (and using $l_i^2 + m_i^2 + n_i^2 = 1$),

$s_i = l_i(X_0 - x_i) + m_i(Y_0 - y_i) + n_i(Z_0 - z_i)$  (7)

Thus, the parameter $s_i$ may be eliminated by making use of the condition of perpendicularity. Substituting this value of $s_i$ in (5), letting $l_i x_i + m_i y_i + n_i z_i = p_i$, and simplifying,

$d_i^2 = [(1 - l_i^2)X_0 - l_i m_i Y_0 - l_i n_i Z_0 - x_i + l_i p_i]^2 + [-l_i m_i X_0 + (1 - m_i^2)Y_0 - m_i n_i Z_0 - y_i + m_i p_i]^2 + [-l_i n_i X_0 - m_i n_i Y_0 + (1 - n_i^2)Z_0 - z_i + n_i p_i]^2$  (8)

In order to minimize the sum of the squares of the distances to each line of sight, i.e., to make

$\sum_{i=1}^{n} d_i^2 = F(X_0, Y_0, Z_0) = \text{min},$  (9)

the following condition is necessary:

$\frac{\partial F}{\partial X_0} = \frac{\partial F}{\partial Y_0} = \frac{\partial F}{\partial Z_0} = 0$  (10)

Summing $d_i^2$ for all theodolites gives F explicitly as the sum over i of the three bracketed squares in (8),  (11)

and carrying out the differentiations in (10) yields three linear equations, (12), (13), and (14). Rewriting (12), (13), and (14), we obtain the normal equations

$\sum_{i=1}^{n}(1 - l_i^2)X_0 + \sum_{i=1}^{n}(-l_i m_i)Y_0 + \sum_{i=1}^{n}(-l_i n_i)Z_0 = \sum_{i=1}^{n}(x_i - l_i p_i)$  (15)

$\sum_{i=1}^{n}(-l_i m_i)X_0 + \sum_{i=1}^{n}(1 - m_i^2)Y_0 + \sum_{i=1}^{n}(-m_i n_i)Z_0 = \sum_{i=1}^{n}(y_i - m_i p_i)$  (16)

$\sum_{i=1}^{n}(-l_i n_i)X_0 + \sum_{i=1}^{n}(-m_i n_i)Y_0 + \sum_{i=1}^{n}(1 - n_i^2)Z_0 = \sum_{i=1}^{n}(z_i - n_i p_i)$  (17)
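The normal equations (15)-(17) are easily assembled and solved on a modern machine. Here is a sketch in Python with NumPy (function and variable names are our own; the three-station example is invented):

import numpy as np

def best_fit_point(stations, l, m, n_):
    """Solve the normal equations (15)-(17): the point Po minimizing the
    sum of squared distances to the lines of sight.  stations is an (n,3)
    array of theodolite positions; l, m, n_ are the direction cosines."""
    stations = np.asarray(stations, float)
    D = np.stack([l, m, n_], axis=1)               # direction cosines, one row per line
    p = (D * stations).sum(axis=1)                 # p_i = l_i x_i + m_i y_i + n_i z_i
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for Di, si, pi in zip(D, stations, p):
        A += np.eye(3) - np.outer(Di, Di)          # sums of (1 - l^2), (-lm), (-ln), ...
        b += si - Di * pi                          # sums of (x_i - l_i p_i), ...
    return np.linalg.solve(A, b)

# Example: three stations sighting an object at (100, 200, 50).
stations = np.array([[0., 0., 0.], [300., 0., 0.], [0., 400., 10.]])
d = np.array([100., 200., 50.]) - stations
d /= np.linalg.norm(d, axis=1, keepdims=True)
print(best_fit_point(stations, d[:, 0], d[:, 1], d[:, 2]))   # ~ [100, 200, 50]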
Errors in an Iterative Method for Square Roots

The iterative method here discussed is the classical one given by the formula

$a_k = \tfrac12\left(\frac{x^2}{a_{k-1}} + a_{k-1}\right)$

where $x^2$ is the radicand, $a_k$, k = 1, 2, ..., is the kth approximation to the square root x, and $a_0$ is the starting value. The error $a_k - x$ is readily calculatedᵃ by observing:

$a_k - x = \frac{(a_{k-1} - x)^2}{2a_{k-1}}, \qquad a_k + x = \frac{(a_{k-1} + x)^2}{2a_{k-1}}$  (1)

From which, by division and extension, we have

$\frac{a_k - x}{a_k + x} = \left(\frac{a_{k-1} - x}{a_{k-1} + x}\right)^2 = \cdots = \left(\frac{a_0 - x}{a_0 + x}\right)^{2^k}$  (2)

Solving (2) for $a_k$ and then for $a_k - x$ in terms of $a_0$ and x, we have

$a_k = \frac{x\left[(a_0 + x)^{2^k} + (a_0 - x)^{2^k}\right]}{(a_0 + x)^{2^k} - (a_0 - x)^{2^k}}$  (3)

$a_k - x = \frac{2x(a_0 - x)^{2^k}}{(a_0 + x)^{2^k} - (a_0 - x)^{2^k}}$  (4)

$\frac{a_k - x}{x} = \frac{2(a_0 - x)^{2^k}}{(a_0 + x)^{2^k} - (a_0 - x)^{2^k}}, \qquad k = 1, 2, \ldots$  (5)

Now, we observe that for k = 1, 2, ... the error always has the positive sign. It is also readily proved that the error is a monotonic increasing function of the distance of the starting value from x, for x fixed. Similar monotonic properties hold for the errors as functions of x when $a_0$ is fixed.

The Best Starting Values

We now state the problems: Given a radicand in the interval $(N_1, N_2)$, $N_1 = x_1^2$, $N_2 = x_2^2$; to find a single starting value $a_0$ which will minimize the maximum absolute error in the kth approximation, and likewise to find a starting value which will minimize the maximum relative error, for all choices of radicands in the interval. To solve these problems, we first define an auxiliary function

$P(a, x, k) = \frac{(a - x)^{2^k}}{(a + x)^{2^k}}$

then

$a_k - x = \frac{2x\,P(a_0, x, k)}{1 - P(a_0, x, k)} \qquad (6) \qquad \text{and} \qquad \frac{a_k - x}{x} = \frac{2\,P(a_0, x, k)}{1 - P(a_0, x, k)}$  (7)

From the monotonic properties of the errors as functions of the starting values we may state that for any starting value $a_0$ the largest absolute error will occur for $x = x_1$ or $x = x_2$. Hence, the largest error will be minimized by equating the errors at the upper and lower extremes of the range of the radicand. The same statement will apply to relative errors. Hence, we have the theorems:

THEOREM 1. The starting value $a_0$ which minimizes the maximum absolute error in the nth approximation for a range of radicands between $N_1 = x_1^2$ and $N_2 = x_2^2$ is the solution of the equation

$\frac{2x_1 P(a_0, x_1, n)}{1 - P(a_0, x_1, n)} = \frac{2x_2 P(a_0, x_2, n)}{1 - P(a_0, x_2, n)}$  (8)

In general, these solutions $a_0$ depend on the number of iterations n, and $a_0$ is a decreasing function of n. The quantity on the left in equation (8) gives the actual maximum error when the solution $a_0$ is substituted.

THEOREM 2. The starting value $a_0$ which minimizes the maximum relative error for a range of radicands between $N_1 = x_1^2$ and $N_2 = x_2^2$ is $a_0 = \sqrt[4]{N_1 N_2} = \sqrt{x_1 x_2}$, independent of the number of iterations k. The maximum relative error then is

$\frac{2\,P(\sqrt{x_1 x_2}, x_1, k)}{1 - P(\sqrt{x_1 x_2}, x_1, k)} = \frac{2\,P(\sqrt{x_1 x_2}, x_2, k)}{1 - P(\sqrt{x_1 x_2}, x_2, k)}$  (9)

It is easily seen that the ratio $a_0/x_1$ depends only on the ratio $x_2/x_1$ in both theorems; hence, one may multiply $a_0, x_1, x_2$ by a positive number and still have theorems 1 and 2 valid for the new quantities. The maximum relative error does not change under a scalar change. To give an idea of the maximum relative errors in using $\sqrt{10}$ as a starting value for radicands between 1 and 100, we have that the third approximation gives a relative error less than $1.08 \times 10^{-2}$, the fourth approximation a relative error less than $5.7 \times 10^{-5}$, the fifth approximation a relative error less than $1.5 \times 10^{-9}$, and the sixth approximation a relative error less than $1.26 \times 10^{-18}$. The best integer starting value for radicands between 1 and 100 for six iterations is 3, for either absolute error or relative error.

If the absolute error is to be minimized, the best starting value for one iteration is $(x_1 + x_2)/2$. As the number of iterations increases, the best starting value decreases toward $\sqrt{x_1 x_2}$, although we have not succeeded in showing actual convergence to $\sqrt{x_1 x_2}$. If the machine under consideration can discriminate among several classes of radicands, then one can use the method here proposed to determine the starting value associated with each class.

High Order Roots

If one uses the iteration for nth roots

$a_k = \frac{1}{n}\left(\frac{x^n}{a_{k-1}^{n-1}} + (n - 1)a_{k-1}\right), \qquad n = 2, 3, \ldots$  (10)

the formulas for errors are not simply derived. However, to minimize the maximum relative error one has the following theorem:

THEOREM 3. The single starting value for taking nth roots which minimizes the maximum relative error for all radicands between $N_1 = x_1^n$ and $N_2 = x_2^n$ is

$a_0 = \sqrt[n]{\frac{x_1 x_2 (x_2^{n-1} - x_1^{n-1})}{(x_2 - x_1)(n - 1)}}$  (11)

regardless of the number of iterations used.

Proof: Let a be any positive number. Consider the relative error in the first approximation $a_1$ resulting from using a as a "guess" for the nth root of $N = x^n$. This relative error is

$\frac{a_1 - x}{x} = \frac{(n-1)a^n - n\,a^{n-1}x + x^n}{n\,a^{n-1}x} = \frac{1}{n}\left[\left(\frac{x}{a}\right)^{n-1} + (n - 1)\frac{a}{x} - n\right]$  (12)

We want to show that this relative error is always positive unless a = x, when it is zero. Let y = x/a and denote the relative error by z; then (12) becomes

$z = \frac{y^n + (n - 1) - ny}{ny}$  (13)

This relative error z is positive, zero, or negative according as the numerator, $y^n + (n-1) - ny$, is positive, zero, or negative. Setting the first derivative of this function equal to zero, we have

$n y^{n-1} - n = 0 \quad \text{whence} \quad |y| = 1$  (14)

The second derivative is

$n(n - 1)y^{n-2}$  (15)

which is positive for y > 0. Hence, as z = 0 for y = 1 is a minimum, and the first derivative of the numerator is negative for 0 < y < 1 (x < a) but positive for y > 1, we conclude that z = 0 if, and only if, a = x, and it is otherwise positive. Furthermore, it is easily established that z is a monotonic increasing function of |a - x| for either x fixed and a varying, or a fixed and x varying.

In view of the above, one may proceed as with the square roots. For radicands between $N_1 = x_1^n$ and $N_2 = x_2^n$ any starting value a will give a maximum relative error in the first approximation for either $N_1$ or $N_2$, by the monotonic property mentioned above. Hence, the minimum largest relative error in the first approximation will occur when the error using $N_1$ is the same as the error using $N_2$. Equating these relative errors, and using the fact proved above that the relative error is positive or zero, we have from (12)

$\left(\frac{x_1}{a}\right)^{n-1} + (n - 1)\frac{a}{x_1} - n = \left(\frac{x_2}{a}\right)^{n-1} + (n - 1)\frac{a}{x_2} - n$  (16)

From (16), one finds the unique solution

$a_0 = \sqrt[n]{\frac{x_1 x_2 (x_2^{n-1} - x_1^{n-1})}{(x_2 - x_1)(n - 1)}}$  (17)

Now we have still to prove that this same value $a_0$ is the best starting value, regardless of the number of iterations. Before proving that, we remark that the best starting value $a_0$ to minimize the maximum absolute error with one approximation is

$a_0 = \sqrt[n-1]{\frac{x_2^n - x_1^n}{n(x_2 - x_1)}}$  (18)

It may be demonstrated, although we here omit the details of proof, that the best starting value $a_0$ to minimize the maximum relative error in k iterations is that value of $a_0$ which yields approximations to $\sqrt[n]{x_1^n}$ and $\sqrt[n]{x_2^n}$ at the kth iterate such that the relative errors in both are equal. The following lemma then shows that $a_0$ of (17) is the best starting value to minimize the maximum relative errors independently of the number of iterations.

Lemma 1. If the relative errors in using $a_1$ as an approximation of $\sqrt[n]{x^n}$ and $b_1$ as an approximation of $\sqrt[n]{y^n}$ are equal, then the relative errors in $a_2$ and $b_2$ are equal, where $a_2$ and $b_2$ are the iterates obtained respectively by substituting $a_1$ for $a_{k-1}$ in formula (10), and (with y for x) $b_1$ for $a_{k-1}$.

Proof: We have given that $(a_1 - x)/x = (b_1 - y)/y$, or, what is equivalent, $a_1/x = b_1/y$. Now

$\frac{a_2}{x} = \frac{1}{n}\left[\left(\frac{x}{a_1}\right)^{n-1} + (n - 1)\frac{a_1}{x}\right] \qquad \text{and} \qquad \frac{b_2}{y} = \frac{1}{n}\left[\left(\frac{y}{b_1}\right)^{n-1} + (n - 1)\frac{b_1}{y}\right]$  (19)

Hence, as $a_1/x = b_1/y$, we have $a_2/x = b_2/y$, which is equivalent to the conclusion of the lemma. This proves the lemma.

Now, as $a_0$ was chosen so that the relative errors of the first approximations to $\sqrt[n]{N_1} = x_1$ and to $\sqrt[n]{N_2} = x_2$ were equal, it follows from Lemma 1 that all successive approximations to $x_1$ and $x_2$ will have the same relative error at a given iteration. This concludes the proof of theorem 3.

Concluding Remarks

We have called attention to formulas for the errors in the classical iterative method of taking square roots, and applied these formulas to the determination of the best starting values to use, if one wishes to obtain a certifiable accuracy in the smallest number of steps. While the determination of errors for higher order roots was not given algebraically, this may be done numerically for particular circumstances. For example, in taking cube roots with $N_1 = 1$, $N_2 = 1000$, the best value for minimizing maximum relative errors for a fixed number of iterations is

$a_0 = \sqrt[3]{\frac{10(10^2 - 1)}{2(10 - 1)}} = \sqrt[3]{55}$

If one chose to use 4 for a starting value, then the maximum relative error would occur at $x^3 = 1$, as $\sqrt[3]{55}$ is a dividing point and $4 > \sqrt[3]{55}$.

It should be pointed out that the methods here discussed do not apply if one has a machine for which the number of iterations need not be fixed, but which will test the accuracy at each iteration. In that case, a choice of a best starting value would be better determined by minimizing the average length of time of calculations for the distribution of radicands at hand and for the accuracy desired. However, the results stated here will be of some use in proceeding with the minimum calculation time determination.

*This paper was presented by title. Work was done under U. S. government contract.
ᵃThis method appears in Whittaker and Robinson, The Calculus of Observations.
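The theorems are easy to check numerically. The following Python sketch (an illustration of our own) runs iteration (10) from the recommended starting values:

def newton_root(N, a0, k, n=2):
    """k steps of a_k = (1/n) * (N / a_{k-1}^(n-1) + (n-1) * a_{k-1})."""
    a = a0
    for _ in range(k):
        a = (N / a**(n - 1) + (n - 1) * a) / n
    return a

# Theorem 2: for radicands 1..100, a0 = sqrt(x1*x2) = sqrt(10) minimizes
# the maximum relative error; the extremes N = 1 and N = 100 err equally.
a0 = 10 ** 0.5
for k in (3, 4):
    e1 = abs(newton_root(1.0, a0, k) - 1.0) / 1.0
    e2 = abs(newton_root(100.0, a0, k) - 10.0) / 10.0
    print(k, e1, e2)   # k=3: about 1.07e-2 at both ends (the text's bound is 1.08e-2)

# Theorem 3: cube roots of radicands 1..1000; equation (17) gives a0 = 55^(1/3).
x1, x2 = 1.0, 10.0
a0 = (x1 * x2 * (x2**2 - x1**2) / ((x2 - x1) * 2)) ** (1 / 3)
print(a0**3)   # 55.0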
Improvement in the Convergence of Methods of Successive Approximation*

L. RICHARD TURNER
Lewis Flight Propulsion Laboratory, NACA

THE METHOD of successive approximation is frequently used in numerical mathematics, but in some cases the rate of convergence is discouragingly slow. Professor Southwell has shown that the rate of convergence of Liepmann's method of solving partial differential equations may be substantially improved by examining the distribution of "residuals" (increments in Liepmann's method) and applying local corrections. Southwell's "relaxation" technique is not readily adaptable to machine computing methods. It is possible, however, by examining the whole solution to determine the rate of disappearance of the currently dominant error terms, and then to remove such dominant terms in a single step of calculation.

Theory of the Method

Let the ultimate solution of a given problem be the K-dimensional vector X, which is obtained as the limit of the convergent sequence

$X = \lim_{n \to \infty} X^{(n)}; \qquad X^{(0)}, X^{(1)}, X^{(2)}, \ldots, X^{(n)}, \ldots$  (1)

(It will be assumed in the analysis that the components $x_j^{(m)}$ of the mth iteration are all real numbers, although this restriction can be removed.) We now suppose that, at the nth step, $X^{(n)}$ is composed principally of the solution X and two error terms $E^{(n)}$ and $F^{(n)}$ of a form such that

$E^{(n+1)} = \lambda E^{(n)} \qquad \text{and} \qquad F^{(n+1)} = -\lambda F^{(n)}$  (2)

Then it is found that

$X^{(n)} = X + E^{(n)} + F^{(n)}$
$X^{(n+1)} = X + \lambda E^{(n)} - \lambda F^{(n)}$
$X^{(n+2)} = X + \lambda^2 E^{(n)} + \lambda^2 F^{(n)}$
$X^{(n+3)} = X + \lambda^3 E^{(n)} - \lambda^3 F^{(n)}$  (3)

so that

$\lambda^2 = \frac{X^{(n+3)} - X^{(n+2)}}{X^{(n+1)} - X^{(n)}}$  (4)

and

$X = \frac{X^{(n+3)} - \lambda^2 X^{(n+1)}}{1 - \lambda^2}$  (5)

Now, in general, the operation indicated in equation (4) is not even defined. Therefore, some adequate working approximation must be substituted for equation (4). Two of these appear to be worthwhile. We define

$\delta_j^{(K+1)} = x_j^{(K+1)} - x_j^{(K)}$  (6)

and, in terms of these $\delta$'s, which are defined for each point for which a calculation is made,

$\lambda^2 = \frac{\sum_j \delta_j^{(n+3)}}{\sum_j \delta_j^{(n+1)}}$  (7)

or

$\lambda^2 = \frac{\sum_j \bar\delta_j^{(n+1)}\,\delta_j^{(n+3)}}{\sum_j \bar\delta_j^{(n+1)}\,\delta_j^{(n+1)}}$  (8)

Equation (7) is meaningful only if the $\delta$'s are real numbers. Equation (8) makes sense for any definition of $\delta_j$ for which a complex conjugate $\bar\delta_j$ and the product $a\bar b$ are defined. Equation (8), which corresponds to taking a first moment in statistics, is more elegant than equation (7) but involves much more effort, and is really not much better, because it is only on rare occasions that the initial hypothesis, equation (2), is sufficiently near to the truth to justify the use of great precision in the adjustment indicated in equation (5). For this reason it is recommended that, where at all possible, equation (7) be used. This rule should not be applied if K is a small number. In that case equation (8) is a much safer rule.

When $\lambda^2$ has been found, equation (5) is applied to each of the elements of $X^{(n+3)}$ and $X^{(n+1)}$ to obtain an improved iterant X'. It is suggested that if $\lambda^2$ is a small number, say 0.1, the correction technique should not be applied unless it is found that $\lambda^2$ is substantially constant for several iterations. In the use of the method by the author, no case has occurred in which $\lambda^2$ fell outside the range 0 to 1. Such cases will form a fresh field for the experimentalist.

*This paper was presented by title.
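In modern terms the correction step is a vector form of Aitken-style extrapolation. A minimal Python sketch (the function name and the toy error model are ours) applies the summed-differences rule (7) and the extrapolation (5):

import numpy as np

def accelerate(x0, x1, x2, x3):
    """Estimate lambda^2 from four successive iterants by equation (7),
    then extrapolate to an improved iterant by equation (5)."""
    d1 = x1 - x0                   # delta^(n+1), componentwise
    d3 = x3 - x2                   # delta^(n+3)
    lam2 = d3.sum() / d1.sum()     # equation (7)
    return (x3 - lam2 * x1) / (1.0 - lam2)   # equation (5)

# Toy check with errors that follow hypothesis (2) exactly:
X = np.array([2.0, -1.0, 0.5])
E = np.array([1.0, 0.3, -0.2]); F = np.array([0.2, -0.1, 0.4]); lam = 0.8
iterants = [X + lam**n * E + (-lam)**n * F for n in range(4)]
print(accelerate(*iterants))       # recovers X exactly for this error model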
Strictly speaking, the basic hypothesis of the method can be met only for linear algorithms, that is, algorithms in 1st iterant is the result of linear operations which the n on the elements of the nth iterant. In practice the method is found to apply satisfactorily to various nonlinear processes such as the calculation of the latent roots of matrices by repeated multiplication and normalization of an arbitrary vector. + o o 3 2 Error +o 0.5 1.00 1.2656 1.8253 2 9.2183 65.5 +o +o 6 +o +o +o +o 9 +o +o 10 +o 3.00 3.75 4.3125 5.2719 6 32.2 28.7 28.1 5 1.00 1.9375 2.6875 3.9211 5 1.5 1.9375 2.2813 2.8524 77.0000 57.25 37.1875 26.7342 7 1.25 2.125 2.5156 3.4408 8 3 1.50 2.00 2.7813 3.7179 o 1.50 2.5781 4.4627 +o 2.00 3.3125 4.4063 5.9247 8 1.50 3.0625 4.3750 6.4766 2.25 4.375 5.4844 7.8487 9 4 4.00 5.625 6.4063 7.9247 11 (n) ] %i,j+l 5 6 • 3 2 Reduction % 9.2183 6.6071 4.9555 3.6595 28.3 25.5 26.1 2.1042 42.5 +1.8253 +2.8524 1.8385 1.8421 1.8749 1.8934 2.8165 2.8391 2.8717 2.8998 +2.5016 +3.4408 5 2.5517 2.6604 2.7425 2.8396 3.5180 3.6448 3.7382 3.8503 +2.9408 +3.7179 6 7 8 3.2852 3.4829 3.6382 3.8179 4.2111 4.4533 4.6154 4.8212 +3.9211 +4.4627 +6.4766 .+7.9247 9 4.1689 4.4392 4.6145 4.8413 5.00 6.5625 7.5000 9.3275 10 + Error 2 7 (n) %i,j-l % +o 0.75 1.50 2.0156 2.9408 4 4 . (n).-L • '(n)+ = 4"1 [.ti-l,j ---. .ti+l,j Reduction +o 0.5 1.125 1.6563 2.5016 3 +o ..,.(n+l) A _ _- ~.J;:;;;;:::;;;::;:;:~;="'e""=~;;:::~ ~ D~ ~~~~:;~~;==~~IcJ;X480 ~SE CC X o 0 0 IS 0!2!0 ECTOR X 0 _ 0 _ 0 _ 0 _ 0 _ 0 _ 0 _ 0 TOTAL SYMBOL CONTROL 0 +0 +0 +0 +0 +0 +0 +0 ~C _ 20D _ ~C _ 4: _60C _ 6: _ ~C _8: TOTAL SYMBOL CONTROL- DD NX OOfOO~EE 000 oC o 0 -N!!MFR'C'l TIFt .COC4l-~ oC o • ;--..:~------ 5-r-~~--- BAR TOTAL E TRY 000000000000 ALPHAMERICAL TYPE BAR TOTAL E N T R Y - - - - - - o 0 _ _ _ _ _ _ _ _ _ l:XI X 3 _ _ _ _ _ _ _ _...1 ,I' 1"141 > I0 1'1"1 I'" I"I"1"1 y 14 110 1'01"1 '"I'YI'"I"I"I"I .412512612712812913°13113213313413513613713813914°1411 42 143 11111 1 I 21314151617181911°1 n 1121131141151'61171'811912°121122123124 j25j261 271281291 301 31132133jJ41351361 3713813914°1 411421431 44145 il~f~t\~i~:igf;::11111111111 11111 II III 11111 III 1111111111111111111111111111111111111111111111111111111111111111 FIGURE 1. SUMS OF SQUARES AND PRODUCTS (5 VARIABLES) SUM OF PRODUCTS ANDIOR SQUARES BY CARD CYCLE TOTAL TRANSFER 25---UPPER 8RUSH£S-·-35-----40 000000000 000 000 0 0 0 0 0 In several types of mathematical and statistical calculations it is desired to obtain the sum of produets or squares. Normally for such calculations first the individual multiplications are carried out 01' the squares are calculated and then their sum is obtained by adding up the individual products, although actually only the sum 'of the products is desired. For example when an inventory is taken the quantity of each item. is multiplied by its unit price and the sum of the products is performed to arrive at the total value of the items. In psychological tests, when calculating the standard deviation, only the sum of squares has to be obtained. For correlation analysis the sum of the squares and products of scores (rating of individuals) are needed as ~X,', ~X, X 2 , :sX 1 X a, ~X, X 4 , etc .. Several IBM methods are known by which the total of products or squares is obtained on Electric Accounting Machines without carrying out the individual calculations. Such methods, as digiting, progressive digiting and digiting without sorting, are also discussed in Pointers. 
Generally one run through the Accounting Machine is required for each position of the multiplier for calculating the sum of products or squares. The partial totals obtained have to be added up in order to secure the final total. The sum of squares and or products can be obtained by a single run of the IBM cards through the Type 405 Accounting Machine. The cards are sorted together with a set of X-punched "digit cards" in descending order (from highest to lowest) on the field of the multiplier (or field to be squared). If the multiplier field. consists of one column, 9 digit cards (9-1), two columns 99 digit cards, three columns 999 digit cards, etc., must be sorted in. If there is no detail card for a number, at least a digit card for that number will be present. After the file of cards is sorted, all digit cards preceding the first detail card are removed. The last card of the file will be the digit card number "1". The cards are tabulated on a Type 405 Accounting Ma:chine and totals accumulated by multiplier group are card cycle total transferred from one Counter-Group into another to obtain the final total of squares and/or products. By this procedure different sums of squares and/or products may be obtained simultaneously. This is especially important for correlation analysis. N 4::0.. 4::0.. Exhibit A shows a plugging for obtaining the sum of squares of variable X, and the sum of products of variables X, X 2 • Two Counter-Groups are necessary to obtain each sum. One of each pair of Counter-Groups must be eQuipp'ed with Card Cycle Total Transfer Device. The'multiplicand X 2 , (Columns 29-31)is plugged directly to Counter-Group 8B (plugging 1). Columns 26-28 ( as multiplicand) are plugged to Counter-Group 8A through the NO-X side of Selector E in order to prevent accumulating from digit cards (plugging 2). The add and subtract impulses are also under control of this Selector (plugging 2*,). When a digit card passes the upper brushes the X in column 80 sets up Selector E so that when this card passes the lower brushes Selector E is in controlled position and accumulation ftom this digit card is eliminated. When a digit card passes the iower brushes Counter-Groups 8A and 8B subtract and the totals standing in them are transferred to Counter-Groups 8C and 8D respectively which add these totals (plugging 3). Simultaneously, these totals may be listed (plugging 4). After the transfer, Counter-Groups 8A and 8B (equipped with Card Cycle Total Transfer Device) will contain the same figures that they did before the transfer. If there are no detail cards but only digit cards present for a group, the totals transferred for the previous O'roup will be transferred again for this group. After all cards have passed through the ma~hine Counter-Group 8C contains the sum of squares (:sX,'), and Counter Group 8D the sum of products (~X, X 2 ). These totals are printed as final totals (plugging 5). The final figures standing in Counter-Groups 8A and 8B are the totals of the single items (:sX, and-:SX .• ) accumulated from the fields .26-28 and 29-31 respectively. These totals are the last items listed by plugging 4. Counts by multiplier group will be obtained in Counter-Group 6B and total counts in Counter-Group 6D (plugging 6). Counter-Group 6B must also be equipped with Card Cycle Total Transfer Device for restoring itself after each transfer. Exhibits B and C illustrate an example for this application. STUDENT INTELLIGENCE SERIAL NO. 
001 002 003 XI SOCIAL NATURAL STUDY SCIENCE 39 15 08 38 31 32 MATH HISTORY X4 X5 Xs 39 21 29 40 18 38 17 15 20 104~I 36 35 1 o 0 0 0 0 0 0 0 0 0 0 0 0 0 000 0 0 0 ~1 41 ~I 120 241 35 13 11 1 ~-~ ~ I DI' "' j I'"' o cI 0.01 SE;IECTORlsX0 0 0,0 0,0 0 o 0'0 0 o 0-0 0 o _ _ - "~ ALPJ _ _ -- o 000 :::X,t PRODUCTS OF VARIABLE ONE SCORE WITH SCORES OF THE REMAINING FIVE VARIABLES AND WITH THE SUMS OF ALL SIX VARIABLES. :::X, Xl '59 38 37 36 35 34 33 32 31 30 29 28 CHECKING S IEXHIBIT B I '-0 7 ~I 078 ALPHAMERICAL TYPE-eAR ZONE MAGNETS--_ _ ~ It TOTALS X3 X2 LANGUAGES ooooooooooooooooooo~ 212 129 168 2 2 1 1 5 4 3 7 1 78 154 154 190 225 225 390 518 611 821 850 8.1.2 74 152 152 186 224 224 394 528 621 854 887 887 8.!7 :::X, X. 65 122 122 ~42 167 167 278 347 432 578 595 595 595 :::X, X4 74 144 144 178 209 209 364 472 581 818 855 855 Ri;O: lEXHIBIT 8 7 6 5 4 3 - - 2 '1 619 6 2 ~ 142 158 1619 78 1619 1619 1643 1649 1649 1649 43163 -- FIGURE 2 ZJ47 2347 2347 2347 2545 2615 2615 2615 5553" 1479 1479 14'79 1479 1623 1663 1663 1663 .5 585.4. 2218 2218 2218 2218 2424 2478 2478 2478 52913 :::)(, Xs 2·8 59 59 70 82 82 150 212 261 345 356 ~~~ cl ~~ 958 958 1037 1067 1067 1067 22398 :::X, XS :::X,S 86 165 165 213 262 262 475 586 720 1001 1029 1029 1029 405 796 796 979 1169 1169 2051 2663 3226 4417 4572 4572 457 -:648 :<648 2646 2648 2880 2948 2948 2948 63·277 11269 11269 1 1269 1 1269 12152 12420 12420 12420 273142 C. .x x 0,00,0-0 0.0 0.0-0 0.0 0'0-0 0-00-0-0 o~o 0'0-0 145 SEMINAR CORRELATION OF ENTRANCE TESTS TO GRADE POINT AVERAGE XI X2 X3 AC E ACE ACE Q L T X4 X~ X6 X7 Xa COOP COOP COOP COOP MATH ENG ENG ENG TOT ENG USE SPELL VOC TOT X9 GPR 00 0000 000 o 0 0 000 000 000 000 000 00000000000000000000000000000000000000000000000000000 11 3456 ) 8 9 101111 131415 1617 18 192021 222324 252621 28 29 30 31 32 33 34 35 36 31 38 39 40 41 42 43 44 45 46 41 48 49 50 51 52 53 54 55 56 51 sa 59 60 61 62 63 64 65 66 61 68 69 10 11 12 13 14 15 16 17 18 19 80 11 1111 111 111 111 111 111 111 111 11111111111111111111111111111111111111111111111111111 22 2 2 2 222 222 222 222 222 222 2)222222222222222222222222222222222222222222222222222 T22 333333 3 3 3 333 3 3 3 3 3 3 333 33 3 333 33333333333333333333333333333333333333333333333333333 44 4444 444 444 444 444 444 444 444 44444444444444444444444444444444444444444444444444444 55 5 5 5 5 5 5 5 555 555 55 5 555 555 555 55555555555555555555555555555555555555555555555555555 66 6666 666 666 666 666 666 666 666 66666666666666666666666666666666666666666666666666666 77 7777 777 777 777 777 777 777 777 77777717771777777777777777777777777777177717777777777 8 8 8888888888888888888 888 888 88888888888888888888888888888888888888888888888888888 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 99 919 9 9 999 999 99999999999999999999999999999999999999999999999999999 12 3 4 5 6 1 8 9 10 II 12 13 1415 16 17 18 1920 21 22 23 24 25 26 21 292930n~~34~36V38394O~U~44~46U484850~~~54~56~sa~60~~6364~66~68~ron12nM~~17n~80 IBM FIGURE Remove all digit cards preceding the first detail card and tabulate on an IBM Type 40S Alphabetical Accounting Machine. The following results are obtained: a. Count of detail cards used b. ~Xl c. lXr d. ~XIX2 e. lXtXs f. lXIX4 g. lXIX5 Post these results to the first horizontal line opposite X in the correlation chart illustrated in Figure 4, page 146. 6. Change wires in accounting machine control panel as follows: Variable field 6 to control field 2 " "7" " "3 8 " " 4 9 " " S Tabulate cards the second time in the same sequence. The following results are obtained: a. C.C. of detail cards b. 
~XI c. ~xi d. ~XIX6 . e. lXIX7 f. lXIXS g. ~XIX9 7. Sort column 80, separate digit and detail cards. Place digit cards in front of detail cards and sort the variable X 2 fi~ in descending sequence, the same as the variable Xl fiel was sorted. 8. Change wires in control panel as follows: Variable field 2 to control field 1 " "3" " "2 " 4" " 3 S" " 4 6 /( " S Remove all digit cards preceding the first detail cards and tabulate. The following results are obtained: a. Card count of detail cards b. ~X2 c. ~X~ d. lX2XS e. lX2 X4 3 f. lX2X5 g. lX2 X 6 Post these results to the second line of correlation chart. 9. Change wires as follows: Variable field 7 to control field 2 " "8" " "3 9 " " 4 1 " " S Tabulate with cards in same sequence. The following results are obtained: a. C.C. of detail cards b. lX2 c. lxl d. lX2X7 e. lX2XS f. lX2X9 g. lX2 X l Check items 1, 2, 3 to the posting of the second line of correlation chart. Complete posting of the second line with items 4, 5, 6, 7. Check posting of X 2X I on the second line with posting of XIX2 on first line for verification. 10. Sort, tabulate, and post the results of each variable Xs through X 9 the same as X I and X 2 to complete table shown in Figure 4. A study of this table will readily reveal the ease with which the data may be substituted in the various multiple correlation formulas by the calculator. No attempt, at this institution, has been made to compute the formulas themselves by punched cards. Advantages of the foregoing methods are: 1. Accuracy a. All sums of cross products are totaled twice, in separate machine operations, with the cards in a different sequence and through separate machine circuits. There can be no doubt of the accuracy if all cross checks (X1Xa against XaXv etc.) are in balance. b. Card counts for each tabulation prove cards are all present. c. The sums of each variable, which are wired to a control field, are produced at the end of each run. 146 COM PUT A T ION CORRELATION TABLE Card Count lX lX2 Xl 117170 4634S/ 563215"4 Idoo '2lr h Xl -- X2 117170 X4 2~'/"" - 1711942/53 //UJ/1 £'7A/7t9. ImrS432189 fI~~217g9 l""I:jIS"7S4 7/!942/.'P, //)/)0 4657/« 17~~ ._.1",," 17.:7.5hA~~ t4.lA /00,/1 /5.~7~/6 12;.: = All' Note also that if Aji = Aij then AU) = A~A) for i, j Step i 2. > 1. Find the Ag) of the largest absolute value, for = 2, 3, ... , p. This coefficient may be labeled Ag), with- out loss of generality. Form the factors (9) and put Then Aij2) = AU) ri 2) = r~l) - m~2) Ag), m~2)r~l), Ag) = 0, while A g) i = 1, 3, 4, ... , p. ... , p, let A~e;1.e+1 be the one of largest absolute value. Then the relations mi H1 ) = A~~~+l/A~e;1.C+1 (10) Air1 ) = AIr - mie+l) A~el1.j, i =F e 1, will carry the diagonalization process one step further. After p such steps the matrix (A ij , ri) will look like this: + 0 0 0 Ag) 0 0 ( 11) 0 0 0 A(p) pp The desired solutions are, therefore, Cj = - rjP)//AW. For purposes of checking calculations it js useful to carry along at each step the sums of all coefficients in each row. If, namely, one computes at the start (12) and, during step 1, transforms the Ti together with the r~l) : Til) = Ti-m~l)TH (13) then, obviously, L:AJ]) r~l) T/1) = O. (14) + A similar relationship holds for each step. Once the first solution C(1) (ci1 ), cP), . . . , c;1)) has been obtained, it takes considerably less time to get an improved solution c = C(l) + d. 
If, namely, equations (4) are written in matrix form: Ac + r = 0, and the residuals due to c(1) are t (tv t2 , • • • , tp ) : Ac(1) + r = t, then Ad t = 0, and since A has been diagonalized previously, it is necessary only to transform the column t. The relay calculators, completing a whole step with each machine run, took about 20 hours to solve the problem with the desired degree of accuracy. From the experience gained thus far, it may be stated that the coupling of the IBM Relay Calculators has resulted in a decidedly superior computing machine. == = Ag). Also, Ag) = AiI) = All' but A112 ) 0 for i =F 1, since A n-) = 0 for these i. Thus, the transformed matrix has the form o AW Al~) A2~2) A~:) o Aii) A 3(32 ) A~i) o AU) AJi) == + r~2) Again, if the original matrix Ai} is symmetric, then AiP = Ajl) for i, j > 2. Generally, let us assume that the first e columns have been so transformed. Among the AI.~~H i = e 1, e 2, + + + J An Improved Punched Card Method for Crystal Structure Factor Calculations* MANDALAY D. GREMS General Electric Company THO S E of you who have worked with calculations dealing with the structure of complex crystals, are reminded, probably, of the long monotonous operations involved. For this reason, a few persons here and there have attempted to find methods for simplifying the tremendous amount of hand calculations. Shaffer, Schomaker and Pauling, at the California Institute of Technology, were the first to report a method using the IBM equipment for this purpose. However, at our own Research Laboratory at the General Electric Company in Schenectady, there is a group of scientists who have spent considerable time and effort on this work, both analytically and theoretically. After a few discussions of their problem, it seemed more efficient and better suited to the IBM equipment to begin with the general expression, 3. Parameter cards. One card for each set of trial parameters x, y, z. The number of cards depends upon the unit of structure. These cards are used for a specific calculation. First, reproduce the set of reflection cards as many times as there are sets of parameter cards, gang punching a set of x, y, Z values on each reflection deck. If there are 400 reflections and eight sets of parameters, then there are 3,200 detail cards each containing an h, k, I, x, y, and z. There are four main machine operations in the solution of this problem. The two important steps, or the two contributing the most to a mote compact and general procedure, are steps I and III. a, Fhkl = + I. Forming the quantities = (hx, + ky, lz,) II. Obtaining the cosines and sines of III. Multiplying the trigonometric functions by the scattering factors, f, IV. Summing the previous products N Lf, cos 2'7T (hx,+ky,+lz,) + i Lf' sin27r (hx,+ky,+lzj) J=1 N a, a, j=1 Step I indicates the formation of the quantItIes = (hx,+ky,+lz,). Using the IBM Type 602 Calculating Punch with the above detail cards, it is possible to find b" c"and d, at the same time-that is, with only one passage of the cards through the machine, where rather than to use a specific and modified expression for each type of structure factor calculation. This expression doesn't look difficult until you consider that it involves many combinations of the refleCtions h, k, I with the trial parameters x, y, Z to find the best sets of x, y,z. At the beginning, three separate decks of cards are key punched: 1. Table cards. For sin 27r1% and cos 21r1X, where IX ranges from 0.001 to 1.000, in intervals of 0.001. 
This pack is used for all crystal structure calculations. 2. Reflection cards. One card for each reflection h, k, l. This card also contains the scattering factor for each kind of atom, the temperature factor, and the absorption factor (if known) for that particular reflection. These reflection cards are used for all trials for a specific crystal structure factor. a" + kYI+ IZj , + ky, - IZj , hXj ky, + IZj, = hx, bj = hx, aj CJ = d, =- hx, + ky, + IZj • As the next step involves looking up the sine or cosine of the quantities a, b, c, and d, it is sufficient to carry only the decimal places in the product and sums. Therefore, multiply h by x, and carry the three decimal places to the four summary counters, adding the product in counters 14, 15, and 16 and subtracting the product in counter 13. Then multiply k by y, add the three decimal places of the product into counters 13, 15, and 16, and subtract it into counter 14. In the same manner, multiply, I by z, add the three decimal places of the product into counters 13, 14, 16, and subtract it into counter 15. To eliminate the possibility of cal- *This method was presented atthe American Society for X-ray and Electron Diffraction in Columbus, Ohio, on December 16, 1948. It also appeared in the December issue of Acta Crystallographica, V 01. 2, Part 6. 158 159 SEMINAR culating a negative value, add 1.000 in each of the counters 13, 14, 15, and 16 on the reading cycle. Now it is unnecessary to include negative ~'s in the sine and cosine table. When these four sums are punched, each card contains a positive number for a, b, c, and d. For certain symmetry elements the structure factor will contain any or all of the terms b, c, d, as well as a; so there is a considerable reduction in the number of atomic parameters necessary when all can be found at one time. This, also, makes the procedure general for most structures. The effect of symmetry is illustrated by the soace group P mmm , atoms at x y z, x y z, x y z, % y z % y Z, oF Y z, oF y z, x y Z , NI8 = 8 ~fj(cos 2-rraj + + cos 2-rrbj cos 27rcj + cos 2-rrdj). j=l As you notice, the structure factor can be written in terms of a, b, c, d. Therefore, the parameters of only 1/8 of the atoms in the unit cell need be considered. N ow look at another space group P nnm, for example: For (h k + l) = 2n, Fhkl + NI8 Fhkl = 8 ~fj( cos 2-rraj For (h j=1 k + l) + + cos 27rb + cos 27rcj + cos 2-rrd j ). j = 2n + NI8 = 8~fj( cos 27raj + 1, cos 2-rrbj - cos 2-rrcj - cos 27rdj ). j=1 For this space group, there are two different expressions for F hkl , depending upon whether (h+k+l) is odd or even. They both contain a, b, c, and d, but the algebraic combinations of the cosines differ. This does not change our general procedure, however, and it is a simple matter for the machines to separate the two groups at the proper time. At step II, use the previously prepared sine or cosine table deck; sort, collate, and intersperse gang punch the table values on the detail cards for the four factors a, b, c, d where cos 27raj = A j , cos 27rbj = B j , cos 27rcj = Cj , cos 27rdj = Dj . Each detail card should now contain the following: h,k,l,x,y,z,a,b,c,d,A,B,C,D as well as a code for the type and number of the atom. Until step III there was no need for a particular arrangement of the cards. At this point, the cards must be sorted on the column indicating the kind of atom. The different kinds are kept separate, and these packs are sorted according to reflection h, k, l. 
Using the accounting machine with the summary punch attached, punch on blank cards the (A, B, C, D) for each Fhkl reflection. Repeat this operation for each kind of atom separately, where ~ cos 2-rra] = ~A1 = Aml , and ~ cos 27ra2 = ~A2 = A m2 , etc. Therefore, each summary card for atom ( 1) now contains the code for kind of atom, the reflection, and (A, B, C, D) m 1 ; summary card for atom (2) contains (A, B, C, D) m2 substituted for (A, B, C, D) mL At this time, it is necessary to· refer to the expression for the particular structure being studied, in order to determine how to combine (A, B, C, D) m1 or m2. Referring to space group P mmm, (A + B C D) m 1 or m 2 is necessary. Referring to space group P nnm, (A + B C D) m 1 or m 2 for 2n ; and (A+B) - (C+D) m 1 or m 2 for 2n + 1. This is only a minor change on the IBM type 602 control panel to perform either operation. Another variation for a complex group can be done simply and easily at this time, if both F hkl and Fiikl are required. The sums of the A, B, C, and D's can be calculated for both at the same time. Only one set of reflection cards (hkl) are required until the final stages of the work. After the A, B, C, and D's are combined properly, the sum is multiplied by the proper scattering factor, fml or fm2. fml(A+B+C+D) m 1 = Rm 1, etc. The final step consists of simply adding these products together for the prop~r reflection and multiplying by a factor for that reflection, T hkl (Rm 1 Rm 2 ) = F hkZ • It is usually of interest to note the contributions of each kind of atom to the final result; so it is advisable to list the factors Rm1 and Rm 2 , as well as F hkZ, on the record sheets. + + + + + DISCUSSION Mr. Smith: How long did it take you to calculate, say, for the order of 600 reflexes for your space group P nnn or P mmm , or that order between 100 and 600? Miss Grems: It first took me twice as long as it did later, because I checked, and after I had done quite a number of these I found there was no point in checking the calculation lz or a, b, c, and d, because if of the quantities hx + ky there was an error it wouldn't make much difference. I would say roughly about three and a half days-perhaps four or four and a half. It added about an extra day to carry on the checking, although a good part of the checking could go on at the same time. I always did check the last part of calculations after I found the A's. Of course, it was a simple check. + 160 Mr. S11tith: That was for roughly about 500 reflexes? Miss Grems: That is right; and breaking it down to about eight y, X', z's. Afr. Smith: That would be roughly about a fifth of the time it would take you with a hand calculator, maybe less? Miss Grems: For the particular case about which I was talking, we found for both the Fhkl plus and F hkl minus, it took only a half-hour longer to get the F hkl minus. Chairman Hurd: Is the method which you have used, Mr. Smith, roughly analogous to this? lWr. Smith: Unfortunately, no. I have been using a method somewhat similar to the one they use at California Tech., which differs somewhat from this; and, unfortunately, most of it has been done on the hand calculator. Also, unfortunately, the last case, instead of having, say, eight COMPUTATION terms, had twenty terms in the general space group. It was a little more involved than that, but I was able, by using some Fourier transforms, to eliminate the necessity of calculating those two longer terms. Mr. 
Thompson: Regarding the layout cards for master cards, which most of us use, our local IBM man made a very good suggestion of which some of us may not have thought. He suggested that we punch every hole in the card. When you want to read a detailed card, you put the layout card right over the detail card as a mask, and this makes it very quick to read. A couple of warnings, however: When you do this, don't punch every hole at once. If you punch them all simultaneously, two things hlight happen. The punches may stick in the dies or, as a matter of fact may come out of the left-hand side. It is advisable to send them through about eight times. / The Calculation of Complex Hypergeometric Functions with the IBM Type 602-A Calculating Punch HARVEY GELLMAN University of Toronto THE hypergeometric function F (a,b ; c ; z) is usually defined, for purposes of calculation, by the hypergeometric senes: FC a, b',c,. z ) - 1 - + a·b + a(a+1) b(b+l) 2 1 z 1 2 ( 1) z ·c . ·c c+ + a(a+l) (a+2) b(b+l) (b+2) 3 (1) 1.2.3'c(c+l)(c+2) z + .... this series being convergent for Iz I < 1. Many physical problems lead to integrals which can be expressed in terms of hypergeometric functions, and many important functions employed in analysis are merely special forms of F (a,b ;c ;z). Thus: (1+z)n = F( -n,p;{3 ;-z) log (1+z) = zF(1,1;2 ;-z) 1" (z) = lim !(z)v ( Z2 ) r(v+l) F A,p.;v+l; - 4Ap. . The purpose of this paper is to describe a method for computing F(a,b;c ;z) from (1) when a, b, c, and z are each of the form x+iy;x,y real, i2 = -1. We were confronted with these functions through the problem of internal conversion of y-rays emitted by a radioactive nucleus. The radial integrals arising from the calculation of transition probabilities were expressible in terms of hypergeometric functions. Our problem involved 90 hypergeometric functions, and on the basis of a preliminary estimate, 99 terms of the series were used. Such complex hypergeometric functions cannot be conveniently tabulated since they involve eight variables, and so a method is required which will readily yield F(a,b; c ;z) for special values of a, b, c, and z. + The six numbers representing the real and imaginary parts of a, band c are key punched on cards, and the coefficients a2 , aH • • • , do are computed by basic multiplications and additions. This computation requires eight sets of cards which are later combined on the reproducing punch to yield a set of master cards for the coefficients of the polynomials in n. The layout for this master set is given below: MASTER SET 9 (2) [~: + i ~:] [ (FaFl) Zl Z2 (F2)] Fa + .[ $ Z2 (FI) Fa + ZI + Machine Procedure + ... = + + + + + + + where a = A 1 +iA 2 ; b = Bl +iB 2 ; c = C 1 +iC2 ; z = Zl+ iz2· ThenF(a,b;c;z) = l+fofl+fofd2+foflf2f3 (3) The expanded form of fn can be written as: fn = (zl+iz 2) + + e: Calculations \Ve begin by defining . (a+n) (b+n) fn = gn $hn = (n+l) (c+n) z whereF I ='Fl(n) = n3+n2(Al+Bl+CI) + n[AI (BI+C1 ) - A 2(B 2-C2) + B1C1 + B 2C2] + [AI (BIC1 +B 2C2) + A2(BIC2-B2Cl)] (5) 2 3 = n a2n + a1n ao F2 = F2(n) = n 2(A 2+B 2-C2) n[A 2 (B I+CI ) Al (B 2-C2 ) + B 2 CI - B 1C2 ] + [A2(B1Cl+B2C2) - Al (B IC2-B 2C I )] (6) = b2n2 bIn bo 2 3 Fa = Fa(n) = n n (2C I+l) n(Cl+C~+2CI) + (Cl+CD (7) = n3 d2n 2 dIn + do Thus, our object is to compute (7), (6), (5), (4) and (3) in that order. 
(F2)](4) Fa 161 Card Columns Data Remarks 1-2 3-10 11-18 19-26 27-34 35-42 43-50 51-58 59-66 67-74 group number a2 al ao b2 bl bo d2 d1 do each F (a,b ;c ;z) defines a group; we required 90 values of the hypergeometric function 79 80 'X' set number 9 162 COMPUTATION Computation of Polynomials A set of detail cards (set 10) containing n, n 2 , and n S for n = O( 1) k is generated on the reproducing punch. This set contains (k+ 1) cards for each hypergeometric function to be evaluated, k being the highest value of n used. In our calculation each group contained the same number of cards (i.e., the same value of k was used throughout) to minimize the handling of cards by the operator. The master set 9 cards are inserted in front of their appropriate groups in detail set 10, and three separate group multiplications are performed on the 602-A calculating punch to yield F l , F2 and Fs. The planning chart and control panel wiring for F s is shown in Figure 1. Sign Control in Group Multiplication Since the coefficients on a master set 9 card may be positive or negative, their sign must be retained in the machine for the complete group of detail set 10 cards following the master card. This is achieved by dropping out pilot selectors 4 to 7, which control the sign of multiplication, through the transfer points of pilot selector 1. Pilot selector 1 is picked up at the control brushes by the master card and is dropped out in the normal manner. Computing Fl/Fs and F2/Fa Since the polynomials in n can have from 6 to 12 significant digits, 12 columns are assigned to them. Detail set 10 cards are sorted on Fa into separate groups which have 12, 11, 10, 9 and 8 significant digits, respectively. Treating each of these groups separately, the above quotients are easily formed through basic division. The layout of detail set 10 is given below: Card Columns DETAIL SET 10 Data 1-2 3-4 5-8 9-14 15-26 27-38 39-50 51-58 59-66 group number n n2 79-80 set number 10 Remarks n, n 2 and n S reproduced from card table of nil! nS Fl F2 Fa F1/Fa F2/Fs Computation of fn From equation (4) it is seen that the computation of in requires a complex multiplication of (F1/Fs) + i (F2/Fa) by (Zl + i Z2)' and this is equivalent to four real multiplications grouped together as shown in (4). The quantities: group number, n, Fl/Fa and F2/Fs are reproduced from set 10 into a new set, 11, of detail cards. The values of Zl and Z2 are key punched into a new set, 12, of master cards. By performing a complex group mUltiplication from set 12 to set 11 as shown in Figure 2, the values of in are generated. In our case Zl was positive, and Z2 negative for all the groups, so that sign control on group multiplication as shown in Figure 1 was unnecessary in this operation. Consecutive Complex Multiplication Having obtained the fn in the previous operation, we now require the products fo, fofH fofd2' etc. The method used for this computation is given below in schematic form: Card No. Quantity Read from Card 1 2 3 fo go+ih o fl = gl+ih l f2 = g2+ ih 2 = Operation 1· fo=Ro+il n fofl = Rl +iIl fofd2 = R 2 +iI2 The planning charts and control panel wiring for this operation are shown in Figures 3 and 4, pages 165, 166. The essential features here are the retention of the calculated result from one card to act as a multiplier for the following card, and the conversion to true form of this multiplier if it should turn out originally as a complement figure in the counter. 
(The machine ordinarily converts complement numbers during the punching operation only.) In addition to this we must "remember" the sign of, say fofl' when we multiply it by f2 to form fofd2' The· scheme is started by feeding a blank card under the direction of a separate control panel which reads a 1 into storage 4R and resets all other storage units, counters and pilot selectors to normal. The panel of Figure 4 is then used with one group of set 11 cards. The first card of this group reads into the machine the numbers fo = go+ihoand has punched on it fo = Ro+ilo. The quantities go and ho are retained in the machine and are used to multiply f1 = gl+ih l from the second card which has punched on it fofl = Rl +iIJ> and so on. At the end of program step 6, counters 1, 2 and 3 contain R k , the real part of (fOflf2 .... h). If Rk is negative, the counter group will contain the complement of R k • On program 7, pilot selector 3 is picked up by an NB impulse, is transferred on program 8 and is dropped out on program 7 of the following card. Thus, the sign of Rk is remembered both during the conversion of Rk on program 8 and the multiplication by Rk on programs 2 and 6. A similar procedure is used for I k. The final step in the calculation of F (a,b ; c ; z) consists in summing Rn and In separately on the accounting machine and recQrding the value: k F(a,b ;c;z) =(1 k +2:Rn )+ i(2:In). n=o n=O ~i~m R£AI) CYCU , COUNTER STOlAGEUNIT OPERATION .. DlYR.-MULT. I(S NX •8) , n2 , DlVIDIND 2 I I I MVLTIPLY I I ~ I I I I I I I I 6 21. :Ill I I I SL ilJn 4) n I I I I ! I I I I I I I !tkz I ~~, I I I ~NSFEI? 4 t I ..It, I I I I I I ~ ~ 6 7 8 9 I I I I I I I I I I I I I . .UNITS POSlT~H ~!!,.m TO PUNCH 4 6L , 6R 7L ~~ SB) I I I I I I I I I I I I I I I I I I I I I I I I I I . fI'f-~~ I I I I ~O I I I RO I I I RoO ! f,; i ~ ft I I I 7R I I I I I I I I nl I I I I i I PUNcH I I ~. I ~. I 4L I ! TIfW(SFEI? 3R I n~ 17 I I 5 5 I 3 MVLIPLY .- I I I I T~ANSFER STORAGE UNITS .- I X I 2 3 (3'1-S~ : I I I I I I I I I I I I I CALCULATING PUNCH - TYPE 602A - CONTROL PANEL 25 o _ 0 -(4,5,6) OTO 0 0 0 0 0 26 27 28 37 38 39 40 41 42 43 44 o 0 0 o 0 oNo oeo 0 0 0 ~-(4,5,6)0 0 0 0 0 0 ~ :(4'~'I~)0 0 0 0 0 0 -00 0 0 0 0 0 -J~-=='---lCOUPLt t(4,~,t)0 EXIT - - 1 0 - - - - - - - - - 1 1 To Pilot Sel. Z and Col. Split o 0000000 0 --+------EXITS0000000 7 8 9 10 11 12 0000000 0000000 FIGURE 1. CUBIC POLYNOMIAL (SIGN CONTROL IN GROUP MULTIPLICATION) n3 + d2n2 + d1n + do = 163 Fs(n) if fa COUNTER STORAGE UNIT ~ OPERATION 1l DIVR.-MULT. I lR 3 READ CYCLE 1 DIVIDEND 2 1 10) 'i!!,I+J X I I I I I I I I ~:i 71. Vrt()LTIPLY ~2 MULTIPlY Z, I I I I l I 7 I I I I I I 'j' I I I I o 5 6 7 8 9 10 lJ 1 I 1 I 1 I .f! I I I ~ I I I ....:'J I I I I I I I I I I I RESET o IMP 0 0 0 0 0 0 0 0 C0 o TO 5 I I ~ I I I I 1,' I I ~ PUNCN I I ~ 1 I 1 37 38 39 40 41 42 43 44 o 0 0 ,.. 
01'TO 0 0 0 0 I 0 NO ® .Co 1 0 0 oTo 0 0 ~--,-,_.......,.":"-;:~rl~__~ J----'-'-t---'----7L----:::-0 0 OXO YY 0 000 b b o ~RO~ O~T I~PU~SE o 1 : : CALCULATING PUNCH - TYPE 602A - CONTROL PANEL 15 16 17 18 19 20 21 22 u U R~U V ~ ~ ~ : I 1 I I I I I I 1 I I I I : I "I (:U-: Z8) I I I I 000--0---0 0 I I I I I I I I I I I I o 0 0 0 0 0 0 0 0 0 0 0 -----DROPOUT---------D I RP : : O~:III tOO 0 PICKUP 0 0 _ 0 ____ 0 _ 0 rgllII'!~.DIGIT 0 0 0 IMP 0 ~I OR BALANCE DO RE Rp I ZO 5----cONTROL READING--15--~X=79~-~PUNCH 00000 DIGIT PICKUP ~~_ _"fY""""!r'-0 IMMEDIA TE PICKUP 1 I I I I 14 ?IJ~CII I R:O I 13 g~ I I I I I :y 12 1 I I I I I I R:O I I I I I I I I I I I I 1 I i} I I I I I I I I ::J I I I I I I I I I I I I I I I I I 7R 7L I I I I ~I' I 6R I I 1 PEADCYCL< IZ I . POSITI~~T~~IRED TO PUNCH (~J;) I I I MVLTIPLY Read I I 'UNITS 6L I RI?I I I r., : OR I R'(J I 1 r.2 Tl?ANSFEI? I I I I I I I 79 f.! I I 2 f-!i I I 8 TRANSFEK 9 1 OL Cf1;! I I I I I I 3R I I : I I I I I I 3L (11-:'8). I I MULTIPLY 2R I I I I I 2L i!zl(-) I I 6 I I Tl?ANSFEI? 6 5 I NX .. TfG4NSFER 5 I I I 3 .. 3 -;:' I 2 STORAGE UNITS I I I 0 0 0 0 0 0 0 0 0 0 0 0 ~ ~..........._4...__,C>--O-_<>_--c...__,C>--O-_<>___O""""":>__O()__o___o__<>~ - - - - - -------10 o 0 0 OTO 1 0 10 0 0 2 0 0 T 0 0 0 0 0 0 o 0 0 oNo 0 0 0 0 0 0 o 0 0 Oco 0 0 0 0 0 0 o 0 0 0 0 0 00000 0 os070RI~ o 0 0 0 0 0 0 Y Y ! b 0 030 0 S 0 oEOOOEO L M5~-~-4~ 0~0501 0 T T 6 00060 T o R E 7 080 0 9 _ _ _ _ _. . . . 0 CO0SEL~c~gRS 0 OTO To --, -=~~~~~!:~~:-I o ·0 0 o 0 0 12 000 o O_C_O_B~S-O_~04P. . . .~+_~~..L____~~~~+_~----r---_t_ o 0--0--0--0--0 o 0--0--0--0--0 FIGURE Zl 2. AA COMPLEX GROUP MULTIPLICATIDN (~:) - Z2(~:) = gn;z2 (~:) + Zl(~:) = hn 164 Pi~ot ~d Sel. 2 Col. Spill 8fill~~... 8~f!*... OPERATION !--=-ST.....:O.....:R~AGE~U~N~IT:-::II-F=:::::====C=O=UNhT-fER=::;===;===Hl-_ _-,-_ _---,-_ _- r_ _-r--.:S:.:.TO:.:R:::.A:.;:GE:...::UN:..::I.:..:TS:..;{l~'£;::nt,.::.:"".::"d:.....:.:fr.:::.",::.:",_a p~rc~~'::;s"'°ntr..1 /,4/74.1) lL DIYR~MULT. 1 DIYID~D ,3 4: 5 : 6 2L! 2R 3L 3R 1H-++++....:,+++++-':+++-t++-+-t:++-,j--jI~-H-H1(21-2~)1(29-34) I I +;1 go h. o READ CYCLE CARD I I I "J" 1 TRANSFER 4R / : "i~' "U6~ITS POSITI~~ M * T BE ~~RED TO PU~~H I "0" " I I ~~ D I 4L, • r1o'~::~; ~ , I' : RO : RO ;(Ro Ifor n=o ; n ¥- 0) for I +1.:. 2 MULTIPLY RO I : 1 !/! ttf C D 3 TRANSFER I 4 M()LTIPLY I I I 1.11. I RO I I ! ! I 5 TRANSFER , I I RO 6 MULTIPLY I TRANSFeR 7 RO 6, ." / PUNCH .t" I RO I I ' liPI/NCIf ( :(37- 42) I ,I ~1~~~.~~·~~~:++++++~:++~:~-HhHl I , 8 TRANSFER I ~. 9 TRANSFER RO I 10 RO MULTIPLY , TRANSFER 11 &. :+RO Pi/NCII ~l . I 12 $Ie I 11 I TRANSFEIC i~ ifh OPERATION ~ST:..:O:.:R~:GE=IY:"':R~:':'::::~__LT#-.-f'===D=Y:::D=EN=D==C::0=iUN..:;T..:::ERi=::;:::=::::;::===If-#----r,----r-----,------r--.:S:.:.TO.:.R:.::.A:.;:GE::...:.U:.::NI..:::TS~r-.-lU-tNI-TS-PO-S-lTIo"~~t~TU~I~IRED TO PUNCH fa ft;; lL 1 2 I 3 .4: 5 : 6 2L; 2R 3L lR II-++++t.!.'++++++'++++-H-+t'-H-j--ll-+t+l-HI(ZI - Z6)!(Z9 - 34) READ CYCLE CA~D lR : 2 I I + I. I I I I ~I 4L: , 4R 6L 6R * 1:" : RO I : RO I for n=o 7R I I 1 TRANSFER , :[t.1o) : ['(1 Itl 7L I I RO :'or n:l=e I RO 2 MtJLTIPLY 3 TRANSFER 4 MVLTIPlY , RO iJ.! ! I ,: RO 5 TRANSFER RO 6 MVLTIPLY , TRANSF# 7 1R, I 6, (:PU,yCH PI/NCH 8 TRANSFER I \1~~~J~;,~~-~~~':++"-H-h:rlH~:rlHHrHHI , -' Ro 9 TRANSFER 10 11 'MULT1'::LY ~I TRIINSFEI< '>1.. ! RO ~ PVNCf{ 12 TRANSFEi I' 43~48 3. 
CONSECUTIVE COMPLEX Card 1 : fo go + iho Card 2: f1 gl + ih1 ; fof1 Card 3: f2 g2 + ih2 ; fof1f2 FIGURE 165 MULTIPLICATION Ro RI R2 + ilo + ill + iI2 166 COMPUTATION ~ ______ Co_CO _________________________ I 2 gn h" ; Set. Set. ~~~ j 6 7 8 9 10 JJ 12 13 14 15 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 16 ________ 17 18 19 ~~ 20 ____ 21 22 0 0 r-~~~~~~-~~ oFoto; ~t~BA~AN~E pf;:;pTR~L R:AD~NGo-o-~-------'--PUNCH r-rr1 1 Rn-I In-I DIGIT PICKUP 1+---11 @-@ o P I L 0 T DR 00 0 0 0 0 POUT 0 0 0 0 READ DROP OUT IMPULSE 2 1~50 OT~ 0 0 0 10 OTO 0 0 0 0 0 ON 0 0 0 oNo 0 0 0 0 0 0 0 0 oco 0 0 0 0 0 0 0 0 oTo 0 0 0 0 0 0 0 0 0 0 0 0 0 oc-0° 0 S,Z) o T...-c;-.. 0 oTo 0 oNe 0 C 0 9 0 ~ 71 IiI I IaI I I o X RE~DING_15 0 3 o· 0 0 h" 0 000 L 0-9 0 5 0 0 r 0 ~el. zoo o 0 0 o o Q o Co 0 0 010 0 0 0 0 0 o NO 0 0 0 0 o Co 0 0 0 0 OTO 0 0 0 oNO 0 0 0 0 0 0 0 10 - - - -- 5 6 - 0 7 0 0 --- 9 0 10 - - 0---0---0 n=O + a2 L 4. 0 0 CONSECUTIVE COMPLEX l\.t[ULTIPLICATION where k is the last value of n III the series. F 2 and F 3 are checked in a similar manner. Since the third differences of a cubic polynomial are constant, an alternative check consists of finding 6,F 1 (n) F 1 ( n+ 1) - F 1 ( n ) 6,2F 1 (n) 6,F 1 (n+l) - 6,3F 1 (n) 6,2F 1 (n+l) - 6,F 1 (n) 6,2F 1 (n) = constant. k (n2) n=O +a 1 L (n) n=O + ao (k+l) =[k(ki 1 0 SKIP OUT k (n 3) 0 11 TENTHS Checking of Cmnputations The coefficients a2 , a v . . . , do of the polynomials in n are checked by manually checking one or two cards, performing the machine operation twice, and testing the resulting punches for double punching. The polynomials F 1 , F2 and F3 were checked separately by summation on the accounting machine according to the following formula: L 60 0 ao X5 0---0---0 FIGURE k o 0 0 6L 0---0---0 12 0 0 55 oTo 0 40 C T 20 0 35 \ )J2+ a [k(k+l)6(2k+l)] 2 1 + a1 [k(ki )] + ao(k+l) Generating Differences Generally, functions which are tabulated at equal intervals of the argument can be conveniently checked by taking differences up to an order which is sufficient to show a "jump" indicating an error. For this reason a planning chart and control panel wiring scheme is shown in Figure 5 for finding first differences of iii! COUNTER STORAGE UNIT OPERATION .. DIVR.-MULT. 1 lit 2 I I CYCLE SELECTOR 1 TRANSFER " 3 I TRANSFER AND 2 PVN&II SELECTOR 1 Tfi>ANSFER I I I I I I I I I I I I I TRANSFER : I I I I I I I I I I I I I I I I SELECTOR I I I I I ~~ :, I I I I I I I I I I I I 1 TRANSFER I I Ll, I PUNt'1I -r I TRANSFER I I I AND I I I PI/NCN ! ! ! I : }j I I , I RO I , 2 : (15-16) YZ I I I IT I I RO I I : PUNCH I I I : I I ! 3 I :+ ~ I I I I I I I I I 42 I I I : • PUNCII I Col 6 ~_lIl(aa~illll!~ OROBA~AN~E pfCK~P~_ _ _ _ _O_airj0~. 000000000000 o 0 DIGIT PICKUP 000000000000 IMMEDIATE PICKUP o 0 0 0 0 0 0 0 0 0 0 0 0 0 PUNCH CONTROL EXIT o 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -----4----DROPOUT----------------------- o 0 0 0 0 0 0 0 10 OTO 0 0 0 0 0 0 0 o 10TO oeo o 0 0 0 0 0 0 0 0 010 0 030 S o 0 o 0 0 0 0 oNe 0 0 0 0 0 0 o OCO 0 0 0 0 0 0 o 0 0 oTo 0 0 0 0 0 0 oE o • OEO L M 5 1------,--,10"'-,- 0~0501 0 T T 6 00060 T I I III I III r ~ IILl;~el ~ 0 0000 0 oNO 0 0 000 0 L 0·9 O~O 0 o CO 0 0 0 0 TOO 0 0 7 0 8 0 o 0 0 0 0 0 OCO 0 0 0 0 11 12 X5 o 0 0 0--0----0 oTO 0 0 0 TENTHS 0--0----0 SKIP OUT 0 0 0 0 0 N0 0 OCO 0 11 oTO 0 0 0 o ONO 0 0 0 o oCo 0 0 0 1 II I I 11 I 1·1 IJ 8 READINGlilll 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 7R 8R 0000000000 XIT ~~~~t~~~ro-~~~~~~~ flI ---~--~ II 1 1 III I-I I I III I I I I I 3 ... 
~~____ co SElECToJ 7 12 o 3 111 DIVISOR·MULTIPlIER_STORAGE ENTRY 0 0 0 0 0 0 0 0 0 0 0 0 4L 3R 0 0 0 0 0 0 0 0 0 6R 7L X2 10 8R 75 C "+1IiI01IiIIIiI0~~_1IIIIIfIII 10 0 NO 0000 00000000000000 30 35 40 0 0 0 0 0 0 0 0 0 55 60 0 0 0 0 0 0 0 n C OTO DIVIDE o 0 0000 121 151 L 0--0----0 0 0 6R 00000000 5R 8 000000000 COUNTER ENTRY 0 4 9 0--0----0 0 o 2 oR:::Rlf o 0 0 0 0 0 0 --------------EXITSo 0 o 0 o OCo 0 -------------0---0·090 ~ _ _ _ _ _~. . CO SELECTORS 0--0----0 ONO o 000 0 T 0 o OXO EADDROPOUTI~~020 -----1---- . T I J I 7R I I I TnPUNCH 7L !I I I I 6R , I I I I 6L RO I I . '''''''T< on > 10) is notoriously sparse in the literature of the subject, and because of the great importance of the problem in many fields, such a paper should be of interest to this seminar. In addition, it is hoped that research organizations will be stimulated to make investigations, which we, not being a research organization, are neither qualified for, nor encouraged to carry out. When simultaneous linear algebraic equations are met in engineering work, as they frequently are, it is usually necessary to solve a number of systems having the same coefficients, but different constants on the right-hand sides. It is shown in reference 1 that if the number of systems is at least four (i.e., if there are at least four columns of constants), it is economical to compute the inverse matrix. In our work this is generally the case, and for this reason matrix inversion is important. It is not practical to perform the inversion process manually (with desk machines) unless n < 10, and even for these small matrices, if it is necessary to invert a large number of them at anyone time, the work is done much more quickly, and accurately with the help of IBM. Therefore, both systems of high order (n > 10) and large groups of small systems are handled by the IBM group. The small systems are handled quite successfully by a variant of the well-known Aitken method. To give an idea of the efficiency with which this work is done, 48 fifth-order matrices were recently inverted and checked in 16 hoursan average of 20 minutes per inversion. This work was done using the IBM Type 602 Calculating Punch; use of the IBM Type 604 Electronic Calculating Punch would cut down the time somewhat, but not greatly so because of the large amount of card handling involved. Although this is far from being the most efficient use of the machines, those with experience in numerical inversion will recognize it as being amply justified. The large systems present special difficulties which remain to be solved. Not only does the number of operations 171 increase enormously with the order, making the process very slow, but also· such systems rapidly tend toward instability as the order increases-i.e., the rounding errors, which are inevitable-accumulate in a serious way. In this connection, see references 1 and 2. Because of the inherent instability of the direct methods, several well-known iterative methods (classical iteration, and the method of steepest descents) were tried. For large systems convergence of these methods is. much too slow. Convergence is theoretically assured for positive definite matrices. Positive definiteness was guaranteed by the expedient of taking A'A, where A' is the transpose of A, the matrix in question, and A-l = (A'A)-l A'. Despite the theoretical assurance of convergence, in this case, numerous iterations showed no evidence of convergence at all ! 
We probably did not give sufficient trial to this approach, but our negative results with the methods tried are confirmed by other investigators. s Iteration has its value in the improvement of an approximate solution obtained by some other means. In choosing among the many direct methods, several considerations are important: (1) The number of elementary operations should be a minimum. (2) The arrangement of work and order of operations should be convenient with respect to the peculiarities of the machines. Concerning the first requirement, it is asserted in reference 2 that the elimination method requires fewer operations than other known methods. There are numerous methods, however; which can be considered to be but slight variations of the elimination method. The method of partitioning of matrices, for example, is a generalization of the elimination method, and the various methods which involve pivotal reduction-those of Aitken, Crout, Doolittle, Gauss, etc.-are closely similar and require about the same number of operations. The method of determinants is an example of a method definitely inferior to those mentioned above. With respect to the second requirement, suitability for the machines, methods which include such things as repeated cross multiplication are to be avoided. *This paper was presented by title by Paul E. Bisch. 172 COMPUTATION A direct method which fits these requirements is a new variant (as yet unpublished) of the elimination method developed by Mr. Charles F. Davis of our IBM group. Although this method has several features to commend it for use with IBM machines (and with desk machines as well) it is not claimed that the successful inversion of several large matrices could not have been achieved by other methods. The method used "yas simply the elimination method with some new twists. Since the details are still being written up they cannot be given here. The point of most interest is that inversion of a matrix as large as 88 by 88 has actually been carried out satisfactorily, using standard IBIVI equipment. The prevailing opinion among the authorities is that the inversion of a matrix of this order is a practical impossibility (d. reference 3, page 2 and pages 6-8). In a sense this may be true, for this first attempt at inverting an 88th order matrix took about nine weeks and involved between 60 and 70 thousand cards. The better performance which should come with experience might still be prohibitively long and expensive from the engineering point of view. Nevertheless, the degree of success we have had seems hopeful. A word of explanation should be given about what we have considered "satisfactory" in the way of accuracy. Unfortunately this is a difficult question if one demands precise limits. The question might be phrased this way: If a solution 9f a linear system is substituted in the original equations and all the remainders are small, is the error in the solution small? How small? If the system is AX -B = 0 and a solution Xl is substituted there results a column of remainders R 1 AX1-B = R 1 · Elementary matrix theory gives the answers to the above questions in terms of the norms of the column vectors Rl and E1 = X - X H and the quantities A and 11', the upper and lower bounds, respectively, of the matrix A. 1/,\ IRll < IE11 < l/p.I R l\ . From this we see that if A is large and 11' small, the limits of error can be very wide; in particular if 11' is sufficiently smClll, lEI I may be large, although IRll is small. 
Thus, what is often considered to be good check may conceal large errors. The main difficulty, however, is that the quantities A and 11' are not known, and the work required to find good estimates of them is usually 'prohibitive. We have been obliged to get around this difficulty in a manner rather unsatisfying to the mathematician but wholly acceptable to the engineer. The engineer looks at the numerical results, and, with physical intuition as a guide, decides whether they are reasonable. To take an example, the solution of a set of 66 equations gave us the stress distribution of a sheet stringer combination. The regularity of the results and confirmation of what should be expected on the basis of experimental evidence convinced engineers that this was the "right" answer." Whether a given numerical value found in this way is correct to two, three, or more significant figures is not known. There is still another rough indication of accuracy. The experienced IBM operator working in a fixed field (8 digits in our case) can fairly well tell when things are behaving nicely or not. Common to all variants of the elimination method is a division at each reduction. The continued product of these divisors is the determinant, and although the determinant is large, some of these divisors may be small. The occurrence of small divisors means the loss of significant figures, i.e., the process blows up. In the reduction process of the 88 equations, as well as the 66, this difficulty was not apparent. Furthermore, iteration of solutions obtained showed convincing evidence of rapid convergence. Strictly speaking, the inverses of these large matrices were only partially determined. The earlier statement that it is economical to compute the inverse when there are more than four columns of constants needs qualification. It is not strictly true for the matrices of quasi-diagonal character (explained later) dealt with by us. However, the process effectively leads to a decomposition into diagonal and semidiagonal factors, from which with some additional work the inverse can be found explicitly. It is true that no general conclusions can be drawn from such limited experience as ours with matrices of a particular type. But matrices of this type occur frequently in structural analysis and elsewhere. The matrices spoken of may be considered to be made up of the finite difference approximants of linear partial differential equations. Each stepwise approximant has the important feature that coefficients of all but a few of the unknowns are zero. Thus, these coefficients can be arranged in such a way that the matrix has large triangles of zeros in the upper right and lower left corners. Such a matrix might be called quasi-diagonal at}d is certainly one of the most important types. Since the limit case, a diagonal, is stable (unless some diagonal element is zero) and trivially easy to invert, it is plausible to suppose that quasi-diagonal matrices are particularly stable. It is suggested that our experience lends weight to this supposition and that a study of quasi-diagonal matrices will yield much more optimistic estimates of precision than are found in the literature. On page 1023 of reference 2, von Neumann and Goldstine conclude that, "for example, matrices of the orders 15, 50, 150 can usually be inverted with a (relative) precision of 8, 10, 12 decimal digits less, respectively, than the number of digits carried throughout." 
By "usually" is meant that "if a plausible statistic of matrices is assumed, then these estimates hold with the exception of a low probability minority." These conclusions, which are the result of a thorough analysis, are valuable to those who are anticipat- SEMINAR ing the automatic computing machines of the future, but to those who think it might be practical to invert certain types of large matrices, using standard IBM equipment here and now, they seem unduly pessimistic. The critical question is that of "a plausible statistic of matrices." The estimates of von Neumann and Goldstine are made in terms of A and p., the upper and lower bounds, respectively, of the matrix--quantities not known in advance and very difficult to determine. The numerical estimates quoted above are the results of introducing statistical hypotheses concerning the sizes of these quantities. This is done in the form of the results of V. Bargmann concerning the statistical distribution of proper values of a "random" matrix. It is possible that a similar study of the quasidiagonal type, defined in this paper, might lead to less discouraging conclusions. It is stated in reference 1, page 59, that these estimates "can also be used in the case when the matrix is not given· at random but arises from the attempt to solve approximately an elliptic partial differential equation by replacing it by n linear equations." But no reason is 173 given, and perhaps this manner of stating the point indicates a certain lack of conviction. However, th~ original source of these estimates is not accessible to this writer. To summarize: On the basis of limited experience inverting matrices, it appears to us at North American that, contrary to prevailing opinion, it might be practical to invert certain important types of matrices of high order, using standard IBM-eguipment. What is needed is further statistical study of the~e" types, and, if the estimates of precision so obtained are fav6'r:able, a comparative study of known methods of inversion. 1. V. BARGMANN, D. MONTGOMERY, and ]. VON NEUMANN, "Solution of Linear Systems of High Order," U. S. Navy Bureau of Ordnance Contract NORD 9596, 25 (October, 1946). 2. ]. VON NEUMANN and H. H. GOLDSTINE, "Numerical Inverting of Matrices of High Order," Bulletin of the American Mathe'matical Society, Vol. 53 (November 11,1947), pp. 1021-1099. 3. H. HOTELLING, "Some New Methods in Matrix Calculation," Annals of Mathel1wtical Statistics, Vol. 14 (1943), pp. 1-34.


Source Exif Data:
File Type                       : PDF
File Type Extension             : pdf
MIME Type                       : application/pdf
PDF Version                     : 1.3
Linearized                      : No
XMP Toolkit                     : Adobe XMP Core 4.2.1-c043 52.372728, 2009/01/18-15:56:37
Create Date                     : 2010:10:08 19:20:25-08:00
Modify Date                     : 2010:10:08 20:56:33-07:00
Metadata Date                   : 2010:10:08 20:56:33-07:00
Producer                        : Adobe Acrobat 9.34 Paper Capture Plug-in
Format                          : application/pdf
Document ID                     : uuid:4ff907ca-f45d-4095-bd3c-196c35c6291b
Instance ID                     : uuid:2dd76ead-88a3-4403-a34e-029297e14c72
Page Layout                     : SinglePage
Page Mode                       : UseNone
Page Count                      : 173
EXIF Metadata provided by EXIF.tools

Navigation menu