1989_TI_SN74ACT8800_Family_Data_Manual 1989 TI SN74ACT8800 Family Data Manual

TEXAS
INSTRUMENTS

SN74ACT8800 Family
32-Bit CMOS Processor
Building Blocks

1989

1989

Overview

SN74ACT8818

16-Bit Microsequencer

SN74ACT8832

32-Bit Registered ALU

SN74ACT8836

32- x 32-Bit Parallel Multiplier

SN74ACT8837

64-Bit Floating Point Processor

SN74ACT8841

Digital Crossbar Switch

SN74ACT8847

64-Bit Floating Point/Integer Processor

Support

Mechanical Data

SN74ACT8800 Family
32-Bit CMOS Processor
Building Blocks
Data Manual

TEXAS INSTRUMENTS

IMPORTANT NOTICE
Texas Instruments (TI) reserves the right to make changes to or
to discontinue any semiconductor product or service identified
in this publication without notice. TI advises its customers to
obtain the latest version of the relevant information to verify,
before placing orders, that the information being relied upon is
current.
TI warrants performance of its semiconductor products to current
specifications in accordance with TI's standard warranty. Testing
and other quality control techniques are utilized to the extent TI
deems necessary to support this warranty. Unless mandated by
government requirements, specific testing of all parameters of
each device is not necessarily performed.
TI assumes no liability for TI applications assistance, customer
product design, software performance, or infringement of patents
or services described herein. Nor does TI warrant or represent that
any license, either express or implied, is granted under any patent
right, copyright, mask work right, or other intellectual property
right of TI covering or relating to any combination, machine, or
process in which such semiconductor products or services might
be or are used.

Copyright © 1988, Texas Instruments Incorporated
First edition: March 1988
First revision: June 1988
Second revision: June 1989

INTRODUCTION
In this manual, Texas Instruments presents technical information on the TI
SN74ACT8800 family of 32-bit processor "building block" circuits. The
SN74ACT8800 family is composed of single-chip VLSI processor functions, all of which
are designed for high-complexity processing applications.
This manual includes specifications and operational information on the following
high-performance advanced-CMOS devices:
• SN74ACT8818  16-bit microsequencer
• SN74ACT8832  32-bit registered ALU
• SN74ACT8836  32- x 32-bit parallel multiplier
• SN74ACT8837  64-bit floating point processor
• SN74ACT8841  Digital crossbar switch
• SN74ACT8847  64-bit floating point/integer processor

These high-speed devices operate at or above 20 MHz, while providing the low power
consumption of TI's advanced one-micron EPIC™ CMOS technology. The EPIC™ CMOS
process combines twin-well structures for increased density with one-micron gate
lengths for increased speed.
The SN74ACT8800 Family Data Manual contains design and specification data for
all six devices previously listed and includes additional programming and operational
information for the '8818, '8832, and '8837/'8847. Two application notes,
"Chebyshev Routines for the SN74ACT8847" and "High-speed Vector Math and 3D
Graphics Using the SN74ACT8837/8847 Floating Point Unit" are also included.
Introductory sections of the manual include an overview of the '8800 family and a
summary of the software tools and design support TI offers for the chip-set. The general
information section includes an explanation of the function tables, parameter
measurement information, and typical characteristics related to the products listed
in this volume.
Package dimensions are given in the Mechanical Data section of the book in metric
measurement (and parenthetically in inches).
Complete technical data for any Texas Instruments semiconductor product is available
from your nearest TI field sales office, local authorized TI distributor, or by calling Texas
Instruments at 1-800-232-3200.

EPIC is a trademark of Texas Instruments Incorporated.

Overview

Introduction
Texas Instruments' SN74ACT8800 family of 32-bit processor building blocks has been
developed to allow the easy, custom design of functionally sophisticated,
high-performance processor systems. The '8800 family is composed of single-chip, VLSI
devices, each of which represents an element of a CPU.
Geared for computationally intensive applications, SN74ACT8800 devices include
high-performance ALUs, multipliers, microsequencers, and floating point processors.
The '8800 chip set provides the performance, functionality, and flexibility to fill the
most demanding processing needs and is structured to reduce system design cost
and effort. Most of these high-speed processor functions operate at 20 MHz and above,
and, at the same time, provide the power savings of TI's advanced, 1-μm EPIC™ CMOS
technology.
The family's building block approach allows the easy, "pick-and-choose" creation of
customized processor systems, while the devices' high level of integration provides
cost-effectiveness.
Designed especially for high-complexity processing, the devices in the '8800 family
offer a range of functional options. Device features include three-port architecture,
double-precision accuracy, optional pipelined operation, and built-in fault tolerance.
Array, digital signal, image, and graphics processing can be optimized with '8800
devices. Other applications are found in supermini and fault-tolerant computers, and
I/O and network controllers.
In addition to the high-performance, CMOS processor functions featured in this data
manual, the family includes several high-speed, low-power bipolar support chips. To
reduce power dissipation and ensure reliability, these bipolar devices use TI's proprietary
Schottky Transistor Logic (STL) internal circuitry.


1-5


At present, TI's '8800 32-bit processor building block family comprises the following
functions:
• SN74ACT8818 16-bit microsequencer
• SN74ACT8832 32-bit registered ALU
• SN74ACT8836 32- x 32-bit parallel multiplier
• SN74ACT8837 64-bit floating point processor
• SN74ACT8841 digital crossbar switch
• SN74ACT8847 64-bit floating point and integer processor
• Bipolar Support Chips
  • SN74AS8838 32-bit barrel shifter
  • SN74AS8839 32-bit shuffle/exchange network
  • SN74AS8840 16 x 4 crossbar switch

20 MIPS and Low CMOS Power Consumption
With instruction cycle times of 50 ns or less and the low power consumption of EPIC™
CMOS, the '8800 chip set offers an unrivaled speed/power combination. Unlike
traditional microprocessors, which require multiple cycles to perform an operation,
the 'ACT8800 processors typically can complete instructions in a single cycle.
The 'ACT8832 registered ALU and 'ACT8818 microsequencer together create a
powerful 20-MHz CPU. Because instructions can be performed in a single cycle, the
'8832/'8818 combination is capable of executing over 20 million instructions per second
(MIPS).
For math-intensive applications, the 'ACT8836 fixed-point multiplier/accumulator
(MAC), 'ACT8837 64-bit floating point processor, and 'ACT8847 64-bit floating point
and integer processor offer unprecedented computational power.
The exceptional performance of the 'ACT8800 family is made possible by TI's EPIC™
CMOS technology. The EPIC™ CMOS process combines twin-well structures for
increased density with one-micron gate lengths for increased speed.

Customized Solution
The '8800 family is designed with a variety of architectural and functional options
to provide maximum design flexibility. These device features allow the creation of
"customized" solutions with the '8800 chipset.
A building block approach to processing allows designers to match specialized hardware
to their specific design needs. The '8818/8832 combination forms the basis of the
system, a high-speed CPU. For applications requiring high-speed integer multiplication,
the 'ACT8836 can be added. To provide the high precision and large dynamic range
of floating point numbers, the 'ACT8837 or 'ACT8847 can be employed.


1-6

To ensure speed and flexibility, each component of the '8800 family has three data
ports. Each data port accommodates 32 bits of data, plus four parity bits. This
architecture eliminates many of the I/O bottlenecks associated with traditional
single-I/O microprocessors.
The three-port architecture and functional partitioning of the '8800 chip set open
the door to a variety of parallel processing applications. Placing the math and shifting
functions in parallel with the ALU permits concurrent processing of data. Additional
processors can be added when performance needs dictate.
The 'ACT8800 building block processors are microprogrammable, so that their
instruction sets can be tailored to a specific application. This high degree of
programmability offers greater speed and flexibility than a typical microprocessor and
ensures the most efficient use of hardware.
A separate control bus eliminates the need for multiplexing instructions and data, further
reducing processing bottlenecks. The microcode bus width is determined by the
designer and the application.
Another source of design flexibility is provided by the pipelined/flowthrough operation
option. Pipelining can dramatically reduce the time required to perform iterative, or
sequential, calculations. On the other hand, random or nonsequential algorithms require
fast flowthrough operations. The '8800 chip set allows the designer to select the mode
(fully pipelined, partially pipelined, or nonpipelined) most suited to each design.

Scientific Accuracy
The '8800 family is designed to support applications which require double-precision
accuracy. Many scientific applications, such as those in the areas of high-end graphics,
digital signal processing, and array processing, require such accuracy to maintain data
integrity. In general-purpose computing applications, floating point processors must
often support double-precision data formats to maintain compatibility with existing
software.
To ensure data integrity, '8800 devices (excluding the barrel shifter and
microsequencer) support parity checking and generation, as well as master/slave error
detection. Byte parity checking is performed on the input ports, and a parity generator
and a master/slave comparator are provided at the output. Fault tolerance is built into
the processors, ensuring correct device operation without extra logic or costly software.
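
The byte parity and master/slave checks described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the choice of even parity is an assumption, since this overview does not state the parity sense used by the devices.

```python
def byte_parity_bits(word: int, even: bool = True) -> int:
    """Return four parity bits, one per byte of a 32-bit word.

    Bit i of the result covers byte i (bits 8*i+7 .. 8*i) of the word.
    Even parity is assumed by default; that choice is not taken from the manual.
    """
    parity = 0
    for byte_index in range(4):
        byte = (word >> (8 * byte_index)) & 0xFF
        bit = bin(byte).count("1") & 1   # 1 when the byte holds an odd number of ones
        if not even:
            bit ^= 1                     # odd parity inverts the generated bit
        parity |= bit << byte_index
    return parity


def master_slave_mismatch(master_out: int, slave_out: int) -> bool:
    """Model of a master/slave comparator: flag any difference between outputs."""
    return (master_out ^ slave_out) != 0
```

A 32-bit port value together with its four parity bits gives the 36-bit width quoted for each data port.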

1-7

The SN74ACT8800 Building Block Processor System
Some of the high-performance '8800 devices are described in the following paragraphs.

SN74ACT8818 16-Bit Microsequencer


In a high-performance microcoded system, a fast microcode controller is required to
control the flow of instructions. The SN74ACT8818 is a high-speed, versatile 16-bit
microsequencer capable of addressing 64K words of microcode memory. The
'ACT8818 can address the next instruction fast enough to support a 50-ns system
cycle time.

The 'ACT8818 65-word-deep by 16-bit-wide stack is useful for storing subroutine
return addresses, top of loop addresses, and loop counts. Addresses can be selected
from eight different sources: the three I/O ports, the two register counters, the
microprogram counter, the stack, and the 16-way branch.
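
As a rough illustration of how a 65-word by 16-bit LIFO stack serves subroutine linkage, the sketch below models push and pop in software. The class name and the exception raised on overflow are assumptions made for the example; the device's actual full-stack behavior (for instance, how the stack warning output is used) is not described in this overview.

```python
class MicroStack:
    """Software model of a bounded LIFO stack, 65 words deep and 16 bits wide."""

    DEPTH = 65
    WORD_MASK = 0xFFFF

    def __init__(self):
        self._words = []

    def push(self, word: int) -> None:
        # Raising on overflow is an assumption of this model, not device behavior.
        if len(self._words) >= self.DEPTH:
            raise OverflowError("stack full")
        self._words.append(word & self.WORD_MASK)  # keep only 16 bits

    def pop(self) -> int:
        if not self._words:
            raise IndexError("stack empty")
        return self._words.pop()

    def __len__(self) -> int:
        return len(self._words)


# Subroutine call: save the return address, later pop it to resume.
stack = MicroStack()
stack.push(0x0101)          # return address saved by a call
return_address = stack.pop()
```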

SN74ACT8832 Registered ALU
The SN74ACT8832 is a 32-bit registered ALU that operates at approximately 20 MHz.
Because instructions can be performed in a single cycle, the 'ACT8832 is capable of
executing 20 million microinstructions per second. An on-board 64-word register file
is 36 bits wide to permit the storage of parity bits. The 3-operand register file increases
performance by enabling the creation of an instruction and the storage of the previous
result in a single cycle. To facilitate data transfer, operands stored in the register file
can be accessed externally while the ALU is executing. To support the parallel
processing of data, the 'ACT8832 can be configured to operate as four 8-bit ALUs,
two 16-bit ALUs, or a single 32-bit ALU. The 'ACT8832 incorporates 32-bit shifters
for double-precision shift operations.
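
The four 8-bit / two 16-bit / single 32-bit configurations can be pictured with a small software model of a partitioned add. The function name is hypothetical, and the sketch assumes that each slice simply discards its carry-out at the slice boundary; the manual's overview does not spell out carry handling between slices.

```python
def partitioned_add(a: int, b: int, lane_bits: int) -> int:
    """Add two 32-bit words as independent lanes of 8, 16, or 32 bits.

    Carries are not propagated across lane boundaries (an assumption of
    this sketch), so each lane wraps around on its own.
    """
    if lane_bits not in (8, 16, 32):
        raise ValueError("lane_bits must be 8, 16, or 32")
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, 32, lane_bits):
        lane_a = (a >> shift) & lane_mask
        lane_b = (b >> shift) & lane_mask
        result |= ((lane_a + lane_b) & lane_mask) << shift
    return result
```

With the same operands, the result depends on the configuration: adding 1 to 0x0000FFFF carries into the upper half only when the ALU is treated as a single 32-bit unit.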

SN74ACT8836 32- x 32-Bit Integer MAC
The SN74ACT8836 is a 32-bit integer multiplier/accumulator (MAC) that accepts two
32-bit inputs and computes a 64-bit product. The device can also operate as a 64-bit
by 64-bit multiplier. An onboard adder is provided to add or subtract the product or
the complement of the product from the accumulator.
When pipelined internally, the 1-μm CMOS parallel MAC performs a full 32- x 32-bit
multiply/accumulate in a single 36-ns clock cycle. In flowthrough mode (without any
pipelining), the 'ACT8836 takes 60 ns to multiply two 32-bit numbers. The 'ACT8836
performs a 64- x 64-bit multiply/accumulate, outputting a 64-bit result, in 225 ns.
The 'ACT8836 can handle a wide variety of data types, including two's complement,
signed, and mixed. Division is supported via the Newton-Raphson algorithm.
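
Newton-Raphson division replaces the divide with repeated multiplies: a reciprocal estimate x is refined by x = x(2 - dx), and the quotient is then n times the reciprocal. The sketch below shows the iteration in floating point for clarity. The function names, the initial estimate, and the iteration count are choices made for this example and are not taken from the 'ACT8836 microcode.

```python
import math


def nr_reciprocal(d: float, iterations: int = 6) -> float:
    """Approximate 1/d for d > 0 using Newton-Raphson iteration."""
    if d <= 0.0:
        raise ValueError("this sketch handles positive divisors only")
    # Scale d into [0.5, 1): d = mantissa * 2**exponent
    mantissa, exponent = math.frexp(d)
    # Common linear first guess for a divisor in [0.5, 1)
    x = 48.0 / 17.0 - (32.0 / 17.0) * mantissa
    for _ in range(iterations):
        x = x * (2.0 - mantissa * x)   # each step roughly doubles the correct bits
    return math.ldexp(x, -exponent)    # undo the scaling


def nr_divide(n: float, d: float) -> float:
    """Compute n / d as n times the Newton-Raphson reciprocal of d."""
    return n * nr_reciprocal(d)
```

Each iteration needs only multiplies and a subtract, which is why the method suits a multiplier/accumulator.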

SN74ACT8837 64-Bit Floating Point Unit
The SN74ACT8837 is a high-speed floating point processor. This single-chip device
performs 32- or 64-bit floating point operations.

1-8

More than just a coprocessor, the 'ACT8837 integrates on one chip a double-precision
floating point ALU and multiplier. Integrating these functions on a single chip reduces
data routing problems and processing overhead. In addition, three data ports and a
64-bit internal bus architecture allow for single-cycle operations.
The 'ACT8837 can be pipelined for iterative calculations or can operate with input
registers disabled for low latency.

SN74ACT8841 Digital Crossbar Switch


The SN74ACT8841 is a single-chip digital crossbar switch. The high-performance
device cost-effectively eliminates bottlenecks to speed data through complex bus
architectures.
The 'ACT8841 is ideal for multiprocessor applications, where memory bottlenecks
tend to occur. The device has 64 bidirectional I/O ports that can be configured as 16
4-bit ports, 8 8-bit ports, or 4 16-bit ports. Each bidirectional port can be connected
in any conceivable combination. Any single input port can be broadcast to any
combination of output ports. The total time for data transfer is 20 ns.
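
The routing idea can be sketched as a table that names, for every output port, the input port it listens to. The model below treats the 64 lines as sixteen 4-bit ports. The function name and the list-based control format are hypothetical and do not reflect the device's control register encoding.

```python
def crossbar_route(inputs, select):
    """Route 4-bit port values through a crossbar.

    inputs: list of port values (each 0..15), one per input port.
    select: for each output port, the index of the input port driving it.
    """
    if len(inputs) != len(select):
        raise ValueError("one select entry is needed per port")
    if any(not 0 <= value <= 0xF for value in inputs):
        raise ValueError("port values must fit in 4 bits")
    return [inputs[source] for source in select]


ports = list(range(16))                        # port i carries the value i
broadcast = crossbar_route(ports, [3] * 16)    # every output listens to port 3
reversed_ports = crossbar_route(ports, list(range(15, -1, -1)))
```

Broadcast falls out naturally: several outputs may name the same input.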
The control sources for ten separate switching configurations are on-chip, including
eight banks of programmable control flip-flops and two hard-wired control circuits.
The EPIC™ CMOS SN74ACT8841 and its predecessor, SN74AS8840, are based on
the same architecture, differing in power consumption, number of control registers,
and pin-out. Microcode written for the 'AS8840 can be run on the 'ACT8841.

SN74ACT8847 64-Bit Floating Point Unit
The SN74ACT8847 is a high-speed 64-bit floating point processor. The device is fully
compatible with IEEE standard 754-1985 for addition, subtraction, multiplication,
division, square root, and comparison. Division and square root operations are
implemented via hardwired control.
The SN74ACT8847 FPU also performs integer arithmetic, logical operations, and logical
shifts. Registers are provided at the inputs, outputs, and inside the ALU and multiplier
to support multilevel pipelining. These registers can be bypassed for nonpipelined
operations.
When fully pipelined, the 'ACT8847 can perform a double-precision floating point or
32-bit integer operation in under 40 ns. When in flowthrough mode, the 'ACT8847
takes less than 100 ns to perform an operation.

1-9


Bipolar Support Chips

The SN74AS8838 high-speed, 32-bit barrel shifter can shift up to 32 bits in a single
instruction cycle of under 25 ns. Five basic shifts can be programmed: circular left,
circular right, logical left, logical right, and arithmetic right. The 'AS8838 offloads the
responsibility for shifting operations from the ALU, which increases shifter functionality
and system throughput.
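
The five shift types can be stated precisely with a short software model of a 32-bit barrel shifter. The function and mode names are illustrative only and are not the 'AS8838 instruction encodings.

```python
MASK32 = 0xFFFFFFFF


def barrel_shift(value: int, amount: int, mode: str) -> int:
    """Shift a 32-bit word by 0..32 positions in one of five modes."""
    if not 0 <= amount <= 32:
        raise ValueError("amount must be between 0 and 32")
    value &= MASK32
    if mode == "logical_left":
        return (value << amount) & MASK32          # zeros enter from the right
    if mode == "logical_right":
        return value >> amount                     # zeros enter from the left
    if mode == "arithmetic_right":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32         # sign bit is replicated
    amount %= 32                                   # a 32-bit rotate is the identity
    if mode == "circular_left":
        return ((value << amount) | (value >> (32 - amount))) & MASK32
    if mode == "circular_right":
        return ((value >> amount) | (value << (32 - amount))) & MASK32
    raise ValueError("unknown shift mode: " + mode)
```

The only difference between the two right shifts is what fills the vacated upper bits: zeros for the logical shift, copies of the sign bit for the arithmetic shift.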


The SN74AS8839 is a 32-bit shuffle/exchange network. The high-speed device can
perform data permutations on one 32-bit, two 16-bit, four 8-bit, or eight 4-bit data
words in a single instruction cycle of under 25 ns. The shuffle/exchange network is
designed primarily for use in digital signal processing applications.

1-10

SN74ACT8818

16-Bit Microsequencer

2-1


2-2

SN74ACT8818
16-Bit Microsequencer
• Addresses Up to 64K Locations of Microprogram Memory
• CLK-to-Y = 30 ns (tpd)
• Low-Power EPIC™ CMOS
• Addresses Selected from Eight Different Sources
• Performs Multiway Branching, Conditional Subroutine Calls, and Nested Loops
• Large 65-Word by 16-Bit Stack
• Cascadable


Continue ............................................. 2-40
Continue and Pop ..................................... 2-40
Continue and Push .................................... 2-40
Branch (Example 1) ................................... 2-42
Branch (Example 2) ................................... 2-42
Sixteen-Way Branch ................................... 2-42
Conditional Branch ................................... 2-44
Three-Way Branch ..................................... 2-44
Thirty-Two-Way Branch ................................ 2-44
Repeat ............................................... 2-46
Repeat on Stack ...................................... 2-46
Repeat Until CC = H .................................. 2-48
Loop Until Zero ...................................... 2-48
Conditional Loop Until Zero .......................... 2-50
Jump to Subroutine ................................... 2-52
Conditional Jump to Subroutine ....................... 2-52
Two-Way Jump to Subroutine ........................... 2-52
Return from Subroutine ............................... 2-54
Conditional Return from Subroutine ................... 2-54
Clear Pointers ....................................... 2-54
Reset ................................................ 2-54

2-6

List of Illustrations
Figure  Title                                                  Page
1       'ACT8818 GC Package .................................. 2-14
2       'ACT8818 FN Package .................................. 2-16
3       'ACT8818 Logic Symbol ................................ 2-17
4       'ACT8818 Functional Block Diagram .................... 2-27
5       Continue ............................................. 2-41
6       Continue and Pop ..................................... 2-41
7       Continue and Push .................................... 2-41
8       Branch Example 1 ..................................... 2-43
9       Branch Example 2 ..................................... 2-43
10      Sixteen-Way Branch ................................... 2-43
11      Conditional Branch ................................... 2-45
12      Three-Way Branch ..................................... 2-45
13      Thirty-Two Way Branch ................................ 2-45
14      Repeat ............................................... 2-46
15      Repeat on Stack ...................................... 2-47
16      Repeat Until CC = H .................................. 2-49
17      Loop Until Zero ...................................... 2-49
18      Conditional Loop Until Zero (Example 2) .............. 2-51
19      Jump to Subroutine ................................... 2-53
20      Conditional Jump to Subroutine ....................... 2-53
21      Two-Way Jump to Subroutine ........................... 2-53
22      Return from Subroutine ............................... 2-55
23      Conditional Return from Subroutine ................... 2-55
24      Clear Pointers ....................................... 2-56

2-7


2-8

List of Tables
Table   Title                                                  Page
1       'ACT8818 Pin Grid Allocation ......................... 2-15
2       'ACT8818 Pin Functional Description .................. 2-18
3       Response to Control Inputs ........................... 2-26
4       Y Output Controls (MUX2-MUX0) ........................ 2-32
5       Stack Controls (S2-S0) ............................... 2-33
6       Register Controls (RC2-RC0) .......................... 2-33
7       Decrement and Branch on Nonzero Encodings ............ 2-36
8       Call Encodings without Register Decrements ........... 2-37
9       Call Encodings with Register Decrements .............. 2-38
10      Return Encodings without Register Decrements ......... 2-38
11      Return Encodings with Register Decrements ............ 2-39

2-9


2-10

Introduction
The SN74ACT8818 microsequencer is a low-power, high-performance microsequencer
implemented in TI's EPIC™ Advanced CMOS technology. The 16-bit device addresses
up to 64K locations of microprogram memory and is compatible with the SN74AS890
microsequencer.
The 'ACT8818 performs a range of sequencing operations in support of TI's family
of building block devices and special-purpose processors such as the SN74ACT8847
Floating Point Unit (FPU).

Understanding the 'ACT8818 Microsequencer

The 'ACT8818 microsequencer is designed to control execution of microcode in a
microprogrammed system. Basic architecture of such a system usually incorporates
at least the microsequencer, one or more processing elements such as the 'ACT8847
FPU or the SN74ACT8832 Registered ALU, microprogram memory, microinstruction
register, and status logic to monitor system states and provide status inputs to the
microsequencer.
The 'ACT8818 combines flexibility and high speed in a microsequencer that performs
multiway branching, conditional subroutine calls, nested loops, and a variety of other
microprogrammable operations. The 'ACT8818 can also be cascaded to provide
additional register/counters or addressing capability for more complex microcoded
control functions.
In this microsequencer, several sources are available for microprogram address
selection. The primary source is the 16-bit microprogram counter (MPC), although
branch addresses may be input on the two 16-bit address buses, DRA and DRB. An
address input on the DRA bus can be pushed on the stack for later selection.
Register/counters RCA and RCB can store either branch addresses or loop counts as
needed, either for branch operations or for looping on the stack.
The selection of address source can be based on external status from the device being
controlled, so that three-way or multiway branching is supported. Once selected, the
address which is output on the Y bus passes to the microprogram memory, and the
microinstruction from the selected location is clocked into the pipeline register at the
beginning of the next cycle.
It is also possible to interrupt the 'ACT8818 by placing the Y output bus in a
high-impedance state and forcing an interrupt vector on the Y bus. External logic is required
to place the bus in high impedance and load the interrupt vector. The first
microinstruction of the interrupt handler subroutine can push the address from the
Interrupt Return register on the stack so that proper linkage is preserved for the return
from subroutine.

Microprogramming the 'ACT8818

Microinstructions for the 'ACT8818 select the specific operations performed by the
Y output multiplexer, the register/counters RCA and RCB, the stack, and the
bidirectional DRA and DRB buses. Each set of inputs is represented as a separate field
in the microinstructions, which control not only the microsequencer but also the ALU
or other devices in the system.
The 3-port architecture of the 'ACT8818 facilitates both branch addressing and
register/counter operations. Both register/counters can be used to hold either loop
counts or branch addresses loaded from the DRA and DRB buses. Register/counter
operations are selected by control inputs RC2-RC0.
Similarly, the 65-word by 16-bit stack can save addresses from the DRA bus, the
microprogram counter (MPC), or the Interrupt Return register, depending on the settings
of stack controls S2-S0 and related control inputs. Flexible instructions such as Branch
DRA else Branch to Stack else Continue can be coded to take advantage of the
conditional branching capability of the 'ACT8818.
Multiway branching (16- or 32-way) uses the B3-B0 inputs to set up a 16-way branch
address on DRA or DRB by concatenating B3-B0 with the upper 12 bits of the DRA
or DRB bus. The resulting branch addresses DRA' (DRA15-DRA4::B3-B0) and DRB'
(DRB15-DRB4::B3-B0) are selected by the Y output multiplexer controls MUX2-MUX0.
A Branch DRB' else Branch DRA' instruction can select up to 32 branch addresses,
as determined by the settings of B3-B0.
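
The concatenation that forms DRA' and DRB' can be written out as a bit operation: the upper 12 bits come from the bus and the lower 4 bits come from B3-B0. The helper below is a minimal sketch with a hypothetical name; it models only the address formation, not the MUX2-MUX0 selection or the condition logic.

```python
def multiway_address(bus: int, b: int) -> int:
    """Form a 16-way branch address: bus bits 15-4 concatenated with B3-B0."""
    return (bus & 0xFFF0) | (b & 0xF)


dra, drb = 0x1234, 0x5670

# Sixteen targets reachable through DRA' as B3-B0 steps from 0 to 15
dra_targets = [multiway_address(dra, b) for b in range(16)]

# A Branch DRB' else Branch DRA' instruction can reach 16 + 16 = 32 targets
all_targets = set(dra_targets) | {multiway_address(drb, b) for b in range(16)}
```

Because the four low bits are replaced, each bus value defines an aligned block of sixteen consecutive microprogram addresses.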

Design Support
TI's '8818 16-bit microsequencer is supported by a variety of tools developed to aid
in design evaluation and verification. These tools will streamline all stages of the design
process, from assessing the operation and performance of the '8818 to evaluating
a total system application. The tools include a functional model, behavioral model,
and microcode development software and hardware. Section 8 of this manual provides
specific information on the design tools supporting TI's SN74ACT8800 Family.

2-12

Systems Expertise
Texas Instruments VLSI Logic applications group is available to help designers analyze
TI's high-performance VLSI products, such as the '8818 16-bit microsequencer. The
group works directly with designers to provide ready answers to device-related
questions and also prepares a variety of applications documentation.
The group may be reached in Dallas, at (214) 997-3970.


2-13

Figure 1. 'ACT8818 GC Package
[Pin grid allocation diagram, top view]

2-14

Table 1. 'ACT8818 Pin Grid Allocation

PIN              PIN              PIN              PIN
NO.   NAME       NO.   NAME       NO.   NAME       NO.   NAME
A2    RC2        C2    RC0        F3    RBOE       J10   S1
A3    Y1         C3    GND        F9    B0         J11   STKWRN/RER
A4    Y3         C5    GND        F10   B1         K1    DRB0
A5    Y5         C6    Y7         F11   MUX2       K2    SELDR
A6    Y6         C7    Y10        G1    DRB6       K3    DRA14
A7    Y8         C9    GND        G2    DRB5       K4    DRA12
A8    Y11        C10   VCC        G3    GND        K5    DRA10
A9    Y13        C11   RE         G9    CLK        K6    DRA7
A10   NC         D1    DRB12      G10   MUX0       K7    DRA5
B1    DRB15      D2    DRB13      G11   MUX1       K8    DRA3
B2    RC1        D9    GND        H1    DRB4       K9    DRA0
B3    Y0         D10   COUT       H2    DRB3       K10   S0
B4    Y2         D11   INC        H10   CC         K11   S2
B5    Y4         E1    DRB9       H11   ZEROUT     L2    DRA15
B6    YOE        E2    DRB10      J1    DRB2       L3    DRA13
B7    Y9         E3    DRB11      J2    DRB1       L4    DRA11
B8    Y12        E9    INT        J3    VCC        L5    DRA9
B9    Y14        E10   B3         J5    GND        L6    DRA8
B10   Y15        E11   B2         J6    RAOE       L7    DRA6
B11   ZEROIN     F1    DRB7       J8    DRA1       L8    DRA4
C1    DRB14      F2    DRB8       J9    GND        L9    DRA2
                                                   L10   OSEL


Table 2. 'ACT8818 Pin Functional Description

PIN          GC    FN
NAME         NO.   NO.   I/O   DESCRIPTION
B0           F9    22    I     Input bits for branch addressing (see Table 3)
B1           F10   23
B2           E11   24
B3           E10   25
CLK          G9    18    I     System clock
COUT         D10   28    O     Incrementer carry-out. Goes high when an attempt is
                               made to increment microprogram counter beyond
                               addressable micromemory.
CC           H10   15    I     Condition code
DRA0         K9    9     I/O   Bidirectional DRA data port. Outputs data from
DRA1         J8    8           stack or register/counter A (RAOE = 0) or inputs
DRA2         L9    7           external data (RAOE = 1).
DRA3         K8    6
DRA4         L8    5
DRA5         K7    4
DRA6         L7    3
DRA7         K6    2
DRA8         L6    84
DRA9         L5    83
DRA10        K5    82
DRA11        L4    80
DRA12        K4    79
DRA13        L3    78
DRA14        K3    77
DRA15        L2    76
DRB0         K1    73    I/O   Bidirectional DRB data port. Outputs data from
DRB1         J2    72          register/counter B (RBOE = 0) or inputs external data
DRB2         J1    71          (RBOE = 1).
DRB3         H2    70
DRB4         H1    69
DRB5         G2    67
DRB6         G1    66
DRB7         F1    65
DRB8         F2    63
DRB10        E2    61
DRB11        E3    60
DRB12        D1    59
DRB13        D2    58
DRB14        C1    57
DRB15        B1    56
GND          C3    10          Ground pins. All pins must be used.
GND          C5    30
GND          C9    33
GND          D9    46
GND          G3    52
GND          J5    68
GND          J9    81
INC          D11   27    I     Incrementer control pin
INT          E9    26    I     Selects INT RT register to stack, active low (see
                               Table 3)
MUX0         G10   19    I     MUX control for Y output bus (see Table 4)
MUX1         G11   20
MUX2         F11   21
OSEL         L10   11    I     DRA output MUX select. Low selects RCA, high
                               selects stack.
RAOE         J6    1     I     DRA output enable, active low
RBOE         F3    64    I     DRB output enable, active low
RC0          C2    55    I     Controls for register/counters A and B
RC1          B2    54
RC2          A2    53
RE           C11   29    I     INT RT register enable, active low. A high input holds
                               INT RT register while a low input passes Y to INT RT
                               register (see Table 3).
S0           K10   12    I     Stack controls
S1           J10   13
S2           K11   14
SELDR        K2    75    I     Selects data source to DRA bus and DRB bus (see
                               Table 3)
STKWRN/RER   J11   16    O     Stack warning signal flag
VCC          C10   31          Supply voltage (5 V)
VCC          J3    74
Y0           B3    51    I/O   Bidirectional Y data port
Y1           A3    50
Y2           B4    49
Y3           A4    48
Y4           B5    47
Y5           A5    45
Y6           A6    44
Y7           C6    43
Y8           A7    41
Y9           B7    40
Y10          C7    39
Y11          A8    38
Y12          B8    37
Y13          A9    36
Y14          B9    35
Y15          B10   34
YOE          B6    42    I     Y output enable, active low
ZEROIN       B11   32    I     Forces internal zero detect high
ZEROUT       H11   17    O     Outputs register/counter zero detect signal

'ACT8818 Specification Tables
absolute maximum ratings over operating free-air temperature range (unless
otherwise noted)†
Supply voltage, VCC ...................................... -0.5 V to 6 V
Input clamp current, IIK (VI < 0 or VI > VCC) ............ ±20 mA
Output clamp current, IOK (VO < 0 or VO > VCC) ........... ±50 mA
Continuous output current, IO (VO = 0 to VCC) ............ ±50 mA
Continuous current through VCC or GND pins ............... ±100 mA
Operating free-air temperature range ..................... 0°C to 70°C
Storage temperature range ................................ -65°C to 150°C
†Stresses beyond those listed under "absolute maximum ratings" may cause permanent damage to the device.
These are stress ratings only and functional operation of the device at these or any other conditions beyond
those indicated under "recommended operating conditions" is not implied. Exposure to absolute maximum
rated conditions for extended periods may affect device reliability.

recommended operating conditions

        PARAMETER                            MIN    NOM    MAX    UNIT
VCC     Supply voltage                       4.5    5      5.5    V
VIH     High-level input voltage             2             VCC    V
VIL     Low-level input voltage              0             0.8    V
IOH     High-level output current                          -8     mA
IOL     Low-level output current                           8      mA
VI      Input voltage                        0             VCC    V
VO      Output voltage                       0             VCC    V
dt/dv   Input transition rise or fall rate   0             15     ns/V
TA      Operating free-air temperature       0             70     °C

2-21

electrical characteristics over recommended operating free-air temperature
range (unless otherwise noted)
TA = 25°C
PARAMETER

..

TEST CONDITIONS

VCC
4.5 V

IOH = -201lA
VOH
IOH = -8 rnA

CJ)

:2

IOl = 20llA

-..J

VOL

~

»
("')

IOl = 8 rnA

-I
CO
CO

...

CO

TYP

MAX

MIN

TYP

MAX

UNIT

4.48

5.5 V

5.46

4.5 V

4.15

5.5 V

4.97

V

3.76
4.76

4.5 V

0.014

5.5 V

0.014

4.5 V

0.15

0.45

5.5 V

0.13

0.45
±1

IlA

98

200

Il A

II

VI = Vee or 0

5.5 V

lee

VI = Vee or 0

5.5 V

ei

VI = Vee or 0

5V

~Ieet

One input at 3.4 V, other
inputs at 0 or Vee

MIN

5.5 V

V

pF

3
1

rnA

†This is the increase in supply current for each input that is at one of the specified TTL voltage levels rather
than 0 V or VCC.

2-22

maximum switching characteristics

PARAMETER

(INPUT)

CC
CLK

tpd

TO

FROM

(OUTPUT)

V

ZEROUT

ORB

STKWRN

24

16

25

27
30 t
23

ORB15-0RBO

22

MUX2-MUXO

22

RC2-RCO

26

S2-S0

25

B3-BO

19

OSEL

25

ZEROIN

25

SELOR

23

23 t

18
19
ns
20

INC

20

Y

ten

16
16

RAOE

18

YOE
tdis

ns

17

RBOE
RAOE

COUT

23

ORA15-0RAO

YOE

UNIT

ORA

14
13

RBOE

ns
14

†Decrementing register/counter A or B and sensing a zero.

setup and hold times
PARAMETER

FROM (INPUT)

TO (OUTPUT)

CC

Stack

15

Stack

9

DRA15-DRAO

DRB15-DRBO

RCA

6

INT RT

9

RCB
INT RT

7

MPC
Stack

7

Stack

15

OSEl
B3-BO
SElDR
ZEROIN

RCA, RCB

6

INT RT

16

Stack

13

INT RT

13

Stack

12

INT RT

13

Stack

8

INT RT

14

Stack

10

INT RT

10

Stack

14

INT RT

13

Y

MPC

6

RE

INT RT (ClK)

7

MUX2-MUXO

INT RT

12

Any

Any

Input

Destination

th

UNIT

7

INT

S2-S0

MAX

11

INC

RC2-RCO

tsu

MIN

ns

0

ns

clock requirements

PARAMETER                            MIN   MAX   UNIT
tw1   Pulse duration, clock low      7           ns
tw2   Pulse duration, clock high     9           ns
tc    Clock cycle time               33          ns
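
The minimum cycle time above implies a maximum clock frequency. The short sketch below works that arithmetic out and checks a proposed clock waveform against all three limits. The function names are illustrative only and do not come from the manual.

```python
# Minimum clock timings from the clock requirements table, in nanoseconds.
TW1_MIN_NS = 7    # pulse duration, clock low
TW2_MIN_NS = 9    # pulse duration, clock high
TC_MIN_NS = 33    # clock cycle time


def max_clock_mhz(tc_min_ns: float = TC_MIN_NS) -> float:
    """Highest clock frequency (MHz) allowed by the minimum cycle time."""
    return 1000.0 / tc_min_ns


def clock_ok(low_ns: float, high_ns: float) -> bool:
    """Check a proposed clock (low and high phase durations) against all three limits."""
    return (low_ns >= TW1_MIN_NS
            and high_ns >= TW2_MIN_NS
            and low_ns + high_ns >= TC_MIN_NS)


if __name__ == "__main__":
    print(f"max clock: {max_clock_mhz():.1f} MHz")
    print("symmetric 33-ns clock ok:", clock_ok(16.5, 16.5))
```

Note that meeting the two pulse-duration minimums alone (7 ns + 9 ns) is not sufficient; the 33-ns cycle time is the binding limit.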


Architecture
The 'ACT8818 microsequencer is designed with a 3-port architecture similar to the
bipolar SN74AS890 microsequencer. Figure 4 shows the architecture of the
'ACT8818. The device consists of the following principal functional groups:
1. A 16-bit microprogram counter (MPC) consisting of a register and
incrementer which generates the next sequential microprogram address
2. Two register/counters (RCA and RCB) for counting loops and iterations,
storing branch addresses, or driving external devices
3. A 65-word by 16-bit LIFO stack which allows subroutine calls and interrupts
at the microprogram level and is expandable and readable by external
hardware

5. B3-B0, whose contents can replace the four least significant bits of the
DRA and DRB buses to support 16-way and 32-way branches
6. An external input onto the bidirectional Y port to support external
interrupts.

Use of controls MUX2-MUX0 is explained further in the later section on
microprogramming the 'ACT8818.

Microprogram Counter.
Based on system status and the current instruction, the microsequencer outputs the
next execution address in the microprogram. Usually the incrementer adds one to the
address on the Y bus to compute next address plus one. Next address plus one is
stored in the microprogram register at the beginning of the subsequent instruction cycle.
During the next instruction, this 'continue' address will be ready at the Y output MUX
for possible selection as the source of the subsequent instruction. The incrementer
thus looks two addresses ahead of the address in the instruction register to set up
a continue (increment by one) or repeat (no increment) address.
Selecting INC from status is a convenient means of implementing instructions that
must repeat until some condition is satisfied; for example, Shift ALU Until MSB = 1,
or Decrement ALU Until Zero. The MPC is also the standard path to the stack. The
next address is pushed onto the stack during a subroutine call, so that the subroutine
will return to the instruction following that from which it was called.
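
The continue/repeat behavior described above can be sketched as a small model. This is an illustration only: the function names are hypothetical, and the 16-bit wrap-around at the top of the address space is an assumption rather than something the text states.

```python
ADDRESS_MASK = 0xFFFF  # 16-bit microprogram address (wrap-around is an assumption)


def next_mpc(y_address: int, inc: bool) -> int:
    """Value latched into the MPC register: Y plus one when INC is high,
    Y unchanged when INC is low."""
    return (y_address + (1 if inc else 0)) & ADDRESS_MASK


def trace_continue(start: int, inc_sequence):
    """Addresses issued on Y when the MPC is selected as the source every cycle."""
    y = start
    issued = [y]
    for inc in inc_sequence:
        y = next_mpc(y, inc)  # the MPC register value becomes the next Y
        issued.append(y)
    return issued


if __name__ == "__main__":
    # INC high gives a continue; INC low repeats the same address.
    print([hex(a) for a in trace_continue(0x10, [True, True, False, True])])
```

With INC low for one cycle, the same address is issued twice, which is the repeat case used by instructions such as Shift ALU Until MSB = 1.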

Register/Counters
Addresses or loop counts may be loaded directly into register/counters RCA and RCB
through the direct data ports DRA15-DRA0 and DRB15-DRB0. The values stored in
these registers may either be held, decremented, or read. Independent control of both
the registers during a single cycle is supported with the exception of a simultaneous
decrement of both registers.


Stack
The positive edge clocked 16-bit address stack allows multiple levels of nested calls
or interrupts and can be used to support branching and looping. Seven stack operations
are possible:
1. Reset, which pulls all Y outputs low and clears the stack pointer and read
pointer
2. Clear, which sets the stack pointer and read pointer to zero
3. Pop, which causes the stack pointer to be decremented
4. Push, which puts the contents of the MPC, interrupt return register, or
DRA bus onto the stack and increments the stack pointer
5. Read, which makes the address indicated by the read pointer available
at the DRA port
6. Hold, which causes the address of the stack and read pointers to remain
unchanged
7. Load stack pointer, which inputs the seven least significant bits of DRA
to the stack pointer.

Stack Pointer
The stack pointer (SP) operates as an up/down counter; it increments whenever a push
occurs and decrements whenever a pop occurs. Although push and pop are two event
operations (store then increment SP, or decrement SP then read), the 'ACT8818
performs both events within a single cycle.

Read Pointer
The read pointer (RP) is provided as a tool for debugging microcoded systems. It permits
a nondestructive, sequential read of the stack contents from the DRA port. This
capability provides the user with a method of backtracking through the address
sequence to determine the cause of overflow without affecting program flow, the status
of the stack pointer, or the internal data of the stack.

Stack Warning/Read Error Pin
A high signal on the STKWRN/RER pin indicates a potential stack overflow or underflow
condition. STKWRN/RER becomes active under two conditions. If 62 of the 65 stack
locations (0-64) are full (the stack pointer is at 62) and a push occurs, the STKWRN/RER
pin outputs a high signal to warn that the stack is approaching its capacity and will
be full after two more pushes.
The STKWRN/RER signal will remain high if hold, push or pop instructions occur, until
the stack pointer is decremented to 62. If a push instruction is attempted when the
stack is full, the new address will be ignored and the old address in stack location
64 will be retained.

The STKWRN/RER pin will go high when the stack pointer is less than or equal to one
and a pop or read from stack is coded on the S2-S0 pins. The pin will go high after
reading the next to the bottom stack address (1). When the S2-S0 pins are set to pop
or read the last address (0) or to pop or read an empty stack, the STKWRN/RER pin
will go high. The pin depends only on the setting of the S2-S0 pins and the stack pointer,
not on the clock.
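
The overflow and underflow conditions described in this section can be summarized in a minimal behavioral sketch. The class and method names are hypothetical, the stack pointer is modeled as the count of stored addresses, and the behavior of a pop on an empty stack (pointer stays at zero) is an assumption made only so the model is well defined.

```python
STACK_DEPTH = 65       # stack locations 0-64
WARN_THRESHOLD = 62    # a push with SP at 62 raises the overflow warning


class StackModel:
    """Behavioral sketch of the stack pointer and the STKWRN/RER conditions."""

    def __init__(self):
        self.cells = [0] * STACK_DEPTH
        self.sp = 0  # next free location

    def push(self, address: int) -> None:
        if self.sp >= STACK_DEPTH:
            return  # stack full: new address ignored, location 64 retained
        self.cells[self.sp] = address
        self.sp += 1

    def pop(self) -> None:
        if self.sp > 0:  # assumption: popping an empty stack leaves SP at zero
            self.sp -= 1

    def overflow_warning(self) -> bool:
        """High once SP has moved past 62; low again when SP is back at 62."""
        return self.sp > WARN_THRESHOLD

    def underflow_warning(self, coded_op: str) -> bool:
        """Depends only on the coded operation and SP, not on the clock."""
        return coded_op in ("pop", "read") and self.sp <= 1


if __name__ == "__main__":
    stack = StackModel()
    for address in range(63):
        stack.push(address)
    print("SP:", stack.sp, "warning:", stack.overflow_warning())
```

In this model the warning appears on the push that moves the pointer from 62 to 63, leaving room for exactly two more pushes, which matches the description above.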

Interrupt Return Register

Unlike the MPC register, which normally gets next address plus one, the interrupt return
register simply gets next address. This permits interrupts to be serviced with zero
latency, since the interrupt vector replaces the pending address.

The interrupting hardware disables the Y output and forces the vector onto the
microaddress bus. This event must be synchronized with the system clock. The first
address of the service routine must program INT low and perform a push to put the
contents of the interrupt return register on the stack.

Microprogramming the 'ACT8818
Microprogramming is unlike programming monolithic processors for several reasons.
First, the width of the microinstruction word is only partially constrained by the basic
signals required to control the sequencer. Since the main advantage of a
microprogrammed processor is speed, many operations are often supported by or
carried out in special purpose hardware. Lookup tables, extra registers, address
generators, elastic memories, and data acquisition circuits may also be controlled by
the microinstruction.
The number of slices in a bit-slice ALU is user-defined, which makes the microinstruction
width even more application dependent. Types of instructions resulting from
manipulation of the sequencer controls are discussed below. Examples of some
commonly used instructions can be found in the later section on sample microinstructions and
flow diagrams. The following abbreviations are used in the tables in this section:
BR A          Y ← DRA
BR A'         Y ← DRA'
BR B          Y ← DRB
BR B'         Y ← DRB'
BR S          Y ← STK
CALL A        Y ← DRA; STK ← MPC; SP ← SP + 1; RP ← RP + 1
CALL B        Y ← DRB; STK ← MPC; SP ← SP + 1; RP ← RP + 1
CALL A'       Y ← DRA'; STK ← MPC; SP ← SP + 1; RP ← RP + 1
CALL B'       Y ← DRB'; STK ← MPC; SP ← SP + 1; RP ← RP + 1
CALL S        Y ← STK; STK ← MPC; SP ← SP + 1; RP ← RP + 1
CLR SP, RP    SP ← 0; RP points to TOS register
CONT/RPT      Y ← MPC + 1 if INC = H; Y ← MPC if INC = L
DRA           Bidirectional data port (can be loaded externally or from RCA)
DRA'          DRA15-DRA4::B3-B0
DRB           Bidirectional data port (can be loaded externally or from RCB)
DRB'          DRB15-DRB4::B3-B0
MPC           Microprogram counter
POP           SP ← SP - 1; RP ← RP - 1
PUSH          STK ← operand; SP ← SP + 1; RP ← RP + 1
RCA           Register/counter A
RCB           Register/counter B
READ          DRA ← STK; RP ← RP - 1; SP ← SP - 1
RESET         Y ← 0; SP ← 0; RP points to TOS register
RP            Read pointer
SP            Stack pointer
STK           Stack

Address Selection
Y-output multiplexer controls MUX2-MUX0 select one of eight 3-source branches as
shown in Table 4. The states of CC and ZERO determine which of the three sources
is selected as the next address. ZERO is set at the beginning of any cycle in which
a register/counter will decrement to zero. This applies to both internal ZERO and external
ZEROUT signals.
Table 4. Output Controls (MUX2-MUX0)

                          Y OUTPUT SOURCE
MUX2-                CC = L               CC = H
MUX0     RESET   ZERO = L   ZERO = H
XXX      Yes     All Low    All Low     All Low
LLL      No      STK        MPC         DRA
LLH      No      STK        MPC         DRB
LHL      No      STK        DRA         MPC
LHH      No      STK        DRB         MPC
HLL      No      DRA        MPC         DRB
HLH      No      DRA'†      MPC         DRB'‡
HHL      No      DRA        STK         MPC
HHH      No      DRB        STK         MPC

†DRA15-DRA4::B3-B0
‡DRB15-DRB4::B3-B0

By programming CC high or low without decrementing registers, only one outcome
is possible; thus, unconditional branches or continues can be implemented by forcing
the condition code. Alternatively, CC can be selected from status, in which case Branch
A on Condition Code Else Branch B instructions are possible, where A and B are the
address sources determined by MUX2-MUXO.
Decrement and Branch on Nonzero instructions, creating loops that repeat until a
terminal count is reached, can be implemented by programming CC low and
decrementing a register/counter. If CC is selected from status and registers are
decremented, more complex instructions such as Exit on Condition Code or End of
Loop are possible.
When MUX2-MUX0 = HLH, the B3-B0 inputs can replace the four least significant
bits of DRA or DRB to create 16-way branches or, when CC is based on status, to
create 32-way branches.
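
As an illustration of the three-way selection, the sketch below encodes the Y output sources as a lookup keyed by MUX2-MUX0 and picks one source from CC and ZERO. The table contents are transcribed from Table 4 as cross-checked against the call and return encodings and the sample microinstructions, and should be treated as an assumption of this sketch; the function names are hypothetical.

```python
# Y output source for each MUX2-MUX0 code, ordered as
# (CC = L and ZERO = L, CC = L and ZERO = H, CC = H).
Y_SOURCE = {
    "LLL": ("STK", "MPC", "DRA"),
    "LLH": ("STK", "MPC", "DRB"),
    "LHL": ("STK", "DRA", "MPC"),
    "LHH": ("STK", "DRB", "MPC"),
    "HLL": ("DRA", "MPC", "DRB"),
    "HLH": ("DRA'", "MPC", "DRB'"),
    "HHL": ("DRA", "STK", "MPC"),
    "HHH": ("DRB", "STK", "MPC"),
}


def select_y_source(mux: str, cc: bool, zero: bool) -> str:
    """Return the name of the source driven onto Y for the given controls."""
    cc_low_zero_low, cc_low_zero_high, cc_high = Y_SOURCE[mux]
    if cc:
        return cc_high
    return cc_low_zero_high if zero else cc_low_zero_low


def sixteen_way_address(bus_value: int, b3_b0: int) -> int:
    """Form DRA' or DRB': upper 12 bits from the bus, lower 4 bits from B3-B0."""
    return (bus_value & 0xFFF0) | (b3_b0 & 0xF)


if __name__ == "__main__":
    print(select_y_source("LLL", cc=True, zero=False))   # unconditional branch to DRA
    print(hex(sixteen_way_address(0x0040, 0x7)))
```

Forcing CC high makes the ZERO input irrelevant, which is why an unconditional branch coded with CC high carries no restriction on register decrements.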

Stack Controls
As in the case of the MUX controls, each stack-control coding is a three-way choice
based on CC and ZERO (see Table 5). This allows push, pop, or hold stack operations
to occur in parallel with the aforementioned branches. A subroutine call is accomplished
by combining a branch and push, while returns result from coding a branch to stack
with a pop.

Table 5. Stack Controls (S2-S0)

                          STACK OPERATION
                     CC = L                     CC = H
S2-S0    OSEL    ZERO = L       ZERO = H
LLL      X       Reset/Clear    Reset/Clear    Reset/Clear
LLH      X       Clear SP/RP    Hold           Hold
LHL      X       Hold           Pop            Pop
LHH      X       Pop            Hold           Hold
HLL      X       Hold           Push           Push
HLH      X       Push           Hold           Hold
HHL      X       Push           Hold           Push
HHH      H       Read           Read           Read
HHH      L       Hold           Hold           Hold
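
The stack-control coding can be sketched the same way as the Y output selection: a lookup keyed by S2-S0 with one entry per CC/ZERO outcome, plus the OSEL qualifier for the HHH code. The entries below are transcribed from Table 5; the function name and the string labels are illustrative assumptions.

```python
# Stack operation for each S2-S0 code, ordered as
# (CC = L and ZERO = L, CC = L and ZERO = H, CC = H).
STACK_OPS = {
    "LLL": ("reset/clear", "reset/clear", "reset/clear"),
    "LLH": ("clear SP/RP", "hold", "hold"),
    "LHL": ("hold", "pop", "pop"),
    "LHH": ("pop", "hold", "hold"),
    "HLL": ("hold", "push", "push"),
    "HLH": ("push", "hold", "hold"),
    "HHL": ("push", "hold", "push"),
}


def stack_operation(s2_s0: str, cc: bool, zero: bool, osel: bool = False) -> str:
    """Return the stack operation selected by S2-S0, CC, ZERO, and OSEL."""
    if s2_s0 == "HHH":
        # HHH ignores CC and ZERO; OSEL chooses between read and hold.
        return "read" if osel else "hold"
    cc_low_zero_low, cc_low_zero_high, cc_high = STACK_OPS[s2_s0]
    if cc:
        return cc_high
    return cc_low_zero_high if zero else cc_low_zero_low


if __name__ == "__main__":
    # A return pairs a branch to stack with this pop (CC low, no decrement).
    print(stack_operation("LHH", cc=False, zero=False))
```

Because the branch source and the stack operation are decoded from the same CC and ZERO values, a call or return is simply the pairing of a Y source with a push or pop in the same column.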


A branch or jump to a given microaddress can also be coded several ways. RCA, DRA,
RCB, DRB, and STK are possible sources for branch addresses (see Table 4). Branches
to register or stack are useful whenever the branch address could be stored to reduce
overhead.
The simplest branches are to DRA and DRB, since they require only one cycle and
the branch address is supplied in the microinstruction. Use of registers or stack requires
an initial load cycle (which may be combined with a preceding instruction), but may
be more practical when an entry point is referenced over and over throughout the
microprogram, for example, in error-handling routines. Branches to stack or register
also enhance sequencing techniques in which a branch address is dynamically
computed or multiple branches to a common entry point are used, but the entry point
varies according to the system state. In this case, the state change might require
reloading the stack or register.
In order to force a branch to DRA or DRB, CC must be programmed high or low. A
branch to stack is only possible when CC is forced low (see Table 4).
When CC is low, the ZERO flag is tested, and if a register decrements to zero the
branch will be transformed into a Decrement and Branch on Nonzero instruction.
Therefore, registers should not be decremented during branch instructions using
CC = 0 unless it is certain the register will not reach terminal count. Call (Branch and
Push MPC) instructions and Return (Branch to Stack and Pop) instructions are discussed
in later sections.


Conditional Branch Instructions
Perhaps the most useful of all branches is the conditional branch. The 'ACT8818
permits three modes of conditional branching: Branch on Condition Code; Branch
16-Way from DRA or DRB; and Branch on Condition Code 16-Way from DRA Else
Branch 16-Way from DRB. This increases the versatility of the system and the speed
of processing status tests because both single-bit and 4-bit status are allowed.
Testing single bit status is preferred when the status can be set up and selected through
a status MUX prior to the conditional branch. Four-bit status allows the 'ACT8818
to process instructions based on Boolean status expressions, such as Branch if Overflow
and Not Carry if Zero or if Negative. It also permits true n-way branches, such as If
Negative then Branch to X, Else if Overflow, and Not Carry then Branch to Y. The
tradeoff is speed versus program size. Since multiway branching occurs relatively
infrequently in most programs, users will enjoy increased speed at a negligible cost.
Call (Branch and Push MPC) instructions and Return (Branch to Stack and Pop)
instructions are discussed in later sections.

Loop Instructions
Up to two levels of nested loops are possible when both counters are used
simultaneously. Loop count and levels of nesting can be increased by adding external
counters if desired. The simplest and most widely used of the loop instructions is
Decrement and Branch on Nonzero, in which CC is forced low while a register is
decremented. As before, many forms are possible, since the top-of-loop address can
originate from RCA, DRA, RCB, DRB, or the stack (see Table 4). Upon terminal count,
instruction flow can either drop out of the bottom of the loop or branch elsewhere.
When loops are used in conjunction with CC as status, B3-BO as status and/or stack
manipulation, many useful instructions are possible, including Decrement and Branch
on Nonzero else Return, Decrement and Call on Nonzero, and Decrement and Branch
16-Way on Nonzero. Possible variations are summarized in Table 7. Call (Branch and
Push MPC) instructions and Return (Branch to Stack and Pop) instructions are discussed
in later sections.
Another level of complexity is possible if CC is selected from status while looping.
This type of loop will exit either because CC is true or because a terminal count has
been reached. This makes it possible, for example, to search the ALU for a bit string.
If the string is found, the match forces CC high. However, if no match is found, it
is necessary to terminate the process when the entire word has been scanned. This
complex process can then be implemented in a simple compact loop using Conditional
Decrement and Branch on Nonzero.
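
The two exit conditions of such a loop can be illustrated in software. The sketch below scans a word for a bit string and leaves the loop either when a match sets the condition code or when the counter reaches its terminal count. It models the control flow only; the function name, the MSB-first scan order, and the 32-bit word width are assumptions for the example.

```python
def find_pattern(word: int, pattern: int, pattern_bits: int, word_bits: int = 32):
    """Scan `word` from the MSB end for `pattern`.

    Returns the shift count of the first match, or None when the loop
    counter reaches its terminal count without a match.
    """
    mask = (1 << pattern_bits) - 1
    counter = word_bits - pattern_bits + 1  # loop count loaded into a register/counter
    shift = 0
    while True:
        window = (word >> (word_bits - pattern_bits - shift)) & mask
        cc = window == pattern              # a match forces CC high
        if cc:
            return shift                    # exit because CC is true
        counter -= 1                        # register decrements on each pass
        if counter == 0:
            return None                     # exit because terminal count was reached
        shift += 1


if __name__ == "__main__":
    print(find_pattern(0x0000000B, 0xB, 4))
    print(find_pattern(0x00000000, 0xF, 4))
```

Either exit leaves the loop through the same instruction, which is what keeps the microcode for this kind of search compact.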



S2-S0

OSEl

CC - H
BR A
CALL A

(')

HLL

HLH

X

CALL A

CONT/RPT

~

HLL

HHL

X

CALL A

CONT/RPT

CALL B

HLH

HLH

X

CALL A' (16-way)

CONT/RPT

BR B' (16-way)

HLH

HHL

X

CALL A' (16-way)

CONT/RPT

CALL B' (16-way)

HHL

HLH

X

CALL A

BR S

CONT/RPT

HHL

HHL

X

CALL A

BR S

CONT/RPT: PUSH

HHH

HLH

X

CALL B

BR S

CONT/RPT

HHH

HHL

X

CALL B

BR S

CONT/RPT: PUSH


Subroutine Returns
A return from subroutine can be implemented by coding a branch to stack with a pop.
Since pop is also conditional on CC and ZERO, the complex forms discussed previously
also apply to return instructions: Decrement and Return on Nonzero; Return on
Condition Code; Branch on Condition Code Else Return. Return encodings are
summarized in Tables 10 and 11.
Table 10. Return Encodings without Register Decrements

MUX2-MUX0   S2-S0   OSEL   CC = L   CC = H
LLL         LHH     X      RET      BR A
LLH         LHH     X      RET      BR B
LHL         LHH     X      RET      CONT/RPT
LHH         LHH     X      RET      CONT/RPT

Table 11. Return Encodings with Register Decrements

                                   CC = L
MUX2-MUX0   S2-S0   OSEL   ZERO = L   ZERO = H    CC = H
LLL         LHH     X      RET        CONT/RPT    BR A
LLH         LHH     X      RET        CONT/RPT    BR B
LHL         LHH     X      RET        BR A        CONT/RPT
LHH         LHH     X      RET        BR B        CONT/RPT
HHL         LHL     X      BR A       RET         CONT/RPT: POP
HHH         LHL     X      BR B       RET         CONT/RPT: POP

Reset
Pulling the S2-S0 pins low clears the stack and read pointers, and zeroes the Y output
multiplexer (See Table 5).

Clear Pointers

The stack and read pointers may be cleared without affecting the Y output multiplexer
by setting S2-S0 to LLH and forcing CC low (see Table 5).

Read Stack
Placing a high value on all of the stack inputs (S2-S0) and OSEL places the 'ACT8818
into the read mode. At each low-to-high clock transition, the address pointed to by
the read pointer is available at the DRA port and the read pointer is decremented. The
bottom of the stack is detected by monitoring the stack warning/read error pin
(STKWRN/RER). A high appears on the STKWRN/RER output when the stack contains
one word and a read instruction is applied to the S2-S0 pins. This signifies that the
last address has been read.
The stack pointer and stack contents are unaffected by the read operation. Under
normal push and pop operations, the read pointer is updated with the stack pointer
and contains identical information.

Interrupts
Real-time vectored interrupt routines are supported for those applications where polling
would impede system throughput. Any instruction, including pushes and pops, may
be interrupted. To process an interrupt, the following procedure should be followed:
1. Place the bidirectional Y bus into a high-impedance state by forcing YOE high.
2. Force the interrupt entry point vector onto the Y bus. INC should be high.
3. Push the current value in the Interrupt Return register on the stack as the
execution address to return to when interrupt handling is complete.
The first instruction of the interrupt routine must push the address stored in the interrupt
return register onto the stack so that proper return linkage is maintained. This is
accomplished by setting INT and B1 low and coding a push on the stack.

Sample Microinstructions for the 'ACT8818
Representative examples of instructions using the 'ACT8818 are given below. The
examples assume a one-level pipeline system, in which the address and contents of
the next instruction are being fetched while the current instruction is being executed,
and an ALU status register contains the status results of the previous instruction.

Since the incrementer looks two addresses ahead of the address in the instruction
register to set up some instructions such as continue or repeat, a set-up instruction
has been included with each example. This shows the required state of both INC and
CC. CC must be set up early because the status register on which Y-output selection
is typically based contains the results of the previous instruction.
Flow diagrams and suggested code for the sample microinstructions are also given
below. Numbers inside the circles are microword address locations expressed as
hexadecimal numbers. Fields in microinstructions are binary numbers except for inputs
on ORA or ORB, which are also in hexadecimal. For a discussion of sequencing
instructions, see the preceding section on microprogramming.

Continue
To Continue (Instruction 10), INC and CC must be programmed high one cycle ahead
of Instruction 10 for pipelining.

Address   Instruction    MUX2-MUX0  S2-S0  R2-R0  OSEL  CC  INC  DRA   DRB
(Set-up)                 XXX        XXX    XXX    X     1   1    XXXX  XXXX
10        Continue       110        111    XXX    0     X   X    XXXX  XXXX

Continue and Pop
To Continue and decrement the stack pointer (Pop), INC and CC are forced high in
the previous instruction.

Address   Instruction    MUX2-MUX0  S2-S0  R2-R0  OSEL  CC  INC  DRA   DRB
(Set-up)                 XXX        XXX    XXX    X     1   1    XXXX  XXXX
10        Continue/Pop   110        010    XXX    X     X   X    XXXX  XXXX

Continue and Push
To Continue and push the microprogram counter onto the stack (Push), INC and CC
are forced high one cycle ahead of Instruction 10 for pipelining.

Address   Instruction    MUX2-MUX0  S2-S0  R2-R0  OSEL  CC  INC  DRA   DRB
(Set-up)                 XXX        XXX    XXX    X     1   1    XXXX  XXXX
10        Continue/Push  110        100    XXX    0     X   X    XXXX  XXXX

Figure 5. Continue

Figure 6. Continue and Pop

Figure 7. Continue and Push

Branch (Example 1)
To Branch from address 10 to address 20, CC must be programmed high one cycle
ahead of Instruction 10 for pipelining.

Address   Instruction    MUX2-MUX0  S2-S0  R2-R0  OSEL  CC  INC  DRA   DRB
(Set-up)                 XXX        XXX    XXX    X     1   X    XXXX  XXXX
10        BR A           000        111    XXX    0     X   X    0020  XXXX

Branch (Example 2)

To Branch from address 10 to address 20, CC is programmed low in the previous
instruction; as a result, a ZERO test follows the condition code test in Instruction 10.
To ensure that a ZERO = H condition will not occur, registers should not be
decremented during this instruction.

Address   Instruction    MUX2-MUX0  S2-S0  R2-R0  OSEL  CC  INC  DRA   DRB
(Set-up)                 XXX        XXX    XXX    X     0   X    XXXX  XXXX
10        BR A           110        111    000    0     X   X    0020  XXXX

Sixteen-Way Branch
To Branch 16-Way, CC is programmed high in the previous instruction. The branch
address is derived from the concatenation DRB15-DRB4::B3-B0.

Address   Instruction    MUX2-MUX0  S2-S0  R2-R0  OSEL  CC  INC  DRA   DRB
(Set-up)                 XXX        XXX    XXX    X     1   X    XXXX  XXXX
10        BR B'          101        111    XXX    0     X   X    XXXX  0040

Figure 8. Branch Example 1

Figure 9. Branch Example 2

Figure 10. Sixteen-Way Branch

Conditional Branch
To Branch to address 20 Else Continue to address 11, INC is set high in the preceding
instruction to set up the Continue.
Address
(Set-up)
10

en
2

"l>
~

(")

-I
00
00

Instruction

MUX2-MUXO 52-SO R2-RO OSEL CC' INC
XXX
110

BR A else
Continue

xxx
111

XXX
000

X

x

o

X

X

ORA

ORB

XXXX XXXX
0020 XXXX

Three-Way Branch
To Branch 3-Way, this example uses an instruction from Table 7 with BR A in the
ZERO = L column, CONT/RPT in the ZERO = H column and BR B in the CC = H
column. To enable the ZERO = H path, register A must decrement to zero during this
instruction (see Table 6 for possible register operations). INC is programmed high in
Instruction 10 to set up the Continue.

.-.

00

Address
(Set-up)
10
11

Instruction

MUX2-MUXO 52-SO R2-RO OSEL CC INC

Continue and
Load Reg A
Decrement Reg A;
Branch 3-Way

XXX

XXX

XXX

X

110

111

010

0

t

100

111

001

0

X

ORA

ORB

XXXX XXXX
XXXX XXXX
X

0020 0030

tSelected from external status

Thirty-Two-Way Branch
To Branch 32-Way, the four least significant bits of the ORA' and ORB' addresses
must be input at the B3-BO port; these are concatenated with the 12 most significant
bits of ORA and ORB to. provide new addresses ORA' (ORA 15-0RA4::B3-BO) and ORB'
(ORB15-0RB4::B3-BO).
Address

Instruction

(Set-up)
10

32-way Branch

2-44

MUX2-MUXO 52-SO R2-RO OSEL CC INC

XXX

XXX

101

111

XXX
000

X
0

X
X

X

ORA

ORB

XXXX XXXX
0040 0030

Figure 11. Conditional Branch

Figure 12. Three-Way Branch

Figure 13. Thirty-Two-Way Branch

Repeat
To Repeat (Instruction 10), INC must be programmed low and CC high one cycle ahead
of Instruction 10 for pipelining.

en

Address

Instruction

(Set-up)
10

Continue

MUX2-MUXO 52-SO R2-RO OSEL CC INC
XXX
110

XXX
111

XXX
XXX

a

X

a

X

X

ORA

ORB

XXXX XXXX
XXX X XXXX

Repeat on Stack

To Continue and push the microprogram counter onto the stack (Push), INC and CC
must be forced high one cycle ahead for pipelining.
To Repeat (Instruction 12), a BR S instruction with ZERO = L is used. To avoid a
ZERO = H condition, registers are not decremented during this instruction (see Table 6
for possible register operations). CC and INC are programmed high in Instruction 12
to set up the Continue in Instruction 11.

Address

Instruction

(Set-up)
10
11
12

Continue/Push
Continue
BR Stack

INC-O

MUX2-MUXO 52-SO R2-RO OSEL CC INC
XXX
110
110
010

XXX
100
111
111

~-MIIIf---t

XXX
XXX
XXX
000

X
X

a
a

1

a

CC-1

>-...;;L'---_ IMPOSSIBLE

Y-MPC

Figure 14. Repeat

2-46

1
X

ORA

ORB

XXXX
XXXX
XXXX
XXXX

XXXX
XXXX
XXXX
XXXX

'no register decrement

Figure 15. Repeat on Stack

2-47

Repeat Until CC = H
To Continue and push the microprogram counter onto the stack (Push), INC and CC
must be forced high one cycle ahead for pipelining.
To Repeat Until CC = H (Instruction 12), use a BR S instruction with CC = L and a
CONT/RPT: POP instruction with CC = H. To avoid a ZERO = H condition, registers
are not decremented (see Table 6 for possible register operations). CC and INC are
programmed high in Instruction 12 to set up the Continue in Instruction 11. A
consequence of this is that the instruction following 13 cannot be conditional.

Address

»
("")

(Set-up)
10
11
12

~

-t

00
00
~

00

Instruction
Continue/Push
Continue
BR Stack else
Continue

MUX2-MUXO 52-SO R2-RO OSEL CC INC
XXX
110
110

XXX
100
111

xxx
XXX
XXX

X
X
0

010

010

000

X

1
t

ORA

ORB

XXXX XXXX
XXXX XXXX
XXXX XXXX
XXXX XXXX

t Selected from external status

Loop Until Zero
To Continue and push the microprogram counter onto the stack (Push), INC and CC
are forced high one cycle ahead for pipelining. Register A is loaded with the loop counter
using a Load A instruction from Table 6.
To decrement the loop count, a decrement register A and hold register B instruction
from Table 6 is used. To Repeat Else Continue and Pop (decrement the stack pointer),
an instruction from Table 7 with BR S in the ZERO = L column and CONT/RPT: POP
in the ZERO = H column is used. CC is programmed low in Instruction 11 to
force the ZERO test in Instruction 12; it is programmed high in Instruction 12 to set
up the Continue in Instruction 11.
Address
(Set-up)
10
11
12

2-48

Instruction
Continue/Push
Continue/Load
Reg A
Decrement Reg A;
BR 5 else
Continue: Pop

MUX2-MUXO 52-SO R2-RO OSEL CC INC
XXX
110

XXX
100

XXX
XXX

X
0

110

111

010

0

000

010

001

ORA

ORB

XXXX XXXX
XXXX XXXX
0

XXXX XXXX

XXXX XXXX

00

.-

00
00

IU


~

CO

Address
(Set-up)
10

Instruction
Call A else
Continue

MUX2-MUXO 52-SO R2-RO OSEL CC' INC
XXX

XXX

XXX

X

t

110

101

000

X

X

ORA

ORB

XXXX XXXX
X

0020

XXXX

†Selected from external status

Two-Way Jump to Subroutine
To perform a Two-Way Call to Subroutine at address 20 or address 30, this example
uses an instruction from Table 8 with CALL A in the CC = L column and CALL B
in the CC = H column. In this example, CC is generated by external status during
the preceding (set-up) instruction. INC is programmed high in the preceding instruction
to set up the Push. To avoid a ZERO = H condition, registers should not be decremented
during Instruction 10.
Address
(Set-up)
23

Instruction
Call A else
Call B

t Selected from external status

2-52

MUX2-MUXO 52-SO R2-RO OSEL

CC

XXX

XXX

XXX

X

t

100

110

000

X

X

INC

ORA

ORB

XXXX XXXX
X

0020

0030

Figure 19. Jump to Subroutine

Figure 20. Conditional Jump to Subroutine

Figure 21. Two-Way Jump to Subroutine

Return from Subroutine
To Return from a subroutine, this example uses an instruction from Table 10 with RET
in the CC = L column. CC is programmed low in the previous instruction. To
avoid a ZERO = H condition, registers are not decremented during Instruction 23.

Address   Instruction    MUX2-MUX0  S2-S0  R2-R0  OSEL  CC  INC  DRA   DRB
(Set-up)                 XXX        XXX    XXX    X     0   X    XXXX  XXXX
23        Return         010        011    000    X     X   X    XXXX  XXXX

Conditional Return from Subroutine

To conditionally Return from a Subroutine, this example uses an instruction from
Table 10 with RET in the CC = L column and CONT/RPT in the CC = H column.
CC is selected from external status in the previous instruction. To avoid a ZERO = H
condition, registers are not decremented during Instruction 23.

Address
(Set-up)
23

Instruction

MUX2-MUXO S2-S0 R2-RO QSEL

Return else
Continue

CC

XXX

XXX

XXX

X

t

010

011

000

X

X

INC

ORA

ORB

XXX X XXX X
X

XXXX

XXXX

t Selected from external status

Clear Pointers
To Continue (Instruction 10), INC must be high; CC must be programmed high in the
previous instruction. To Clear the Stack and Read Pointers and Branch to address 20
(instruction 11), CC is programmed low in instruction 10 to set up the Branch. To avoid
a ZERO = H condition, registers are not decremented during Instruction 11.
Address
(Set-up)
10
11

Instruction
Continue
BR A and Clear
SP/RP

MUX2-MUXO S2-S0 R2-RO OSEL CC INC ORA
ORB
XXX
XXX
X
XXXX XXXX
XXX
110
111
XXX
X 0020 XXXX
0
0
110

001

000

X

X

X

XXXX XXXX

Reset
To Reset the 'ACT8818, pull the S2-S0 pins low. This clears the stack and read pointers
and places the Y bus into a low state.

Address   Instruction    MUX2-MUX0  S2-S0  R2-R0  OSEL  CC  INC  DRA   DRB
10        Reset          XXX        000    XXX    X     X   X    XXXX  XXXX

SN74ACT8832
CMOS 32-Bit Registered ALU

• 50-ns Cycle Time
• Low-Power EPIC™ CMOS
• Three-Port I/O Architecture
• 64-Word by 36-Bit Register File
• Simultaneous ALU and Register Operations
• Configurable as Quad 8-Bit or Dual 16-Bit Single
  Instruction, Multiple Data Machine
• Parity Generation/Checking

The SN74ACT8832 is a 32-bit registered ALU that can operate at 20 MHz and
20 MIPS (million instructions per second). Most instructions can be performed
in a single cycle. The 'ACT8832 was designed for applications that require
high-speed logical, arithmetic, and shift operations and bit/byte manipulations.
The 'ACT8832 can act as host CPU or can accelerate a host microprocessor.
In high-performance graphics systems, the 'ACT8832 generates display-list
memory addresses and controls the display buffer. In I/O controller applications,
the 'ACT8832 performs high-speed comparisons to initialize and end data
transfers.
A three-operand, 64-word by 36-bit register file allows the 'ACT8832 to create
an instruction and store the previous result in a single cycle.

EPIC is a trademark of Texas Instruments Incorporated.

Contents
                                                            Page
Introduction ............................................. 3-13
  Understanding Microprogrammed Architecture ............. 3-13
  'ACT8832 Registered ALU ................................ 3-13
  Support Tools .......................................... 3-14
  Design Support ......................................... 3-15
  Systems Expertise ...................................... 3-15
  'ACT8832 Pin Descriptions .............................. 3-16
  'ACT8832 Specification Tables .......................... 3-25

'ACT8832 Registered ALU .................................. 3-28
  Architecture ........................................... 3-28
    Data Flow ............................................ 3-29
    Architectural Elements ............................... 3-31
      Three-Port Register File ........................... 3-31
      R and S Multiplexers ............................... 3-32
      Data Input and Output Ports ........................ 3-34
      ALU ................................................ 3-34
      ALU and MQ Shifters ................................ 3-36
      Bidirectional Serial I/O Pins ...................... 3-36
      MQ Register ........................................ 3-37
      Conditional Shift Pin .............................. 3-37
      Master/Slave Comparator ............................ 3-37
      Divide/BCD Flip-Flops .............................. 3-37
      Status ............................................. 3-38
      Input Data Parity Check ............................ 3-38
      Test Pins .......................................... 3-38
  Instruction Set Overview ............................... 3-39
    Arithmetic/Logic Instructions with Shifts ............ 3-43
    Other Arithmetic Instructions ........................ 3-46
    Data Conversion Instructions ......................... 3-48
    Bit and Byte Instructions ............................ 3-49
    Other Instructions ................................... 3-49
    Configuration Options ................................ 3-50
      Masked 32-Bit Operation ............................ 3-50
      Shift Instructions ................................. 3-50
      Bit and Byte Instructions .......................... 3-51
      Status Selection ................................... 3-51

Contents (Continued)
                                                          Page
Instruction Set ........................................ 3-52
ABS .................................................... 3-53
ADD .................................................... 3-55
ADDI ................................................... 3-57
AND .................................................... 3-59
ANDNR .................................................. 3-61
BADD ................................................... 3-63
BAND ................................................... 3-65
BCDBIN ................................................. 3-67
BINCNS ................................................. 3-70
BINCS .................................................. 3-72
BINEX3 ................................................. 3-74
BOR .................................................... 3-76
BSUBR .................................................. 3-78
BSUBS .................................................. 3-80
BXOR ................................................... 3-82
CLR .................................................... 3-84
CRC .................................................... 3-85
DIVRF .................................................. 3-88
DNORM .................................................. 3-90
DUMPFF ................................................. 3-92
EX3BC .................................................. 3-94
EX3C ................................................... 3-96
INCNR .................................................. 3-99
INCNS .................................................. 3-101
INCR ................................................... 3-103
INCS ................................................... 3-105
LOADFF ................................................. 3-107
LOADMQ ................................................. 3-109
MQSLC .................................................. 3-111
MQSLL .................................................. 3-113
MQSRA .................................................. 3-115
MQSRL .................................................. 3-117
NAND ................................................... 3-119
NOP .................................................... 3-121
NOR .................................................... 3-123
OR ..................................................... 3-125
PASS ................................................... 3-127

Contents (Concluded)
                                                          Page
SDIVI .................................................. 3-129
SDIVIN ................................................. 3-131
SDIVIS ................................................. 3-133
SDIVIT ................................................. 3-135
SDIVO .................................................. 3-137
SDIVQF ................................................. 3-139
SEL .................................................... 3-141
SET0 ................................................... 3-143
SET1 ................................................... 3-145
SLA .................................................... 3-147
SLAD ................................................... 3-149
SLC .................................................... 3-151
SLCD ................................................... 3-153
SMTC ................................................... 3-155
SMULI .................................................. 3-157
SMULT .................................................. 3-159
SNORM .................................................. 3-161
SRA .................................................... 3-163
SRAD ................................................... 3-165
SRC .................................................... 3-167
SRCD ................................................... 3-169
SRL .................................................... 3-171
SRLD ................................................... 3-173
SUBI ................................................... 3-175
SUBR ................................................... 3-177
SUBS ................................................... 3-179
TB0 .................................................... 3-181
TB1 .................................................... 3-183
UDIVI .................................................. 3-185
UDIVIS ................................................. 3-187
UDIVIT ................................................. 3-189
UMULI .................................................. 3-191
XOR .................................................... 3-193


Figure 1. Microprogrammed System Block Diagram (signals shown include the microinstruction bus, status, and tested status)
The configuration of this processor enhances processing throughput in arithmetic
and radix conversion. Internal generation and testing of status results in fast processing
of division and multiplication algorithms. This decision logic is transparent to the user;
the reduced overhead assures shorter microprograms, reduced hardware complexity,
and shorter software development time.

Support Tools
Texas Instruments has designed a family of low-cost, real-time evaluation modules
(EVMs) to aid with initial hardware and microcode design. Each EVM is a small self-contained system which provides a convenient means to test and debug simple
microcode, allowing software and hardware evaluation of components and their
operation.
At present, the 74AS-EVM-8 Bit-Slice Evaluation Module has been completed, and
16- and 32-bit EVMs are in advanced stages of development. EVMs and support tools
for other devices in the 'ACT8800 family are also planned for future development.


Design Support
TI's '8832 32-bit registered ALU is supported by a variety of tools developed to aid
in design evaluation and verification. These tools will streamline all stages of the design
process, from assessing the operation and performance of the '8832 to evaluating
a total system application. The tools include a functional model, behavioral model,
and microcode development software and hardware. Section 8 of this manual provides
specific information on the design tools supporting TI's SN74ACT8800 Family.

Systems Expertise
The Texas Instruments VLSI Logic applications group is available to help designers analyze
TI's high-performance VLSI products, such as the '8832 32-bit registered ALU. The
group works directly with designers to provide ready answers to device-related
questions and also prepares a variety of applications documentation.
The group may be reached in Dallas, at (214) 997-3970.


'ACT8832 Pin Descriptions
Pin descriptions and grid allocations for the 'ACT8832 are given on the following pages.

Figure 2. SN74ACT8832 GB Package (top view; pin grid with rows A through T and columns 1 through 17)


Figure 3. SN74ACT8832 Logic Symbol


Table 1. SN74ACT8832 Pin Grid Allocation

PIN            PIN            PIN            PIN            PIN            PIN
NO.  NAME      NO.  NAME      NO.  NAME      NO.  NAME      NO.  NAME      NO.  NAME
A1   Y7        C2   Y5        E3   Y0        J15  DA28      P1   DA5       S1   DB10
A2   Y13       C3   OEY0      E4   Y4        J16  DA27      P2   DB8       S2   DB15
A3   Y15       C4   Y9        E14  Y30       J17  DA29      P3   DB12      S3   DA10
A4   BYOF1     C5   Y11       E15  TP0       K1   DB6       P4   DA9       S4   DA13
A5   SIO3      C6   Y14       E16  I2        K2   DB7       P5   DA15      S5   PERRA
A6   SIO2      C7   OEY1      E17  I3        K3   DA0       P6   A5        S6   A3
A7   IESIO1    C8   GND       F1   EB1       K4   GND       P7   A1        S7   WE0
A8   IESIO0    C9   VCC       F2   Cn        K14  GND       P8   VCC       S8   WE3
A9   SIO0      C10  C         F3   CLK       K15  DA24      P9   GND       S9   RFCLK
A10  N         C11  PERRY     F4   CF2       K16  DA25      P10  C4        S10  B4
A11  OES       C12  Y17       F14  OEY3      K17  DA26      P11  PERRB     S11  B2
A12  SSF       C13  Y22       F15  I1        L1   PB0       P12  GND       S12  C3
A13  Y18       C14  OEY2      F16  I4        L2   DA2       P13  DB22      S13  C0
A14  Y20       C15  Y28       F17  I6        L3   VCC       P14  DA16      S14  DB17
A15  Y23       C16  PY3       G1   DB0       L4   GND       P15  DA18      S15  DB20
A16  Y24       C17  BYOF3     G2   EA        L14  GND       P16  DA22      S16  DB23
A17  Y25       D1   CF1       G3   EB0       L15  VCC       P17  DB27      S17  DA21
B1   Y6        D2   Y1        G4   GND       L16  DB30      R1   PA0       T1   DB14
B2   BYOF0     D3   Y3        G14  GND       L17  PB3       R2   DB11      T2   DA8
B3   Y10       D4   PY0       G15  I5        M1   DA1       R3   PB1       T3   DA12
B4   Y12       D5   Y8        G16  I7        M2   DA4       R4   DA11      T4   DA14
B5   PY1       D6   GND       G17  PA3       M3   DA7       R5   PA1       T5   OEA
B6   IESIO3    D7   GND       H1   DB2       M4   GND       R6   A4        T6   A2
B7   IESIO2    D8   GND       H2   DB1       M14  PA2       R7   A0        T7   WE1
B8   SIO1      D9   VCC       H3   VCC       M15  DB26      R8   WE2       T8   SELRF1
B9   Z         D10  GND       H4   GND       M16  DB28      R9   VCC       T9   SELRF0
B10  OVR       D11  GND       H14  GND       M17  DB31      R10  B1        T10  B5
B11  MSERR     D12  GND       H15  VCC       N1   DA3       R11  C2        T11  B3
B12  Y16       D13  BYOF2     H16  DA31      N2   DA6       R12  OEB       T12  B0
B13  Y19       D14  Y27       H17  DA30      N3   DB9       R13  DB18      T13  C5
B14  Y21       D15  Y31       J1   DB3       N4   DB13      R14  DB21      T14  C1
B15  PY2       D16  TP1       J2   DB4       N14  DA19      R15  PB2       T15  DB16
B16  Y26       D17  I0        J3   DB5       N15  DA23      R16  DA20      T16  DB19
B17  Y29       E1   SELMQ     J4   VCC       N16  DB25      R17  DB24      T17  DA17
C1   Y2        E2   CF0       J14  VCC       N17  DB29

Table 2. SN74ACT8832 Pin Description

PIN NAME   PIN NO.   I/O   DESCRIPTION
A0         R7        I     Register file A port read address select
A1         P7
A2         T6
A3         S6
A4         R6
A5         P6
B0         T12       I     Register file B port read address select
B1         R10
B2         S11
B3         T11
B4         S10
B5         T10
BYOF0      B2
BYOF1      A4
BYOF2      D13
BYOF3      C17
C          C10       O     Status signal representing carry out condition
C0         S13       I     Register file write address select
C1         T14
C2         R11
C3         S12
C4         P10
C5         T13
CF0        E2        I     Configuration mode select: single 32-bit, two
CF1        D1              16-bit, or four 8-bit ALUs
CF2        F4
Cn         F2        I     ALU carry input
CLK        F3        I     Clocks synchronous registers on positive edge
DA0        K3        I/O
DA1        M1
DA2        L2
DA3        N1
DA4        M2
DA5        P1
DA6        N2
DA7        M3
DA8        T2
DA9        P4

Y0         E3        I/O   Y port data bus
Y1         D2
Y2         C1
Y3         D3
Y4         E4
Y5         C2
Y6         B1
Y7         A1
Y8         D5
Y9         C4
Y10        B3
Y11        C5
Y12        B4
Y13        A2
Y14        C6
Y15        A3
Y16        B12
Y17        C12
Y18        A13
Y19        B13
Y20        A14
Y21        B14
Y22        C13
Y23        A15
Y24        A16
Y25        A17
Y26        B16
Y27        D14
Y28        C15
Y29        B17
Y30        E14
Y31        D15
Z          B9        O     Output status signal represents zero condition

'ACT8832 Specification Tables
absolute maximum ratings over operating free-air temperature range
(unless otherwise noted)†
Supply voltage, VCC ................................... -0.5 V to 6 V
Input clamp current, IIK (VI < 0 or VI > VCC) ......... ±20 mA
Output clamp current, IOK (VO < 0 or VO > VCC) ........ ±50 mA
Continuous output current, IO (VO = 0 to VCC) ......... ±50 mA
Continuous current through VCC or GND pins ............ ±100 mA
Operating free-air temperature range .................. 0°C to 70°C
Storage temperature range ............................. -65°C to 150°C
† Stresses beyond those listed under "absolute maximum ratings" may cause permanent damage to the device.
These are stress ratings only and functional operation of the device at these or any other conditions beyond
those indicated under "recommended operating conditions" is not implied. Exposure to absolute-maximum-rated conditions for extended periods may affect device reliability.

Table 3. Recommended Operating Conditions

PARAMETER                                     MIN    NOM    MAX    UNIT
VCC    Supply voltage                         4.5    5.0    5.5    V
VIH    High-level input voltage               2             VCC    V
VIL    Low-level input voltage                0             0.8    V
IOH    High-level output current                            -8     mA
IOL    Low-level output current                             8      mA
VI     Input voltage                          0             VCC    V
VO     Output voltage                         0             VCC    V
dt/dv  Input transition rise or fall rate     0             15     ns/V
TA     Operating free-air temperature         0             70     °C

Figure 4. 'ACT8832 32-Bit Registered ALU (block diagram; labeled blocks include the 64 x 36 register file, ALU, ALU and MQ shifters, MQ register, divide/BCD flip-flops, parity generate and parity check, master/slave compare, and status)


SELRF1   SELRF0   SOURCE
0        0        External DA input
0        1        External DB input
1        0        Y-output MUX
1        1        External Y port

R and S Multiplexers
ALU inputs are selected by the R and S multiplexers. Controls which affect operand
selection for instructions other than those using constants or masks are shown in
Table 9.

Table 9. ALU Source Operand Selects

R-BUS OPERAND   S-BUS OPERAND
SELECT          SELECT          RESULT DESTINATION ← SOURCE OPERAND
EA              EB1-EB0
0                               R bus ← Register file addressed by A5-A0
1                               R bus ← DA port
                0 0             S bus ← Register file addressed by B5-B0
                1 0             S bus ← DB port
                X 1             S bus ← MQ register
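
The selections in Table 9 can be sketched in a few lines of Python. This is only an illustrative model, not TI code: the function name `select_operands`, its argument names, and the list used for the register file are assumptions made for the example.

```python
def select_operands(ea, eb1, eb0, regfile, a_addr, b_addr, da, db, mq):
    """Return the (R, S) operands chosen by EA and EB1-EB0 per Table 9."""
    # EA = 0 selects the register file word addressed by A5-A0, EA = 1 the DA port.
    r = da if ea else regfile[a_addr]
    if eb0:        # EB1-EB0 = X1 selects the MQ register
        s = mq
    elif eb1:      # EB1-EB0 = 10 selects the DB port
        s = db
    else:          # EB1-EB0 = 00 selects the register file word addressed by B5-B0
        s = regfile[b_addr]
    return r, s


# Hypothetical 64-word register file holding 100..163.
regs = list(range(100, 164))
print(select_operands(0, 0, 0, regs, 3, 5, da=1, db=2, mq=9))  # (103, 105)
```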

Table 10. Destination Operand Select/Enables

Control inputs: register file write enable (WE), Y bus output enable (OEY),
Y MUX select (MQSEL), register file select (RFSEL1-RFSEL0), DA port output
enable (OEA), DB port output enable (OEB)

RESULT DESTINATION ← SOURCE
Y/PY               ← ALU shifter/parity generate
Y/PY               ← MQ register/parity generate
Y/PY, RF           ← ALU shifter/parity generate
Y/PY, RF           ← MQ register/parity generate
RF                 ← External Y/PY
RF                 ← External DA/PA
RF                 ← External DB/PB
DA/PA              ← R bus register file output
DA/PA              ← Hi-Z
DB/PB              ← S bus register file output
DB/PB              ← Hi-Z


Data Input and Output Ports
The DA and DB ports can be used to load the S and/or R multiplexers from an external
source or to read S or R bus outputs from the register file. The Y port can be used
to load the register file and to output the next address selected by the Y output
multiplexer. Tables 9 and 10 describe the MUX and output controls which affect DA,
DB, and Y.

ALU
The ALU can perform seven arithmetic and six logical instructions on the two 32-bit
operands selected by the R and S multiplexers. It also supports multiplication, division,
normalization, bit and byte operations and data conversion, including excess-3 BCD
arithmetic. The 'ACT8832 instruction set is summarized in Table 15.

The 'ACT8832 can be configured to operate as a single 32-bit ALU, two 16-bit ALUs,
or four 8-bit ALUs (see Figures 6 and 7). It can also be configured to operate on a
32-bit word formed by adding leading zeros to the 12 least significant bits of R bus
data. This is useful in certain IBM relative addressing schemes.

Figure 6. 16-Bit Configuration (two 16-bit slices driving Y31-Y16 and Y15-Y0, with byte overflow outputs BYOF3 and BYOF1 and status outputs Z, C, OVR, N)

Figure 7. 8-Bit Configuration (four 8-bit slices driving Y31-Y24, Y23-Y16, Y15-Y8, and Y7-Y0, with serial I/O pins SIO3-SIO0, byte overflow outputs BYOF3-BYOF0, and status outputs Z, C, OVR, N)


Configuration modes are controlled by three CF inputs as shown in Table 11. These
signals also select the data from which status signals other than byte overflow will
be generated.

Table 11. Configuration Mode Selects

CONTROL INPUTS                      DATA FROM WHICH STATUS OTHER
CF2  CF1  CF0    MODE SELECTED      THAN BYOF WILL BE GENERATED
0    0    0      Four 8-bit         Byte 0
0    0    1      Four 8-bit         Byte 1
0    1    0      Four 8-bit         Byte 2
0    1    1      Four 8-bit         Byte 3
1    0    0      Two 16-bit         Least significant 16-bit word
1    0    1      Two 16-bit         Most significant 16-bit word
1    1    0      One 32-bit         32-bit word
1    1    1      Masked 32-bit      32-bit word
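
The decoding in Table 11 is small enough to express directly. The following Python sketch is illustrative only; the function name `decode_cf` and the returned strings are assumptions chosen for the example.

```python
def decode_cf(cf2, cf1, cf0):
    """Return (mode, status source) for the CF2-CF0 inputs, following Table 11."""
    code = (cf2 << 2) | (cf1 << 1) | cf0
    if code < 4:
        # Codes 0-3: four 8-bit ALUs, with the code picking the status byte.
        return "four 8-bit", f"byte {code}"
    if code == 4:
        return "two 16-bit", "least significant 16-bit word"
    if code == 5:
        return "two 16-bit", "most significant 16-bit word"
    if code == 6:
        return "one 32-bit", "32-bit word"
    return "masked 32-bit", "32-bit word"


print(decode_cf(0, 1, 0))  # ('four 8-bit', 'byte 2')
print(decode_cf(1, 1, 0))  # ('one 32-bit', '32-bit word')
```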


ALU and MQ Shifters
The ALU and MQ shifters are used in all of the shift, multiply, divide and normalize
functions. They can be used independently for single precision or concurrently for
double precision shifts. Shifts can be made conditional, using the Special Shift Function
(SSF) pin.

Bidirectional Serial I/O Pins
Four bidirectional SIO pins are provided to supply an end fill bit for certain shift
instructions. These pins may also be used to read bits that are shifted out of the ALU
or MQ shifters during certain instructions. Use of the SIO pins as inputs or outputs
is summarized in Table 17.
The four pins allow separate control of end fill inputs in configurations other than 32-bit
mode (see Table 12 and Figure 4).

Table 12. Data Determining SIO Input

          CORRESPONDING WORD, PARTIAL WORD OR BYTE
SIGNAL    32-BIT MODE    16-BIT MODE               8-BIT MODE
SIO3      -              -                         Byte 3
SIO2      -              most significant word     Byte 2
SIO1      -              -                         Byte 1
SIO0      32-bit word    least significant word    Byte 0


To increase system speed and reduce bus conflict, four SIO input enables
(IESIO3-IESIO0) are provided. A low on these enables will override internal pull-up
resistor logic and force the corresponding SIO pins to the high impedance state
required before an input signal can appear on the signal line. If the SIO enables are not
used, this condition is generated internally in the chip. Use of the enables allows internal
decoding to be bypassed, resulting in faster speeds.
The IESIO inputs default to a high because of internal pull-up resistors. When an
SIO pin is used as an output, a low on its corresponding IESIO pin would force
SIO to a high impedance state. The output would then be lost, but the internal
operation of the chip would not be affected.

MQ Register
Data from the MQ shifter is written into the MQ register when a low-to-high transition
occurs on clock CLK. The register has specific functions in double precision shifts,
multiplication, division and data conversion algorithms and can also be used as a
temporary storage register. Data from the register file and the DA and DB buses can
be passed to the MQ register through the ALU.
The Y bus contains the output of the ALU shifter if SELMQ is low and the output of
the MQ register if SELMQ is high. If OEY is low, ALU or MQ shifter output will
be passed to the Y port; if OEY is high, the Y port becomes an input to the
feedback MUX.

Conditional Shift Pin
Conditional shifting algorithms may be implemented using the SSF pin under hardware
or firmware control. If the SSF pin is high or floating, the shifted ALU output will be
sent to the output buffers. If the SSF pin is pulled low externally, the ALU result will
be passed directly to the output buffers, and MQ shifts will be inhibited. Conditional
shifting is useful for scaling inputs in data arrays or in signal processing algorithms.
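
The effect of the SSF pin can be pictured with a short Python sketch. It is a behavioral illustration under simplifying assumptions: `conditional_shift` and its parameters are hypothetical names, and the shift operations are passed in as ordinary functions rather than decoded from an instruction.

```python
def conditional_shift(alu_result, mq, ssf, alu_shift, mq_shift):
    """Model SSF: high (or floating) lets the shifts happen, low bypasses them."""
    if ssf:
        # SSF high or floating: shifted ALU output, MQ shift allowed.
        return alu_shift(alu_result), mq_shift(mq)
    # SSF low: ALU result passes unshifted and the MQ shift is inhibited.
    return alu_result, mq


def shift_right(x):
    return x >> 1


print(conditional_shift(0x10, 0x08, 1, shift_right, shift_right))  # (8, 4)
print(conditional_shift(0x10, 0x08, 0, shift_right, shift_right))  # (16, 8)
```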

Master/Slave Comparator
A master/slave comparator is provided to compare data bytes from the Y output MUX
with data bytes on the external Y port when OEY is high. If the data are
not equal, a high signal is generated on the master/slave error output pin (MSERR).
A similar comparator is provided for the Y parity bits.

Divide/BCD Flip-Flops
Internal multiply/divide flip-flops are used by certain multiply and divide instructions
to maintain status between instructions. Internal excess-3 BCD flip-flops preserve the
carry from each nibble in excess-3 BCD operations. The BCD flip-flops are affected
by all instructions except NOP and are cleared when a CLR instruction is executed.
The flip-flops can be loaded and read externally using instructions LOADFF and DUMPFF.


SLC

Logical right single precision shift

Circular left single precision shift

Load MQ register
Pass ALU to Y

Table 15. 'ACT8832 Instruction Set (Continued)
GROUP 3 INSTRUCTIONS

INSTRUCTION BITS
I7-I0 (HEX)         MNEMONIC    FUNCTION
08                  SET1        Set bit 1
18                  SET0        Set bit 0
28                  TB1         Test bit (one)
38                  TB0         Test bit (zero)
48                  ABS         Absolute value
58                  SMTC        Sign magnitude/two's complement
68                  ADDI        Add immediate
78                  SUBI        Subtract immediate
88                  BADD        Byte add R to S
98                  BSUBS       Byte subtract S from R
A8                  BSUBR       Byte subtract R from S
B8                  BINCS       Byte increment S
C8                  BINCNS      Byte increment negative S
D8                  BXOR        Byte XOR R and S
E8                  BAND        Byte AND R and S
F8                  BOR         Byte OR R and S


Table 15. 'ACT8832 Instruction Set (Continued)
GROUP 4 INSTRUCTIONS

INSTRUCTION BITS
I7-I0 (HEX)         MNEMONIC    FUNCTION
00                  CRC         Cyclic redundancy character accumulation
10                  SEL         Select S or R
20                  SNORM       Single length normalize
30                  DNORM       Double length normalize
40                  DIVRF       Divide remainder fix
50                  SDIVQF      Signed divide quotient fix
60                  SMULI       Signed multiply iterate
70                  SMULT       Signed multiply terminate
80                  SDIVIN      Signed divide initialize
90                  SDIVIS      Signed divide start
A0                  SDIVI       Signed divide iterate
B0                  UDIVIS      Unsigned divide start
C0                  UDIVI       Unsigned divide iterate
D0                  UMULI       Unsigned multiply iterate
E0                  SDIVIT      Signed divide terminate
F0                  UDIVIT      Unsigned divide terminate


Table 15. 'ACT8832 Instruction Set (Continued)
GROUP 5 INSTRUCTIONS

INSTRUCTION BITS
I7-I0 (HEX)         MNEMONIC    FUNCTION
0F                  LOADFF      Load divide/BCD flip-flops
1F                  CLR         Clear
2F                  CLR         Clear
3F                  CLR         Clear
4F                  CLR         Clear
5F                  DUMPFF      Output divide/BCD flip-flops
6F                  CLR         Clear
7F                  BCDBIN      BCD to binary
8F                  EX3BC       Excess-3 byte correction
9F                  EX3C        Excess-3 word correction
AF                  SDIVO       Signed divide overflow test
BF                  CLR         Clear
CF                  CLR         Clear
DF                  BINEX3      Binary to excess-3
EF                  CLR         Clear
FF                  NOP         No operation
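
For microcode tooling, the Group 3, 4, and 5 opcodes of Table 15 can be held in lookup tables. The Python sketch below is a convenience illustration only: the dictionary names and the `lookup` function are hypothetical, and opcodes outside these three groups simply return `None` because their encoding is not covered here.

```python
# Opcodes (I7-I0, hex) in each group step by 0x10, so the names are listed in order.
GROUP3 = dict(zip(range(0x08, 0x100, 0x10), [
    "SET1", "SET0", "TB1", "TB0", "ABS", "SMTC", "ADDI", "SUBI",
    "BADD", "BSUBS", "BSUBR", "BINCS", "BINCNS", "BXOR", "BAND", "BOR"]))
GROUP4 = dict(zip(range(0x00, 0x100, 0x10), [
    "CRC", "SEL", "SNORM", "DNORM", "DIVRF", "SDIVQF", "SMULI", "SMULT",
    "SDIVIN", "SDIVIS", "SDIVI", "UDIVIS", "UDIVI", "UMULI", "SDIVIT", "UDIVIT"]))
GROUP5 = dict(zip(range(0x0F, 0x100, 0x10), [
    "LOADFF", "CLR", "CLR", "CLR", "CLR", "DUMPFF", "CLR", "BCDBIN",
    "EX3BC", "EX3C", "SDIVO", "CLR", "CLR", "BINEX3", "CLR", "NOP"]))


def lookup(opcode):
    """Return (group number, mnemonic) for a Group 3-5 opcode, else None."""
    for group, table in ((3, GROUP3), (4, GROUP4), (5, GROUP5)):
        if opcode in table:
            return group, table[opcode]
    return None


print(lookup(0x68))  # (3, 'ADDI')
print(lookup(0xD0))  # (4, 'UMULI')
```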

Group 1, a set of ALU arithmetic and logic operations, can be combined with the user-selected shift operations in Group 2 in one instruction cycle. The other groups contain
instructions for bit and byte operations, division and multiplication, data conversion,
and other functions such as sorting, normalization and polynomial code accumulation.

Arithmetic/Logic Instructions with Shifts
The seven Group 1 arithmetic instructions operate on data from the R and/or S
multiplexers and the carry-in. Carry-out is evaluated after ALU operation; other status
pins are evaluated after the accompanying shift operation, when applicable. Group 1
logic instructions do not use carry-in; carry-out is forced to zero.
Possible shift instructions are listed in Group 2. Fourteen single and double precision
shifts can be specified, or the ALU result can be passed unshifted to the MQ register
or to the specified output destination by using the LOADMQ or PASS instructions.
Table 16 lists shift definitions.
When using the shift registers for double precision operations, the least significant
half should be placed in the MQ register and the most significant half in the ALU for
passage to the ALU shifter. An example of a double-precision shift using the ALU and
MQ shifters is given in Figure 8.
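
A double precision logical right shift across the two shifters might be modeled as in the Python sketch below. It assumes the most significant half sits in the ALU shifter and the least significant half in the MQ shifter, and that a high or floating SIO0 fills a zero at the top while a low SIO0 fills a one. The function name `double_logical_right` is hypothetical.

```python
def double_logical_right(alu, mq, sio0=1, n=32):
    """Shift the 2n-bit value alu:mq right by one bit, filling the top from SIO0."""
    mask = (1 << n) - 1
    fill = 0 if sio0 else 1  # SIO0 high or floating fills 0, low fills 1
    # The bit leaving the ALU half enters the top of the MQ half.
    new_mq = ((alu & 1) << (n - 1)) | ((mq & mask) >> 1)
    new_alu = (fill << (n - 1)) | ((alu & mask) >> 1)
    return new_alu, new_mq


hi, lo = double_logical_right(0x00000001, 0x80000000)
print(hex(hi), hex(lo))  # 0x0 0xc0000000
```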


Figure 8. Shift Examples, 32-Bit Configuration (panels: single precision logical right single shift; double precision logical right single shift; serial data input signal SIO0 in both)
All Group 2 shifts can be made conditional using the conditional shift pin (SSF). If the
SSF pin is high or floating, the shifted ALU output will be sent to the output buffers,
MQ register, or both. If the SSF pin is pulled low, the ALU result will be passed directly
to the output buffers and any MQ shifts will be inhibited.
Table 16. Shift Definitions

SHIFT TYPE          NOTES
Left                Moves a bit one position towards the most significant bit
Right               Moves a bit one position towards the least significant bit
Arithmetic right    Retains the sign unless an overflow occurs, in which case, the
                    sign would be inverted
Arithmetic left     May lose the sign bit if an overflow occurs. Zero is filled into
                    the least significant bit unless the bit is set externally
Circular right      Fills the least significant bit in the most significant bit position
Circular left       Fills the most significant bit in the least significant bit position
Logical right       Fills a zero in the most significant bit position unless the bit
                    is forced to one by placing a zero on an SIO pin
Logical left        Fills a zero in the least significant bit position unless the bit
                    is forced to one by placing a zero on an SIO pin

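
The single precision shift types in Table 16 can be illustrated on a 32-bit word as follows. This Python sketch is not a model of the device: the function names are hypothetical, inputs are assumed to be in the range 0 to 0xFFFFFFFF, and the overflow behavior noted for arithmetic shifts is deliberately left out.

```python
N = 32
MASK = (1 << N) - 1


def logical_right(x, sio=1):
    fill = 0 if sio else 1            # SIO high or floating fills a zero
    return (fill << (N - 1)) | (x >> 1)


def logical_left(x, sio=1):
    fill = 0 if sio else 1
    return ((x << 1) & MASK) | fill


def circular_right(x):
    return ((x & 1) << (N - 1)) | (x >> 1)   # LSB re-enters at the MSB


def circular_left(x):
    return ((x << 1) & MASK) | (x >> (N - 1))  # MSB re-enters at the LSB


def arithmetic_right(x):
    return (x & (1 << (N - 1))) | (x >> 1)     # sign bit is retained


print(hex(circular_left(0x80000001)))     # 0x3
print(hex(arithmetic_right(0x80000000)))  # 0xc0000000
```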

The bidirectional SIO pins can be used to supply external end fill bits for certain Group 2
shift instructions. When SIO is high or floating, a zero is filled; otherwise a one is filled.
Table 17 lists instructions that make use of the SIO inputs and identifies input and
output functions.
Table 17. Bidirectional SIO Pin Functions
INSTRUCTION
BITS 17-10

510
MNEMONIC

1/0

0*

SRA
SRAD

0
0

Shift out

1*
2*

SRL

I

Most significant bit

(HEX)

DATA

Shift out

3*

SRLD

I

Most significant bit

4*

SLA

I

Least significant bit

5*

SLAD

I

Least significant bit

6*

SLC
SLCD

8*

SRC

9*

SRCD

A*

MOSRA

0
0
0
0
0

Shifted input to MO shifter

7*

Most significant bit


Shifted input to MO shifter

«
~

Shifted input to ALU shifter

,....

Shifted input to ALU shifter

Z

Shift out

B*

MOSRL

I

C*

MOSLL

I

Least significant bit

D*

MOSLC

Shifted input to MO shifter
Least significant bit

00

CRC

0
0

20

SNORM

I

30

DNORM

I

Least significant bit

60

SMUll

ALUO

(/J

Internally generated end fill bit

70

SMULT

80

SDIVIN

90

SDIVIS

AO

SDIVI

BO

UDIVIS

CO

UDIVI

DO

UMULI

EO

SDIVT

FO

UDIVIT

0
0
0
0
0
0
0
0
0
0

7F

BCDBIN

I

Least significant bit

DF

BINEX3

0

Shifted input to MO register

ALUO
Internally generated end fill bit
Internally generated end fill bit
Internally generated end fill bit
Internally generated end fill bit
Internally generated end fill bit
Internal input
Internally generated end fill bit
Internally generated end fill bit


Other Arithmetic Instructions
The 'ACT8832 supports two immediate arithmetic operations. ADDI and SUBI
(Group 3) add a constant between 0 and 15 to, or subtract it from, an operand
on the S bus. The constant value is specified in bits A3-A0.
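
A minimal Python sketch of the immediate operations follows. It assumes a 32-bit configuration and ignores carry-in and status outputs; the function names `addi` and `subi` and the argument `a_field` (standing for the A5-A0 address inputs, of which only A3-A0 are used) are illustrative choices.

```python
MASK32 = 0xFFFFFFFF


def addi(s, a_field):
    """Add the 4-bit constant in A3-A0 to the S operand (32-bit wraparound)."""
    return (s + (a_field & 0xF)) & MASK32


def subi(s, a_field):
    """Subtract the 4-bit constant in A3-A0 from the S operand."""
    return (s - (a_field & 0xF)) & MASK32


print(addi(100, 0b001111))  # 115
print(subi(100, 0b000101))  # 95
```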
Twelve Group 4 instructions support serial division and multiplication. Signed, unsigned
and mixed multiplication are implemented using three instructions: SMULI, which
performs a signed times unsigned iteration; SMULT, which provides negative weighting
of the sign bit of a negative multiplier in signed multiplication; and UMULI, which
performs an unsigned multiplication iteration. Algorithms using these instructions are
given in Tables 18, 19, and 20. These include: signed multiplication, which performs
a two's complement multiplication; unsigned multiplication, which produces an
unsigned times unsigned product; and mixed multiplication, which multiplies a signed
multiplicand by an unsigned multiplier to produce a signed result.


Table 18. Signed Multiplication Algorithm

OP                  CLOCK     INPUT          INPUT           OUTPUT
CODE   MNEMONIC     CYCLES    S PORT         R PORT          Y PORT
E4     LOADMQ       1         Multiplier     -               Multiplier
60     SMULI        N-1†      Accumulator    Multiplicand    Partial product
70     SMULT        1         Accumulator    Multiplicand    Product (MSH)‡

Table 19. Unsigned Multiplication Algorithm

OP                  CLOCK     INPUT          INPUT           OUTPUT
CODE   MNEMONIC     CYCLES    S PORT         R PORT          Y PORT
E4     LOADMQ       1         Multiplier     -               Multiplier
D0     UMULI        N-1†      Accumulator    Multiplicand    Partial product
D0     UMULI        1         Accumulator    Multiplicand    Product (MSH)‡

Table 20. Mixed Multiplication Algorithm

OP                  CLOCK     INPUT          INPUT           OUTPUT
CODE   MNEMONIC     CYCLES    S PORT         R PORT          Y PORT
E4     LOADMQ       1         Multiplier     -               Multiplier
60     SMULI        N-1†      Accumulator    Multiplicand    Partial product
60     SMULI        1         Accumulator    Multiplicand    Product (MSH)‡

† N = 8 for quad 8-bit mode, 16 for dual 16-bit mode, 32 for 32-bit mode.
‡ The least significant half of the product is in the MQ register.


Instructions that support division include start, iterate and terminate instructions for
unsigned division routines (UDIVIS, UDIVI and UDIVIT); initialize, start, iterate and
terminate instructions for signed division routines (SDIVIN, SDIVIS, SDIVI and SDIVIT);
and correction instructions for these routines (DIVRF and SDIVQF). A Group 5
instruction, SDIVO, is available for optional overflow testing. Algorithms for signed
and unsigned division are given in Tables 21 and 22. These use a nonrestoring
technique to divide a 16N-bit integer dividend by an 8N-bit integer divisor to produce
an 8N-bit integer quotient and remainder, where N = 1 for quad 8-bit mode, N = 2
for dual 16-bit mode, and N = 4 for 32-bit mode.
Table 21. Signed Division Algorithm

OP                  CLOCK     INPUT             INPUT      OUTPUT
CODE   MNEMONIC     CYCLES    S PORT            R PORT     Y PORT
E4     LOADMQ       1         Dividend (LSH)    -          Dividend (LSH)
80     SDIVIN       1         Dividend (MSH)    Divisor    Remainder (N)
AF     SDIVO        1         Remainder (N)     Divisor    Overflow Test Result
90     SDIVIS       1         Remainder (N)     Divisor    Remainder (N)
A0     SDIVI        N-2†      Remainder (N)     Divisor    Remainder (N)
E0     SDIVIT       1         Remainder (N)     Divisor    Remainder§
40     DIVRF        1         Remainder‡        Divisor    Remainder¶
50     SDIVQF       1         MQ register       Divisor    Quotient#

† N = 8 for quad 8-bit mode, 16 for dual 16-bit mode, 32 for 32-bit mode.
‡ The least significant half of the product is in the MQ register.
§ Unfixed
¶ Fixed (corrected)
# The quotient is stored in the MQ register. Remainder can be output at the Y port or stored in
the register file accumulator.

Table 22. Unsigned Division Algorithm

OP CODE  MNEMONIC  CLOCK CYCLES  INPUT S PORT    INPUT R PORT  OUTPUT Y PORT
E4       LOADMQ    1             Dividend (LSH)  -             Dividend (LSH)
B0       UDIVIS    1             Dividend (MSH)  Divisor       Remainder (N)
C0       UDIVI     N-1†          Remainder (N)   Divisor       Remainder (N)
F0       UDIVIT    1             Remainder (N)   Divisor       Remainder‡
40       DIVRF     1             Remainder‡      Divisor       Remainder§

†N = 8 in quad 8-bit mode, 16 in dual 16-bit mode, 32 in 32-bit mode.
‡Unfixed
§Fixed (corrected)


Data Conversion Instructions
Conversion of binary data to one's and two's complement can be implemented using
the INCNR instruction (Group 1). SMTC (Group 3) permits conversion from two's
complement representation to sign magnitude representation, or vice versa. Two's
complement numbers can be converted to their positive value, using ABS (Group 3).
SNORM and DNORM (Group 4) provide for normalization of signed, single- and
double-precision data. The operand is placed in the MQ register and shifted toward the most
significant bit until the two most significant bits are of opposite value. Zeroes are shifted
into the least significant bit, provided SIO0 is high or floating. (A low on SIO0 will shift
a one into the least significant bit.) SNORM allows the number of shifts to be counted
and stored in one of the register files to provide the exponent.
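The normalization rule described above (shift toward the most significant bit until the two most significant bits differ, counting the shifts) can be sketched as follows. This is a software model under stated assumptions: the function name `normalize`, the loop that runs to completion, and the zero fill (SIO0 high) are illustrative choices. The device performs one shift per instruction cycle.

```python
def normalize(value, width=32):
    """Shift left until the two most significant bits differ.

    Hypothetical model of single-precision normalization with zero fill.
    Returns (normalized_value, shift_count); the count can serve as an exponent.
    """
    mask = (1 << width) - 1
    value &= mask
    shifts = 0
    # Stop after width - 1 shifts so that an all-zero operand terminates.
    while shifts < width - 1:
        msb = (value >> (width - 1)) & 1
        next_msb = (value >> (width - 2)) & 1
        if msb != next_msb:
            break
        value = (value << 1) & mask  # zero enters the least significant bit
        shifts += 1
    return value, shifts
```

With a 32-bit word, `normalize(0x00000001)` returns `(0x40000000, 30)`, and an operand that is already normalized, such as `0x40000000`, is returned unchanged with a shift count of 0.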
Data stored in binary-coded decimal form can be converted to binary using BCDBIN
(Group 5). A routine for this conversion, given in Table 23, allows the user to convert
an N-digit BCD number to a 4N-bit binary number in 4(N - 1) + 8 clock cycles.

Table 23. BCD to Binary Algorithm


OP

MNEMONIC

CODE

CLOCK

INPUT

INPUT

OUTPUT

CYCLES

SPORT

R PORT

DESTINATION

-

E4

LOADMQ

1

BCD operand

02

SUBR/MQSLC

1

Accumulator

Accumulator

Accumulator/MQ reg.

02

SUBR/MQSLC

1

Mask reg.

Mask reg.

Mask reg/MQ reg.

01

MQSLC

2

Don't care

Don't care

MQ reg.

68

ADDI (15)

1

Accumulator

Decimal 15

Mask reg.

Interim reg/MQ reg.

MQ reg.

REPEAT N-1 TIMES t
DA

AND/MQSLC

1

MQ reg.

Mask reg.

D1

ADD/MQSLC

1

Accumulator

Interim reg.

Interim reg/MQ reg.

7F

BCDBIN

1

Interim reg.

Interim res.

Accumulator/MQ reg.

7F

BCDBIN

1

Accumulator

Interim reg.

Accumulator/MQ reg.

1

MQ reg.

Mask reg.

Interim reg.

1

Accumulator

Interim reg.

Accumulator

END REPEAT
FA
D1

I

AND
ADD MQSLC

tN = Number of BCD digits

BINEX3, EX3BC, and EX3C assist binary to excess-3 conversion. Using BINEX3, an
N-bit binary number can be converted to an N/4-digit excess-3 number. For an
algorithm, see Table 24.

Table 24. Binary to Excess-3 Algorithm
OP
CODE
E4
02
02

MNEMONIC

CLOCK

INPUT

INPUT

OUTPUT

CYCLES

SPORT

R PORT

DESTINATION

-

LOADMQ

1

Binary number

SUBR

1

Accumulator

Accumulator

Accumulator

MQ reg.

SET1 (33116

1

Accumulator

Mask (33116

Accumulator

REPEAT N TIMES t
OF

BINEX3

1

Accumulator

Accumulator

Accumulator/MQ reg

9F

EX3C

1

Accumulator

Internal data

Accumulator

END REPEAT
†N = Number of bits in binary number

Bit and Byte Instructions
Four Group 3 instructions allow the user to test or set selected bits within a byte.
SET1 and SET0 force selected bits of a selected byte (or bytes) to one and zero,
respectively. TB1 and TB0 test selected bits of a selected byte (or bytes) for ones
and zeros. The bits to be set or tested are specified by an 8-bit mask formed by the
concatenation of register file address inputs C3-C0 and A3-A0. The register file
addressed by B5-B0 is used as the destination operand for the set bit instructions.
Register writes are inhibited for test bit instructions. Bytes to be operated on are
selected by forcing SIOn low, where n represents the byte position and 0 represents
the least significant byte. A high on the zero output pin signifies that the test data
matches the mask; a low on the zero output indicates that the test has failed.
Individual bytes of data can also be manipulated using eight Group 3 byte
arithmetic/logic instructions. Bytes can be added, subtracted, incremented, ORed,
ANDed, and exclusive ORed. As with the bit instructions, bytes are selected by forcing
SIOn low, but multiple bytes can be operated on only if they are adjacent to one another;
at least one byte must be nonselected.
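The set-bit and test-bit behavior can be modeled in a few lines. In this sketch the function names mirror the mnemonics, but the signatures are assumptions: `mask8` stands for the 8-bit mask formed from C3-C0 and A3-A0, and `selected_bytes` lists the byte positions whose SIO input would be driven low. The test functions return the value that the zero output is assumed to take.

```python
WORD_MASK = 0xFFFFFFFF

def byte_mask(mask8, selected_bytes):
    """Replicate an 8-bit mask into each selected byte (0 = least significant)."""
    full = 0
    for n in selected_bytes:
        full |= (mask8 & 0xFF) << (8 * n)
    return full

def set1(word, mask8, selected_bytes):
    """Force the masked bits of the selected bytes to one."""
    return (word | byte_mask(mask8, selected_bytes)) & WORD_MASK

def set0(word, mask8, selected_bytes):
    """Force the masked bits of the selected bytes to zero."""
    return word & ~byte_mask(mask8, selected_bytes) & WORD_MASK

def tb1(word, mask8, selected_bytes):
    """True when every masked bit in the selected bytes is one."""
    m = byte_mask(mask8, selected_bytes)
    return (word & m) == m

def tb0(word, mask8, selected_bytes):
    """True when every masked bit in the selected bytes is zero."""
    return (word & byte_mask(mask8, selected_bytes)) == 0
```

For instance, `set1(0x00000000, 0x33, [0, 1])` gives `0x00003333`, and `tb1(0x00003333, 0x33, [0, 1])` is then true. Bytes that are not selected are left untouched by the set functions and ignored by the test functions.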

Other Instructions
SEL (Group 4) selects one of the ALU's two operands, S or R, depending on the state
of the SSF pin. This instruction could be used in sort routines to select the larger or
smaller of two operands by performing a subtraction and sending the status result
to SSF. CRC (Group 4) is designed to verify serial binary data that has been transmitted
over a channel using a cyclic redundancy check code. An algorithm using this instruction
is given in Table 25.


Table 25. CRC Algorithm
OP
CODE

MNEMONIC

CLOCK

INPUT

INPUT

OUTPUT

R PORT

DESTINATION

Polynomial g(x)

Poly reg.

1

SPORT
Vector c'(x)t

F6

·INCR

1

-

F2

SUBR

1

Accumulator

Accumulator

Accumulator

Accumulator

E4

LOADMQ

CYCLES

-

MQ reg.

REPEAT n/BN TIMESt

00

CRC

1

Accumulator

Poly reg.

E4

LOADMQ

1

Vector c'(x) t

-

MQ reg.

END REPEAT


tN = Number of bits in binary number
n = Length of the code vector


CLR forces the ALU output to zero and clears the internal BCD flip-flops used in excess-3
BCD operations. NOP forces the ALU output to zero, but does not affect the flip-flops.

Configuration Options
The 'ACT8832 can be configured to operate in 8-bit, 16-bit, or 32-bit modes, depending
on the setting of the configuration mode selects (CF2-CF0). Table 11 shows the control
inputs for the four operating modes. Selecting an operating configuration other than
32-bit mode affects ALU operation and status generation in several ways, depending
on the mode selected.

Masked 32-Bit Operation
Masked 32-bit operation is selected to reset to zero the 20 most significant bits of
the R Mux input. The 12 least significant bits are unaffected by the mask. Only Group
1 and Group 2 instructions can be used in this operating configuration. Status
generation is similar to unmasked 32-bit operating mode.
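The effect of the mask on the R Mux input can be written as a single bitwise operation. This is a minimal sketch; the function name `mask_r_input` is an illustrative assumption.

```python
def mask_r_input(r_value):
    """Clear the 20 most significant bits of a 32-bit R input.

    Hypothetical helper: only the 12 least significant bits pass through.
    """
    return r_value & 0x00000FFF
```

As an example, an R input of `0xF6D81340` would reach the ALU as `0x00000340`.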

Shift Instructions
Shift instructions operate similarly in 8-bit, 16-bit, and 32-bit modes. The serial I/O
(SIO3-SIO0) pins are used to select end-fill bits or to shift bits in or out, depending
on the operation being performed. Table 12 shows the SIO signals associated with
each byte or word in the different modes, and Table 17 indicates the specific function
performed by the SIO pins during shift, multiply, and divide operations.
Figures 9 and 10 present examples of logical right shifts in 16-bit and 8-bit
configurations.

Figure 9. Shift Examples, 16-Bit Configuration (panels: Single-Precision Logical Right Single Shift, serial data input signals SIO0 and SIO2; Double-Precision Logical Right Single Shift, serial data input signal SIO0)

Bit and Byte Instructions
The 'ACT8832 performs bit operations similarly in 8-bit, 16-bit, and 32-bit modes.
Masks are loaded into the R MUX on the A3-A0 and C3-C0 address inputs, and the
bytes to be masked are selected by pulling their SIO inputs low. Instructions which
set, reset, or test bits are explained later.
Byte operations should be performed in 32-bit mode to get the necessary status
outputs. While byte overflow signals are provided for all four bytes (BYOF3-BYOF0),
the other status signals (C, N, Z) are output only for the word selected with the
configuration control signals (CF2-CF0).

Status Selection
Status results (C, N, Z, and overflow) are internally generated for all words in all modes,
but only the overflow results (BYOF3-BYOF0) are available for all four bytes in 8-bit
mode or for both words in 16-bit mode. If a specific application requires that the four
status results be read for two or four words, it is possible to toggle the configuration
control signals (CF2-CF0) within the same clock cycle and read the additional status
results. This assumes that the necessary external hardware is provided to toggle
CF2-CF0 and collect the status for the individual words before the next clock signal
is input.

Figure 10. Shift Examples, 8-Bit Configuration (panels: Single-Precision Logical Right Shift; Double-Precision Logical Right Shift; serial data input signals SIO0 through SIO3)

Instruction Set
The 'ACT8832 instruction set is presented in alphabetical order on the following pages.
The discussion of each instruction includes a functional description, list of possible
operands, data flow diagram, and notes on status and control bits affected by the
instruction. Microcoded examples are also shown.
Mnemonics and opcodes for instructions are given at the top of each page. Opcodes
for instructions in Groups 1 and 2 are four bits long and are combined into eight-bit
instructions which select combinations of arithmetic, logical, and shift operations.
Opcodes for the other instruction groups are all eight bits long.
An asterisk in the left side of the opcode box for a Group 1 instruction indicates that
a Group 2 opcode is needed to complete the instruction. An asterisk in the right side
of a box indicates that a Group 1 opcode is required to combine with the Group 2
opcode in the left side of the box.
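The combination of two four-bit opcodes into one eight-bit instruction can be sketched as a shift and an OR. The function name `combine_opcode` is an illustrative assumption; the placement of the Group 2 opcode in the upper nibble (I7-I4) and the Group 1 opcode in the lower nibble (I3-I0) follows the opcode-box convention described above.

```python
def combine_opcode(group2_nibble, group1_nibble):
    """Build an 8-bit instruction code from two 4-bit opcodes.

    Hypothetical helper: Group 2 opcode goes to I7-I4, Group 1 opcode to I3-I0.
    """
    for nibble in (group2_nibble, group1_nibble):
        if not 0 <= nibble <= 0xF:
            raise ValueError("each opcode must be a 4-bit value")
    return (group2_nibble << 4) | group1_nibble
```

For example, the ADD example later in this section uses the instruction code 1110 0001, which this sketch reproduces as `combine_opcode(0xE, 0x1) == 0xE1`.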


Absolute Value

ABS

I4 I8 I

FUNCTION
Computes the absolute value of two's complement data on the S bus.

DESCRIPTION
Two's complement data on the S bus is converted to its absolute value. The carry
must be set to one by the user for proper conversion. ABS causes S' + Cn to be
computed; the state of the sign bit determines whether S or S' + Cn will be selected
as the result. SSF is used to transmit the sign of S.
Available R Bus Source Operands

RF (A5-A0)  A3-A0 Immed  DA-Port  C3-C0::A3-A0 Mask
No          No           No       No

Available S Bus Source Operands

RF (B5-B0)  DB-Port  MQ Register
Yes         Yes      Yes

Available Destination Operands and Shift Operations

RF (C5-C0)  RF (B5-B0)  Y-Port  ALU Shift  MQ Shift
Yes         No          Yes     None       None

Control/Data Signals

Signal  User Programmable  Use
SSF     No                 Inactive
SIO0    No                 Inactive
SIO1    No                 Inactive
SIO2    No                 Inactive
SIO3    No                 Inactive
Cn      Yes                Should be programmed high for proper conversion.

Status Signals

ZERO = 1 if result = 0
N    = 1 if MSB (input) = 1
OVR  = 1 if input of most significant byte is 80 (Hex) and inputs (if any) in all
       other bytes are 00 (Hex)
C    = 1 if S = 0

EXAMPLES (assumes a 32-bit configuration)
Convert the two's complement number in register 1 to its positive value and store
the result in register 4.

Instr Code I7-I0              0100 1000
Oprd Addr A5-A0               XX XXXX
Oprd Addr B5-B0               00 0001
Oprd Sel EA                   X
Oprd Sel EB1-EB0              00
Dest Addr C5-C0               00 0100
Destination Selects SELMQ     0
                    WE3-WE0   0000
                    SELRF1-SELRF0  10
OEA                           X
OEB                           X
OEY3-OEY0                     XXXX
OES                           0
Cn                            1
CF2-CF0                       110

Example 1: Assume register file 1 holds F6D81340 (Hex):

Source       1111 0110 1101 1000 0001 0011 0100 0000   S ← RF(1)
Destination  0000 1001 0010 0111 1110 1100 1100 0000   RF(4) ← S' + Cn

Example 2: Assume register file 1 holds 09D527C0 (Hex):

Source       0000 1001 1101 0101 0010 0111 1100 0000   S ← RF(1)
Destination  0000 1001 1101 0101 0010 0111 1100 0000   RF(4) ← S
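The two ABS examples can be checked with a short software model. This sketch assumes a 32-bit configuration with Cn programmed high; the function name `abs32` and the status dictionary are illustrative, with the status conditions taken from the ABS status signal list.

```python
def abs32(s, width=32):
    """Model of ABS: select S or S' + 1 according to the sign bit of S.

    Hypothetical helper; returns (result, status) for a two's complement input.
    """
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    s &= mask
    negative = bool(s & sign)
    # S' + Cn with Cn = 1 is the two's complement negation of S.
    result = ((~s + 1) & mask) if negative else s
    status = {
        "ZERO": result == 0,
        "N": negative,        # MSB of the input
        "OVR": s == sign,     # input is 80 00 ... 00 (Hex)
        "C": s == 0,
    }
    return result, status
```

Calling `abs32(0xF6D81340)` returns a result of `0x0927ECC0`, and `abs32(0x09D527C0)` returns its input unchanged, in agreement with Examples 1 and 2.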

ADD

Add with Carry (R + S + Cn)

1

FUNCTION
Adds data on the R and S buses to the carry-in.

DESCRIPTION
Data on the R and S buses is added with carry. The sum appears at the ALU and MQ
shifters.
*The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the
upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions
are listed in Table 15.

Available R Bus Source Operands

N

C3-CO
RF
A3-AO
(A5-AO) Immed

DA-Port

M

..

CX)
CX)
~

A3-AO

()

Mask
Yes

No

Yes

«c:t

No

I"'-

Z

Available S Bus Source Operands

CJ)

MQ
RF
DB-Port
(B5-BO)
Register
Yes

Yes

Yes

Available Destination Operands
RF

RF

(C5-CO) (B5-BO)
Yes

No

Shift Operations

Y-Port

ALU

MQ

Yes

Yes

Yes

3-55

Add with Carry (R + S + Cn)

1

ADD

Control/Data Signals
User

Signal

Use

Programmable

SSF

No

Affect shift instructions programmed in bits 17-14 of

SIOO

No

Inactive

SI01

No

Inactive

SI02

No

Inactive

SI03

No

Inactive

Cn

Yes

Increments sum if set to one.

instruction field.

(f) Status Signals t

2

-.J

if result = 0

ZERO

~

l>

N

(")

OVR

-t

C

1 if MSB = 1
1 if signed arithmetic overflow
if carry-out

=

1

CO
CO tc is ALU carry out and is evaluated before shift operation. ZERO and N (negative) are evaluated
eN after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.
N
EXAMPLES (assumes a 32-bit configuration)
Add data in register 1 to data on the DB bus with. carry-in and pass the result to the
MQ register.
Instr

Oprd

Oprd

Code

Addr

Addr

17-10

A5-AO

B5·BO

1110 0001

00 0001

XX XXXX

Ioprd Sel
EB1-

Eli" EBO
a 10

Destination Selects

Dest
Addr

WE3- SELRF1-

C5·CO

SELMO

WEO

XX XXXX

a

1111

OEY3·

SELRFO OEA
10

CF2·

OEB

OEYO

OES

Cn

CFO

X

XXXX

a

a

110

X

Assume register file 1 holds 0802C618 (Hex) and DB bus holds 1E007530 (Hex):

Source       0000 1000 0000 0010 1100 0110 0001 1000   R ← RF(1)
Source       0001 1110 0000 0000 0111 0101 0011 0000   S ← DB bus
Destination  0010 0110 0000 0011 0011 1011 0100 1000   MQ register ← R + S + Cn
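The ADD example can be reproduced with a small model of a 32-bit add with carry. The function name `add_with_carry` and the status dictionary are assumptions for illustration; the model ignores any shift selected in the upper nibble, so ZERO and N here describe the unshifted sum.

```python
def add_with_carry(r, s, cn, width=32):
    """Compute R + S + Cn and the four status results for a width-bit word.

    Hypothetical helper; status reflects the ALU output before any shift.
    """
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    r &= mask
    s &= mask
    total = r + s + (1 if cn else 0)
    result = total & mask
    status = {
        "ZERO": result == 0,
        "N": bool(result & sign),
        # Signed overflow: operands share a sign that differs from the result's.
        "OVR": bool(~(r ^ s) & (r ^ result) & sign),
        "C": bool(total >> width),
    }
    return result, status
```

With R = `0x0802C618`, S = `0x1E007530`, and Cn = 0, the model returns `0x26033B48`, which matches the destination bit pattern shown in the example.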

ADDI

I6I8 I

ADD Immediate

FUNCTION
Adds four-bit immediate data on A3-AO with carry to S-bus data.

DESCRIPTION
Immediate data in the range 0 to 15, supplied by the user at A3-AO, is added with
carry to S.
Available R Bus Source Operands (Constant)
C3-CO
RF

A3-AO

..

DA-Port

(A5-AO) Immed

N
M
00
00

Mask
No

Yes

No

No

~

U

Available S Bus Source Operands



n

Logically AND the contents of register 3 and register 5 and store the result
in register 5.
.
Instr
Code
17-10

Op,d
Add,

Op,d
Add,

Op,d Sel

AS-AO

Bq-BO

EA EBO

11111010

000011

000101

EB1·
0

00

Dest
Add,

Destination Selects

WEj. SELRF1-

CS-CO

SELMQ

WED

000101

0

0000

SELRFO
10

CF2-

OEY3
OEA

X

0eB 0eY0 DeS
X

XXX X

0

Cn

CFO

X

110

~

~

Assume register file 3 holds F617D840 (Hex) and register file 5 holds 15F6D842 (Hex):

Source       1111 0110 0001 0111 1101 1000 0100 0000   R ← RF(3)
Source       0001 0101 1111 0110 1101 1000 0100 0010   S ← RF(5)
Destination  0001 0100 0001 0110 1101 1000 0100 0000   RF(5) ← R AND S


ANDNR

Logic AND Negative R (R' AND S)

*

I

E

FUNCTION
Computes the logical expression S AND NOT R.

DESCRIPTION
The logical expression S AND NOT R is computed. The result appears at the ALU and
MQ shifters.
*The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the
upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions
are listed in Table 15.
Available R Bus Source Operands
C3-CO
RF

A3-AO

DA-Port

(A5-AO) Immed
Yes

No

N

('I')

..
A3-AO

CO
CO

Mask

(.)

Yes

~

«q-

No

I"'"

Z

Available S Bus Source Operands
RF
(B5-BO)
Yes

DB-Port

MQ
Register

Yes

Yes

Available Destination Operands
RF

RF

(C5-CO) (B5-BO)
Yes

en

No

Shift Operations

Y-Port

ALU

MQ

Yes

Yes

Yes

Control/Data Signals
Signal

User

Use

Programmable

SSF

No

Affect shift instructions programmed in bits 17-14 of

5100

No

Inactive

instruction field.
5101

No

Inactive

5102

No

Inactive

5103

No

Inactive

Cn

No

Inactive

3-61

Status Signals†

ZERO = 1 if result = 0
N    = 0
OVR  = 0
C    = 0

†C is ALU carry out and is evaluated before shift operation. ZERO and N (negative) are evaluated
after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

EXAMPLE (assumes a 32-bit configuration)
Invert the contents of register 3, logically AND the result with data in register 5,
and store the result in register 10.

en
2

-..J

.J:Io

»
n
-I

Inst,
Code
17-10
11111110

Op,d
Add,
A5-AO
000011

Op,d
Add,
B5-BO
000101

Op,d Sel
Dest
Destination Selects
EB1Addr
SELRF1EAEBO
C5-CO
SELMQ WEb SELRFO CiEA OEB
10
X
0 00
001010
0
0000
X

wea-

omOEYO

DES

XXXX

0

CF2Cn CFO
X 110

Assume register file 3 holds 15F6D840 (Hex) and register file 5 holds F617D842 (Hex):

Source       0001 0101 1111 0110 1101 1000 0100 0000   R ← RF(3)
Source       1111 0110 0001 0111 1101 1000 0100 0010   S ← RF(5)
Destination  1110 0010 0000 0001 0000 0000 0000 0010   RF(10) ← R' AND S

BADD

Byte Add R to S with Carry

8

8

FUNCTION
Adds S with carry-in to a selected byte or selected adjacent bytes of R.

DESCRIPTION
SIO3-SIO0 are used to select bytes of R to be added to the corresponding bytes of
S. A byte of R with SIO programmed low is selected for the computation of
R + S + Cn. If the SIO signal for a byte of R is left high, the corresponding byte
of S is passed unaltered. Multiple bytes can be selected only if they are adjacent to
one another. At least one byte must be nonselected.
Available R Bus Source Operands
C3-CO
RF
A3-AO
(A5-AO) Immed

DA-Port

..
A3-AO
Mask

Yes

No

Yes

No

Available S Bus Source Operands
RF
(B5-BO)
Yes

DB-Port

MQ
Register

Yes

Yes

Available Destination Operands
RF

RF

(C5-CO) (B5-BO)
Yes

No

Shift Operations

Y-Port

ALU

MQ

Yes

None

None

Control/Data Signals
Signal

User

Use

Programmable
Inactive

SSF

No

5100

Yes

Byte select

5101

Yes

Byte select

5102

Yes

Byte select

5103

Yes

Byte select

Cn

Yes

Propagates through nonselected bytes; increments
selected byte(s) if programmed high.

3-63

18 18

BADD

Byte Add R to S with Carry

Status Signals
ZERO

N

1 if result (selected bytes) = 0

o
if signed arithmetic overflow (selected bytes)

OVR

if carry-out (most significant selected byte) = 1

C

EXAMPLE (assumes a 32-bit configuration)
Add bytes 1 and 2 of register 3 with carry to the contents of register 1 and store the
result in register 11.

en

z

-..J
~

l>

n

Instr

Oprd

Oprd

Oprd Sel

Dest

Code

Addr

Addr

EB1-

Addr

17-10

AS-AO

BS-BO

0100 1000 000011

000001

Eii EBO
0

00

Destination Selects

WE3-

SELRF1-

CS-CO

SELMa

WEo

SELRFO

001011

0

0000

10

om-

CF2-

0Eii Oeii OEYO 5Es
X

X

XXXX

0

Cn CFO
1

Si03- iESi03'SiOo IESiOO

110 1001

0000

Assume register file 3 holds 2C018181 (Hex) and register file 1 holds 7A8FBE3E (Hex):

Source       0010 1100 0000 0001 1000 0001 1000 0001   Rn ← RF(3)n
Source       0111 1010 1000 1111 1011 1110 0011 1110   Sn ← RF(1)n
ALU          1010 0110 1001 0001 0100 0000 1100 0000   Fn ← Rn + Sn + Cn
Destination  0111 1010 1001 0001 0100 0000 0011 1110   RF(11)n ← Fn or Sn†

†F = ALU result
n = nth byte
Register file 11 gets F if byte selected, S if byte not selected.
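The byte-select behavior of BADD can be sketched as an add over a field of adjacent bytes, with the remaining bytes of S passed through. The function name `byte_add` and the list-based byte selection are assumptions for illustration; the model also assumes that Cn enters the least significant selected byte and that carries ripple only within the selected field.

```python
def byte_add(r, s, cn, selected_bytes):
    """Add selected adjacent bytes of R and S with carry; pass S elsewhere.

    Hypothetical helper for a 32-bit word; byte 0 is the least significant.
    """
    chosen = sorted(selected_bytes)
    if not chosen or chosen != list(range(chosen[0], chosen[-1] + 1)):
        raise ValueError("selected bytes must be adjacent")
    if len(chosen) > 3 or chosen[0] < 0 or chosen[-1] > 3:
        raise ValueError("at least one of the four bytes must be nonselected")

    shift = 8 * chosen[0]
    field = (1 << (8 * len(chosen))) - 1          # mask for the selected field
    total = ((r >> shift) & field) + ((s >> shift) & field) + (1 if cn else 0)
    # Keep S outside the field, insert the truncated sum inside it.
    return (s & ~(field << shift) & 0xFFFFFFFF) | ((total & field) << shift)
```

With R = `0x2C018181`, S = `0x7A8FBE3E`, Cn = 1, and bytes 1 and 2 selected, the model produces `0x7A91403E`: bytes 1 and 2 carry the sum 9140 (Hex) seen in the ALU row, and bytes 0 and 3 are those of S.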

Byte AND RAND S (Byte Logical AND RAND S)

BAND

IEI8 I

FUNCTION
Evaluates the logical AND of selected bytes of R-bus and S-bus data.

DESCRIPTION
Bytes with their corresponding SIO signals programmed low compute R AND S. Bytes
with SIO signals programmed high pass S unaltered. Multiple bytes can be selected
only if they are adjacent to one another. At least one byte must be nonselected.
Available R Bus Source Operands

C3-CO
RF

A3-AO

DA-Port

(A5-AO) Immed
Yes

No

..
A3-AO

N

Mask

CO
CO

Yes

C")

No

I-

o

oCt

Available S Bus Source Operands

,...
¢

RF
MQ
DB-Port
(B5-BO)
Register
Yes

Yes

Yes

Available Destination Operands

RF
RF
(C5-CO) (B5-BO)
Yes

z

en

No

Shift Operations

Y-Port

ALU

MQ

Yes

None

None

Control/Data Signals

Signal

User

Use

Programmable

SSF

No

Forced low

SIOO

Yes

Byte select

SIOl

Yes

Byte select

SI02

Yes

Byte select

SI03

Yes

Byte select

Cn

No

Inactive

3-65

IEI8

Byte AND RAND S (Byte Logical AND RAND S)

BAND

Status Signals
ZERO
N
OVR

C

1 if result (selected bytes) =0

0
0
0

EXAMPLE (assumes a 32-bit configuration)
Logically AND bytes 1 and 2 of register 3 with input on the DB bus; store the result
in register 3.
,
Instr

Oprd

Oprd

Oprd Sel

best

C/)

Code

Addr

Addr

EB1-

Addr

Z

17-10

AS-AD

BS-BO

"-oJ

11101000 000011

XX XXXX

t\ EBO
0

10

Destination Selects

C5-CO

SELMa

WE3WEo

000011

0

0000

SELRF 110

SELRFO

0mX

X

~

»
~

CF2-

i5EA 0Eii 6EYo DEs

xxxx

Cn CFO

0

X

Si03- iESiOOSiOo IEsiOo

110 1001

0000

Assume register file 3 holds 398FBEBE (Hex) and input on the DB port is 4290BFBF
(Hex):

Source       0011 1001 1000 1111 1011 1110 1011 1110   Rn ← RF(3)n
Source       0100 0010 1001 0000 1011 1111 1011 1111   Sn ← DBn
Destination  0100 0010 1000 0000 1011 1110 1011 1111   RF(3)n ← Fn or Sn†

†F = ALU result
n = nth byte
Register file 3 gets F if byte selected, S if byte not selected.

BCDBIN

BCD to Binary

1

F

FUNCTION
Converts a BCD number to binary.

DESCRIPTION
This instruction allows the user to convert an N-digit BCD number to a 4N-bit binary
number in 4(N-1) plus 8 clocks. The instruction sums the R and S buses with carry.
A one-bit arithmetic left shift is performed on the ALU output. A zero is filled into bit 0
of the least significant byte unless SIO0 is set low, which would force bit 0 to one.
Bit 7 of the most significant byte is dropped.
Simultaneously, the contents of the MQ register are rotated one bit to the left. Bit
7 of the most significant byte is rotated to bit 0 of the least significant byte.

Recommended R Bus Source Operands

IU

C3-CO
A3-AO
RF
(A5-AO) Immed

DA-Port

..


(")
-t

N
OVR

C

00
00
Co\)

N

Should be programmed low for proper conversion.

1
1
1
1

if result = 0
if MSB = 1
if signed arithmetic overflow
if carry-out = 1

ALGORITHM
The following code converts an N-digit BCD number to a 4N-bit binary number in 4(N-1)
plus 8 clocks. This is one possible user-generated algorithm. It employs the standard
conversion formula for a BCD number (shown here for 32 bits):

ABCD = [(A x 10 + B) x 10 + C] x 10 + D

The conversion begins with the most significant BCD digit. Addition is performed in
radix 2.
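The conversion formula can be sketched in software using an add-and-shift step modeled on the BCDBIN description (sum two operands, then shift the sum left one bit). Two such steps multiply a value by 10. The names `add_and_shift` and `bcd_to_binary` are illustrative assumptions, carry-in is assumed to be zero, and plain shifts and masks stand in for the MQ rotation and mask register used on the device.

```python
def add_and_shift(r, s, width=32):
    """Sum R and S, then shift the sum left one bit (zero fill, MSB dropped)."""
    return ((r + s) << 1) & ((1 << width) - 1)

def bcd_to_binary(bcd, digits, width=32):
    """Convert a packed BCD number with the given digit count to binary.

    Hypothetical helper following ABCD = [(A x 10 + B) x 10 + C] x 10 + D,
    starting with the most significant digit.
    """
    acc = 0
    for i in reversed(range(digits)):
        digit = (bcd >> (4 * i)) & 0xF       # extract one BCD digit
        interim = acc + digit                # accumulate the digit
        if i == 0:
            return interim                   # last digit: no multiply by 10
        acc = add_and_shift(interim, interim, width)  # 4 x interim
        acc = add_and_shift(acc, interim, width)      # (4 + 1) x 2 = 10 x interim
    return acc
```

For example, `bcd_to_binary(0x1234, 4)` returns 1234. The two add-and-shift steps per digit are the reason each digit other than the last costs extra cycles in a routine of this kind.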

3-68

BCDBIN

BCD to Binary

I7IFI

PSEUDOCODE
LOADMO

NUM

Load MO with BCD number.

SUB

ACC, ACC, SLCMO

Clear accumulator;
Circular left shift MO.

SUB

MSK, MSK, SLCMO

Clear mask register;
Circular left shift MO.

SLCMO

Circular left shift MO.

SLCMO

Circular left shift MO.

ADD I

ACC, MSK, 15

Store 1 5 in mask register.

Repeat N-1 times:

N
M
CO
CO

(N '" number of BCD digits)
AND

ADD

MO, MSK, R1,
SLCMO
ACC, R1, R1, SLCMO

I(.)

Extract one digit;
Circular left shift MO.



Source

01000000100011111011111010111110

I Sn

+--

RF(7)n

ALU

01000000 1000 11111011111110111110

I Fn

+--

Sn

Destination

0100 00001000 1111 1011 1111 1011 1110

I RF(2)n

+--

+


~

..
A3-AO
Mask

Yes

No

Yes

No

("')

-t

00
00
Co\)

N

Available S Bus Source Operands
RF
MO
DB-Port
(B5-BO)
Register
Yes

Yes

Yes

Available Destination Operands
RF
RF
(C5-CO) (B5-BO)
Yes

·No

Shift Operations

Y-Port

ALU

MO

Yes

None

None

Control/Data Signals
Signal

User

Use

Programmable
Inactive

SSF

No

5100

Yes

Byte select

SiOi'

Yes

Byte select

Si02

Yes

Byte select

5103

Yes

Byte select

Cn

Yes

Propagates through nonselected bytes; should be
set high for two's complement subtraction.

3-78

IAI8 I

Byte Subtract R from S with Carry

BSUBR
Status Signals

1 if result (selected bytes) = 0

ZERO

o

N

if signed arithmetic overflow (selected bytes)

OVR

C

if carry-out (most significant selected byte)

EXAMPLE (assumes a 32-bit configuration)
Subtract bytes 1 and 2 of register 1 with carry from bytes 1 and 2 of register 3.
Concatenate with bytes 0 and 3 of register 3, storing the result in register 11.
Instr

Oprd

Oprd

Oprd 5.1

Oest

Code

Addr

Addr

EB1-

Addr

17-10

A5-AO

B5-SO

10101000

00 0001

000011

EAESO
0

00

Destination Selects

C5-CO

SELMa

WE3WEo

00 1011

0

0000

SELRF110

SELRFO OEA

X

om-

0Eii
X

CF2-

OEYO

DES

XXXX

0

Cn CFO
1

Si'53- i'Esi03Si50 iEsiOO N
M

110 1001

0000

Assume register file 1 holds 091B5858 (Hex) and register file 3 holds 703A9898 (Hex):

Source       0000 1001 0001 1011 0101 1000 0101 1000   Rn ← RF(1)n
Source       0111 0000 0011 1010 1001 1000 1001 1000   Sn ← RF(3)n
ALU          0110 0111 0001 1111 0100 0000 0100 0000   Fn ← R'n + Sn + Cn
Destination  0111 0000 0001 1111 0100 0000 1001 1000   RF(11)n ← Fn or Sn†

†F = ALU result
n = nth package
Register file 11 gets F if byte selected, S if byte not selected.

I9 I8

Byte Subtract S from R with Carry

BSUBS

FUNCTION
Subtracts S from R in selected bytes.

DESCRIPTION
Bytes with SIO inputs programmed low compute R + S' + Cn. Bytes with SIO inputs
programmed high. pass S unaltered. Multiple bytes can be selected only if they are
adjacent to one another. At least one byte must be nonselected.
Available R Bus Source Operands
C3-CO
RF

A3-AO

DA-Port

(A5-AOI Immed

Mask

(I)

2

..
A3-AO

Yes

No

Yes

No

-.J
~

l>

n

-f
CO
CO
Co\)

N

Available S Bus Source Operands
RF
(B5-BOI
Yes

DB-Port

MQ
Register

Yes

Yes

Available Destination Operands
RF

RF

(C5-COI (B5-BOI
Yes

No

Shift Operations

Y-Port

ALU

MQ

Yes

None

None

Control/Data Signals
Signal

User

Use

Programmable

SSF

No

Inactive

SIOO

Yes

Byte select

SiOT
Si02

Yes

Byte select

Yes

Byte select

5103

Yes

Byte select

Cn

Yes

Propagates through nonselected bytes; should be
set high for two's complement subtraction.

3-80

BSUBS

I9I8I

Byte Subtract S from R with Carry

Status Signals
ZERO

N

1 if result (selected bytes) = 0

o
if signed arithmetic overflow (selected bytes)

OVR

if carry-out (most significant selected byte)

C

EXAMPLE (assumes a 32-bit configuration)
Subtract bytes 1 and 2 of register 3 with carry from bytes 1 and 2 of register 1.
Concatenate with bytes 0 and 3 of register 3, storing the result in register 11.
Instr

Op,d

Op,d

Op,d S.I

Dest

Code

Add,

Add,

EB1-

Add,

17-10

A5-AO

B5-BO

1001 1000 00 0001

000011

EAEBO
0

00

Destination Selects

We3-

SELRF1-

C5-CO

SELMa

WEo

SELRFO

001011

0

0000

10

om-

CF2-

Si03- iESiOaiEsiOo

QEij 0Ev0 (ill; Cn CFO SiOO
X
X XXXX 0
1 110 1001

0eA

0000

C'II

(¥)

Assume register file 1 holds 5288B8B8 (Hex) and register file 3 holds 143A9898 (Hex):

Source       0101 0010 1000 1000 1011 1000 1011 1000   Rn ← RF(1)n
Source       0001 0100 0011 1010 1001 1000 1001 1000   Sn ← RF(3)n
ALU          0011 1110 0100 1110 0010 0000 0010 0000   Fn ← Rn + S'n + Cn
Destination  0101 0010 0100 1110 0010 0000 1011 1000   RF(11)n ← Fn or Sn†

†F = ALU result
n = nth byte
Register file 11 gets F if byte selected, S if byte not selected.

Byte XOR Rand S
(Byte Exclusive OR Rand S)

lola

BXOR

FUNCTION
Evaluates R exclusive OR S in selected bytes.

DESCRIPTION
Bytes with SIO inputs programmed low evaluate R exclusive OR S. Bytes with SIO
inputs programmed high, pass S unaltered. Multiple bytes can be selected only ifthey
are adjacent to one another. At least one byte must be nonselected.
Available R Bus Source Operands
C3-CO
RF

A3-AO

DA-Port

(A5-AO) Immed

en
z
.....

..
A3-AO
Mask

Yes

No

Yes

No

~

»
(")
-t

CO
CO
W
N

Available S Bus Source Operands
RF
MQ
DB-Port
(B5-BO)
Register
Yes

Yes

Yes

Available Destination Operands
RF
RF
(C5-CO) (B5-BO)
Yes

No

Shift Operations

Y-Port

ALU

MQ

Yes

None

None

Control/Data Signals
Signal

User

Use

Programmable

SSF

No

Inactive

5100

Yes

Byte select

5101

Yes

Byte select

5102

Yes

Byte select

5103

Yes

Byte select

Cn

No

Inactive

3-82

Byte XOR Rand S
(Byte Exclusive OR Rand S)

BXOR

I0 I8 I

Status Signals
ZERO

N
OVR

C

1 if result (selected bytes) = 0

o
o
o

EXAMPLE (assumes a 32-bit configuration)
Exclusive OR bytes 1 and 2 of register 6 with bytes 1 and 2 on the DB bus; concatenate
the result with DB bytes 0 and 3, storing the result in register 10.
Instr

Op,d

Op,d

Op,d Sel

Dest

Code

Add,

Add,

EB1-

Add,

WE3-

17-10

A5-AO

B5-BO

C5-CO

SELMO WEO SELRFO

1101 1000 000110

XX XXXX

EAEBO
0

10

001010

Destination Selects

0

SELRF1-

0000

10

om-

CF2-

0eA Oeii 0Ev0 OES
X

X

XXXX

0

Cn CFO
1

Si53"- iEsi'03SiOO iEsiOo

110 1001

0000

Assume register file 6 holds 938FBEBE (Hex) and the DB bus holds 4190BEBE (Hex):

Source       1001 0011 1000 1111 1011 1110 1011 1110   Rn ← RF(6)n
Source       0100 0001 1001 0000 1011 1110 1011 1110   Sn ← DBn
Destination  0100 0001 0001 1111 0000 0000 1011 1110   RF(10)n ← Fn or Sn†

†F = ALU result
n = nth package
Register file 10 gets F if byte selected, S if byte not selected.

I F It

1

CLEAR

FUNCTION
Forces ALU output to zero and clears the BCD flip-flops.

DESCRIPTION
ALU output is forced to zero and the BCD flip-flops are cleared.
†This instruction may also be coded with the following opcodes:
[2] [F], [3] [F], [4] [F], [6] [F], [B] [F], [C] [F], [E] [F]

Available R Bus Source Operands
C3-CO

RF

A3-AO

(AS-AO) Immed

DA-Port

CJ)

2

-.J

..
A3-AO
Mask

No

No

No

No

~

~
-I

(X)
(X)

eN

N

Available S Bus Source Operands

RF
(BS-BO)

DB-Port

No

No

MQ
Register
No

Available Destination Operands

RF

RF

(CS-CO) (85-80)
Yes

No

Status Signals

IZER~

OVR
Cn

3-84

1

o
o
o

Shift Operations

Y-Port

ALU

MQ

Yes

None

None

CLR

CRC

Cyclic Redundancy Character Accumulation

I0I0I

FUNCTION
Evaluates R exclusive OR S for use with cyclic redundancy check codes.

DESCRIPTION
Data on the R bus is exclusive ORed with data on the S bus. If MQ0 XNORed with
S0 is zero (MQ0 is the LSB of the MQ register and S0 is the LSB of S-bus data), the
result is sent to the ALU shifter. Otherwise, data on the S bus is sent to the ALU shifter.
A right shift is performed; the MSB is filled with R0 (MQ0 XOR S0), where R0 is the
LSB of R-bus data. A circular right shift is performed on MQ data.
Recommended R Bus Source Operands
C3-CO
RF
A3-AO
(A5-AO) Immed

DA-Port

N
M
00
00

..
A3-AO

~

Mask
Yes

No

No

u

«~

No

"enZ

Recommended S Bus ,Source Operands
MQ
RF
DB-Port
(B5-BO)
Register
Yes

Yes

No

Recommended Destination Operands
RF
RF
(C5-CO) (B5-BO)
Yes

Shift Operations

Y-Port

ALU

MQ

No

Right

Right

No

Control/Data Signals
Signal

User

Use

Programmable

SSF

No

Inactive

5100
5101
5102
5103
Cn

No

Inactive

No

Inactive

No

Inactive

No

Inactive

No

Inactive

3-85

10 10

Cyclic Redundancy Character Accumulation

CRC

Status Signals
ZERO

=

1 if result = 0

N = 0
I

OVR = 0

en

= 0

CYCLIC REDUNDANCY CHARACTER CHECK
DESCRIPTION
Serial binary data transmitted over a channel is susceptible to error bursts. These bursts
may be detected and corrected by standard encoding methods such as cyclic
redundancy check codes, fire codes, or computer generated codes. These codes all
divide the message vector by a generator polynomial to produce a remainder that
contains parity information about the message vector.

If a message vector of m bits, a(x), is divided by a generator polynomial, g(x), of order
k-1, a k-bit remainder, r(x), is formed. The code vector, c(x), consisting of a(x) and
r(x), of length n = m + k, is transmitted down the channel. The receiver divides the
received vector by g(x).

After m divide iterations, r(x) will be regenerated only if there is no error in the message
bits. After k more iterations, the result will be zero if and only if no error has occurred
in either the message or the remainder.
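The divide-and-check principle can be sketched with bit-serial modulo-2 division. This is a generic software model, not the data path of the CRC instruction. The function names are illustrative, and the generator is assumed to be passed as an integer with its leading term written explicitly, so that a generator with k + 1 bits yields a k-bit remainder.

```python
def mod2_remainder(value, gen):
    """Remainder of value(x) divided by gen(x) using modulo-2 arithmetic.

    Hypothetical helper: polynomials are integers, bit i = coefficient of x^i.
    """
    k = gen.bit_length() - 1              # degree of the generator
    while value.bit_length() > k:         # degree of value is at least k
        # Align the generator with the leading term and cancel it.
        value ^= gen << (value.bit_length() - 1 - k)
    return value

def encode(message, gen):
    """Append the k-bit remainder to the message to form the code vector."""
    k = gen.bit_length() - 1
    shifted = message << k
    return shifted | mod2_remainder(shifted, gen)

def check(received, gen):
    """True when the received vector divides evenly by the generator."""
    return mod2_remainder(received, gen) == 0
```

As a usage example with the small generator x^3 + x + 1 (`0b1011`), `encode(0b1101, 0b1011)` gives `0b1101001`, which passes `check`. Flipping any single bit of that code vector leaves a nonzero remainder, so the error is detected.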

ALGORITHM
An algorithm for a cyclic redundancy character check, using the 'ACT8832 as a
receiver, is given below:
  LOADMQ VEC(X)            Load MQ with first 32 message bits of
                           received vector c'(x).
  LOAD POLY                Load register with polynomial g(x).
  CLEAR SUM                Clear register acting as accumulator.
  REPEAT (n/32) TIMES:
    SUM = SUM CRC POLY     Perform CRC instruction where
                             R Bus = POLY
                             S Bus = SUM
                           Store result in SUM.
    LOADMQ VEC(X)          Load MQ with next 32 message bits of
                           received vector c'(x).
  (END REPEAT)


SUM now contains the remainder [r'(x)] of c'(x). A syndrome generation routine may
be called next, if required.
Note that the most significant bit of
$g(x) = g_{k-1}x^{k-1} + g_{k-2}x^{k-2} + \dots + g_0 x^0$
is implied and that POLY(0) is set to zero if the length of g(x) requires fewer bits than
are in the machine word width.
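
The polynomial division described above can be sketched in software. The following Python fragment is a minimal, bit-serial illustration of dividing a vector by a generator polynomial over GF(2). It does not model the CRC instruction or its MQ shifting. The function names, the 32-bit message, and the generator x^16 + x^12 + x^5 + 1 (written here with its most significant term explicit, unlike POLY above) are assumptions chosen for illustration.

```python
def poly_mod(value: int, nbits: int, gen: int, k: int) -> int:
    """Remainder of an nbits-bit vector divided by a degree-k generator over GF(2).

    `gen` includes the x^k term explicitly (an assumption of this sketch).
    """
    rem = 0
    for i in reversed(range(nbits)):           # process bits MSB first
        rem = (rem << 1) | ((value >> i) & 1)  # shift in the next bit
        if rem >> k:                           # degree reached k: subtract g(x)
            rem ^= gen                         # subtraction over GF(2) is XOR
    return rem


def encode(message: int, m: int, gen: int, k: int) -> int:
    """Append the k-bit remainder r(x) to the m-bit message to form c(x)."""
    remainder = poly_mod(message << k, m + k, gen, k)
    return (message << k) | remainder


GEN, K = 0x11021, 16          # hypothetical generator: x^16 + x^12 + x^5 + 1
MESSAGE, M = 0xDEADBEEF, 32   # hypothetical 32-bit message vector

code_vector = encode(MESSAGE, M, GEN, K)
# An error-free vector divides evenly; a corrupted one leaves a nonzero remainder.
clean_remainder = poly_mod(code_vector, M + K, GEN, K)
corrupt_remainder = poly_mod(code_vector ^ (1 << 20), M + K, GEN, K)
print(clean_remainder, corrupt_remainder != 0)
```

The receiver side is the second `poly_mod` call: dividing the full n = m + k bit vector leaves zero only when neither the message nor the remainder was disturbed.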

DIVRF
Divide Remainder Fix
Instruction code (I7-I0): 4 0 (Hex)

FUNCTION
Corrects the remainder of nonrestoring division routine if correction is required.

DESCRIPTION
DIVRF tests the result of the final step in nonrestoring division iteration: SDIVIT (for
signed division) or UDIVIT (for unsigned division). An error in the remainder results
when it is nonzero and the signs of the remainder and the dividend are different.
The R bus must be loaded with the divisor and the S bus with the most significant
half of the previous result. The least significant half is in the MQ register. The Y bus
result must be stored in the register file for use during the subsequent SDIVQF
instruction.

DIVRF tests to determine whether a fix is required and evaluates:
  Y ← S + R' + 1   if a fix is necessary
  Y ← S + R + 0    if a fix is unnecessary

Overflow is reported to OVR at the end of the division routine (after SDIVQF).

Recommended R Bus Source Operands

  RF (A5-A0)    A3-A0 Immed    DA-Port    C3-C0::A3-A0 Mask
  Yes           No             No         No

Recommended S Bus Source Operands

  RF (B5-B0)    DB-Port    MQ Register
  Yes           Yes        No

Recommended Destination Operands

  RF (C5-C0)    RF (B5-B0)    Y-Port
  Yes           No            No

Shift Operations

  ALU     MQ
  None    None


Control/Data Signals

  Signal   User Programmable   Use
  SSF      No                  Inactive
  SIO0     No                  Inactive
  SIO1     No                  Inactive
  SIO2     No                  Inactive
  SIO3     No                  Inactive
  Cn       Yes                 Should be programmed high

Status Signals

  ZERO = 1 if remainder = 0
  N    = 0
  OVR  = 0
  Cn   = 1 if carry-out = 1

DNORM
Double-Length Normalize
Instruction code (I7-I0): 3 0 (Hex)

R Bus Source Operands

  RF (A5-A0)    A3-A0 Immed    DA-Port    C3-C0::A3-A0 Mask
  No            No             No         No

Recommended S Bus Source Operands (MSH)

  RF (B5-B0)    DB-Port    MQ Register
  Yes           No         No

Recommended Destination Operands

  RF (C5-C0)    RF (B5-B0)    Y-Port
  Yes           No            No

Shift Operations (conditional)

  ALU     MQ
  Left    Left

Control/Data Signals

  Signal   User Programmable   Use
  SSF      No                  Inactive
  SIO0     Yes                 When low, selects a one end-fill bit in LSB
  SIO1     No                  Passes internally generated end-fill bits
  SIO2     No
  SIO3     No
  Cn       No

Status Signals

  ZERO = 1 if result = 0
  N    = 1 if MSB = 1
  OVR  = 1 if MSB XOR 2nd MSB
  Cn   = 0

EXAMPLE (assumes a 32-bit configuration)
Normalize a double-precision number.
(This example assumes that the MSH of the number to be normalized is in register 3
and the LSH is in the MQ register. The zero on the OVR pin at the end of the instruction
cycle indicates that normalization is not complete and the instruction should be
repeated.)

  Field                        Value
  Instr Code I7-I0             0011 0000
  Oprd Addr A5-A0              XX XXXX
  Oprd Addr B5-B0              00 0011
  Oprd Sel EA                  X
  Oprd Sel EB1-EB0             00
  Dest Addr C5-C0              00 0011
  SELMQ                        0
  WE3-WE0                      0000
  SELRF1-SELRF0                10
  OEA                          X
  OEB                          X
  OEY3-OEY0                    XXXX
  OES                          0
  Cn                           X
  CF2-CF0                      110

Assume register file 3 holds FA75D84E (Hex) and MQ register holds 37F6D843 (Hex):

  Source        1111 1010 0111 0101 1101 1000 0100 1110    ALU shifter ← RF(3)
  Source        0011 0111 1111 0110 1101 1000 0100 0011    MQ shifter ← MQ register
  Destination   1111 0100 1110 1011 1011 0000 1001 1101    RF(3) ← Result (MSH)
  Destination   0110 1111 1110 1101 1011 0000 1000 0110    MQ register ← Result (LSH)
  OVR ← 0†

†Normalization not complete at the end of this instruction cycle.

DUMPFF
Output Divide/BCD Flip-Flops
Instruction code (I7-I0): 5 F (Hex)

FUNCTION
Output contents of the divide/BCD flip-flops.

DESCRIPTION
The contents of the divide/BCD flip-flops are passed through the MQ register to the
Y output multiplexer.
Available R Bus Source Operands

  RF (A5-A0)    A3-A0 Immed    DA-Port    C3-C0::A3-A0 Mask
  No            No             No         No

Available S Bus Source Operands

  RF (B5-B0)    DB-Port    MQ Register
  No            No         No

Available Destination Operands

  RF (C5-C0)    RF (B5-B0)    Y-Port
  No            No            Yes

Shift Operations

  ALU     MQ
  None    None

Status Signals

  ZERO = 0
  N    = 0
  OVR  = 0
  Cn   = 0


EXAMPLE (assumes a 32-bit configuration)
Dump divide/BCD flip-flops to Y output.

  Field                        Value
  Instr Code I7-I0             0101 1111
  Oprd Addr A5-A0              XX XXXX
  Oprd Addr B5-B0              XX XXXX
  Oprd Sel EA                  X
  Oprd Sel EB1-EB0             XX
  Dest Addr C5-C0              XX XXXX
  SELMQ                        1
  WE3-WE0                      XXXX
  SELRF1-SELRF0                XX
  OEA                          X
  OEB                          X
  OEY3-OEY0                    0000
  OES                          X
  Cn                           X
  CF2-CF0                      110

Assume divide/BCD flip-flops contain 2A055470 (Hex):

  Source        0010 1010 0000 0101 0101 0100 0111 0000    MQ register ← Divide/BCD flip-flops
  Destination   0010 1010 0000 0101 0101 0100 0111 0000    Y output ← MQ register

EX3BC
Excess-3 Byte Correction
Instruction code (I7-I0): 8 F (Hex)

FUNCTION
Corrects the result of excess-3 addition or subtraction in selected bytes.

DESCRIPTION
This instruction corrects excess-3 additions or subtractions in the byte mode. For
correct excess-3 arithmetic, this instruction must follow each add or subtract. The
operand must be on the S bus.
Data on the S bus is added to a constant on the R bus determined by the state of
the BCD flip-flops and previous overflow condition reported on the SSF pin. Bytes with
SIO inputs programmed low evaluate the correct excess-3 representation. Bytes with
SIO inputs programmed high or floating pass S unaltered.

Available R Bus Source Operands

  RF (A5-A0)    A3-A0 Immed    DA-Port    C3-C0::A3-A0 Mask
  No            No             No         No

Available S Bus Source Operands

  RF (B5-B0)    DB-Port    MQ Register
  Yes           No         No

Available Destination Operands

  RF (C5-C0)    RF (B5-B0)    Y-Port
  Yes           No            No

Shift Operations

  ALU    MQ
  No     No

Control/Data Signals

  Signal   User Programmable   Use
  SSF      No                  Inactive
  SIO0     Yes                 Byte select
  SIO1     Yes                 Byte select
  SIO2     Yes                 Byte select
  SIO3     Yes                 Byte select
  Cn       No                  Inactive


Status Signals

  ZERO = 0
  N    = 0
  OVR  = 1 if arithmetic signed overflow
  Cn   = 1 if carry-out = 1

EXAMPLE (assumes a 32-bit configuration)
Add two BCD numbers and store the sum in register 3. Assume data comes in on
DB bus.
1. Clear accumulator (SUB ACC, ACC)
2. Store 33 (Hex) in all bytes of register (SET1 R2, H/33/)
3. Add 33 (Hex) to selected bytes of first BCD number (BADD DB, R2, R1)
4. Add 33 (Hex) to selected bytes of second BCD number (BADD DB, R2, R3)
5. Add selected bytes of registers 1 and 3 (BADD, R1, R3, R3)
6. Correct the result (EX3BC, R3, R3)

  Step   Instr Code   Dest Addr   Cn   CF2-CF0   SIO3-SIO0   IESIO3-IESIO0
         I7-I0        C5-C0
  1      1111 0010    00 0010     1    110       XXXX        XXXX
  2      0000 1000    00 0010     X    110       XXXX        XXXX
  3      1000 1000    00 0001     0    110       1100        0000
  4      1000 1000    00 0011     0    110       1100        0000
  5      1000 1000    00 0011     0    110       1100        0000
  6      1000 1111    00 0011     0    110       1100        0000

In all six steps, SELMQ = 0, WE3-WE0 = 0000, and SELRF1-SELRF0 = 10.

Assume DB bus holds 51336912 at third instruction and 34867162 at fourth
instruction.

  Step   Result                                     Operation
  1      0000 0000 0000 0000 0000 0000 0000 0000    RF(2) ← 0
  2      0000 0000 0000 0000 0011 0011 0011 0011    RF(2) ← 00003333 (Hex)
  3      0101 0001 0011 0011 1001 1100 0100 0101    RF(1) ← RF(2) + DB
  4      0011 0100 1000 0110 1010 0100 1001 0101    RF(3) ← RF(2) + DB
  5      0011 0100 1000 0110 0100 0000 1101 1010    RF(3)n ← RF(1)n + RF(3)n
  6      0011 0100 1000 0110 0100 0000 0111 0100    RF(3)n ← Corrected RF(3)n result
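
The excess-3 sequence in this example can be mimicked digit by digit in software. The sketch below is a Python illustration under stated assumptions: it works on 4-digit BCD values, tracks each nibble carry in an ordinary variable (the device instead uses its BCD flip-flops and a constant on the R bus), and all helper names are hypothetical.

```python
def to_excess3(bcd: int, digits: int) -> int:
    """Add 3 to every 4-bit BCD digit (no digit can overflow, since 9 + 3 < 16)."""
    return bcd + int("3" * digits, 16)


def excess3_add(a_bcd: int, b_bcd: int, digits: int = 4):
    """Add two packed BCD values via excess-3 code.

    Returns (raw_sum, corrected_excess3, bcd_result, carry_out).
    """
    a = to_excess3(a_bcd, digits)
    b = to_excess3(b_bcd, digits)
    raw = corrected = carry = 0
    for i in range(digits):
        total = ((a >> 4 * i) & 0xF) + ((b >> 4 * i) & 0xF) + carry
        carry = total >> 4            # binary carry equals decimal carry in excess-3
        nibble = total & 0xF
        raw |= nibble << 4 * i
        # Correction: a digit that carried lost its bias (+3); one that did not
        # holds a double bias (-3).
        fixed = (nibble + 3) & 0xF if carry else (nibble - 3) & 0xF
        corrected |= fixed << 4 * i
    # Remove the remaining bias to return to plain BCD.
    bcd = corrected - int("3" * digits, 16)
    return raw, corrected, bcd, carry


raw, corrected, bcd, carry = excess3_add(0x6912, 0x7162)
print(f"{raw:04X} {corrected:04X} {bcd:04X} {carry}")
```

With the lower four digits of the example operands (6912 and 7162), the uncorrected sum is 40DA (Hex), which matches bytes 1 and 0 of the step 5 result, and the final BCD value is 4074 with a carry out.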

EX3C
Excess-3 Word Correction
Instruction code (I7-I0): 9 F (Hex)

FUNCTION
Corrects the result of excess-3 addition or subtraction.

DESCRIPTION
This instruction corrects excess-3 additions or subtractions in the word mode. For
correct excess-3 arithmetic, this instruction must follow each add or subtract. The
operand must be on the S bus.
Data on the S bus is added to a constant on the R bus determined by the state of
the BCD flip-flops and previous overflow condition reported on the SSF pin.
Available R Bus Source Operands

  RF (A5-A0)    A3-A0 Immed    DA-Port    C3-C0::A3-A0 Mask
  No            No             No         No

Available S Bus Source Operands

  RF (B5-B0)    DB-Port    MQ Register
  Yes           No         No

Available Destination Operands

  RF (C5-C0)    RF (B5-B0)    Y-Port
  Yes           No            Yes

Shift Operations

  ALU    MQ
  No     No

Control/Data Signals

  Signal   User Programmable   Use
  SSF      No                  Inactive
  SIO0     No                  Inactive
  SIO1     No                  Inactive
  SIO2     No                  Inactive
  SIO3     No                  Inactive
  Cn       No                  Inactive


Status Signals

  ZERO = 0
  N    = 1 if MSB = 1
  OVR  = 1 if arithmetic signed overflow
  Cn   = 1 if carry-out = 1

EXAMPLE (assumes a 32-bit configuration)
Add two BCD numbers and store the sum in register 3. Assume data comes in on
DA bus.
1. Clear accumulator (SUB ACC, ACC)
2. Store 33 (Hex) in all bytes of register (SET1 R2, H/33/)
3. Add 33 (Hex) to all bytes of first BCD number (ADD DB, R2, R1)
4. Add 33 (Hex) to all bytes of second BCD number (ADD DB, R2, R3)
5. Add the excess-3 data (ADD, R1, R3, R3)
6. Correct the excess-3 result (EX3C, R3, R3)
7. Subtract the excess-3 bias to go to BCD result.

  Step   Instr Code   Dest Addr
         I7-I0        C5-C0
  1      1111 0010    00 0010
  2      0000 1000    00 0010
  3      1111 0001    00 0001
  4      1111 0001    00 0011
  5      1111 0001    00 0011
  6      1001 1111    00 0011
  7      1111 0010    00 0011

In all seven steps, WE3-WE0 = 0000 and SELRF1-SELRF0 = 10.


MQSLC
Pass (Y ← F) with Circular Left MQ Shift
Instruction code (I7-I0): D * (Hex)

FUNCTION
Passes the result of the ALU instruction specified in the lower nibble of the instruction
field to Y MUX. Performs a circular left shift on MQ.

DESCRIPTION
The result of the arithmetic or logical operation specified in the lower nibble of the
instruction field (I3-I0) is passed unshifted to Y MUX.
The contents of the MQ register are rotated one bit to the left. The MSB is rotated
out and passed to the LSB of the same word, which may be 1, 2, or 4 bytes long.
The shift may be made conditional on SSF. If SSF is high or floating, the shift result
will be sent to the MQ register. If SSF is low, the MQ register will not be altered.
* A list of ALU operations that can be used with this instruction is given in Table 15.

Available Destination Operands (ALU Shifter)

  RF (C5-C0)    RF (B5-B0)    Y-Port
  Yes           No            Yes

Control/Data Signals

  Signal   User Programmable   Use
  SSF      Yes                 Passes shift result if high or floating; retains MQ
                               without shift if low.
  SIO0     No                  Inactive
  SIO1     No                  Inactive
  SIO2     No                  Inactive
  SIO3     No                  Inactive
  Cn       No                  Affects arithmetic operation programmed in bits
                               I3-I0 of instruction field.


Status Signals†

  ZERO = 1 if result = 0
  N    = 1 if MSB of result = 1
       = 0 if MSB of result = 0
  OVR  = 1 if signed arithmetic overflow
  C    = 1 if carry-out = 1

†C is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated
after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

EXAMPLE (assumes a 32-bit configuration)
Add data in register 1 to data on the DB bus with carry-in and store the unshifted
result in register 1. Circular shift the contents of the MQ register one bit to the left.

  Field                        Value
  Instr Code I7-I0             1101 0001
  Oprd Addr A5-A0              00 0001
  Oprd Addr B5-B0              XX XXXX
  Oprd Sel EA                  0
  Oprd Sel EB1-EB0             10
  Dest Addr C5-C0              00 0001
  SELMQ                        0
  WE3-WE0                      0000
  SELRF1-SELRF0                10
  OEA                          X
  OEB                          X
  OEY3-OEY0                    XXXX
  OES                          0
  Cn                           1
  CF2-CF0                      110

Assume register file 1 holds 2508C618 (Hex), DB bus holds 11007530 (Hex), and
MQ register holds 4DA99A0E (Hex).

  Source        0010 0101 0000 1000 1100 0110 0001 1000    R ← RF(1)
  Source        0001 0001 0000 0000 0111 0101 0011 0000    S ← DB bus
  Destination   0011 0110 0000 1001 0011 1011 0100 1001    RF(1) ← R + S + Cn
  Source        0100 1101 1010 1001 1001 1010 0000 1110    MQ shifter ← MQ register
  Destination   1001 1011 0101 0011 0011 0100 0001 1100    MQ register ← MQ shifter
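
The two independent actions in this example, an unshifted ALU add and a one-bit rotation of MQ, can be checked with a few lines of Python. This is a sketch of the arithmetic only, assuming a single 32-bit word; the function names and the `ssf` argument are illustrative, not part of the device interface.

```python
MASK32 = 0xFFFFFFFF


def rotate_left_1(value: int, width: int = 32) -> int:
    """Rotate a width-bit word left by one; the MSB re-enters as the LSB."""
    mask = (1 << width) - 1
    return ((value << 1) | (value >> (width - 1))) & mask


def mqslc_add(r: int, s: int, cn: int, mq: int, ssf: bool = True):
    """Return (F, new MQ): F = R + S + Cn unshifted; MQ rotates only when SSF is high."""
    f = (r + s + cn) & MASK32
    new_mq = rotate_left_1(mq) if ssf else mq
    return f, new_mq


f, mq = mqslc_add(0x2508C618, 0x11007530, 1, 0x4DA99A0E)
print(f"{f:08X} {mq:08X}")  # 36093B49 9B53341C, as in the example above
```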

MQSLL

Pass (Y ← F) with Logical Left MQ Shift

FUNCTION
Passes the result of the ALU instruction specified in the lower nibble of the instruction
field to Y MUX. Performs a left shift on MQ.

DESCRIPTION
The result of the arithmetic or logical operation specified in the lower nibble of the
instruction field (I3-I0) is passed unshifted to Y MUX.
The contents of the MQ register are shifted one bit to the left. A zero is filled into
the least significant bit of each word unless the SIO input for that word is programmed
low; this will force the least significant bit to one. The MSB is dropped from each word,
which may be 1, 2, or 4 bytes long, depending on the configuration selected.
The shift may be made conditional on SSF. If SSF is high or floating, the shift result
will be sent to the MQ register. If SSF is low, the MQ register will not be altered.
* A list of ALU operations that can be used with this instruction is given in Table 15.


  Field                        Value
  Instr Code I7-I0             1011 0001
  Oprd Addr A5-A0              00 0001
  Oprd Addr B5-B0              XX XXXX
  Oprd Sel EA                  0
  Oprd Sel EB1-EB0             10
  Dest Addr C5-C0              00 0001
  SELMQ                        0
  WE3-WE0                      0000
  SELRF1-SELRF0                10
  OEA                          X
  OEB                          X
  OEY3-OEY0                    XXXX
  OES                          0
  Cn                           1
  CF2-CF0                      110

Assume register file 1 holds 5608C618 (Hex), DB bus holds 14007530 (Hex), and
MQ register holds 98A99A0E (Hex).

  Source        0101 0110 0000 1000 1100 0110 0001 1000    R ← RF(1)
  Source        0001 0100 0000 0000 0111 0101 0011 0000    S ← DB bus
  Destination   0110 1010 0000 1001 0011 1011 0100 1001    RF(1) ← R + S + Cn
  Source        1001 1000 1010 1001 1001 1010 0000 1110    MQ shifter ← MQ register
  Destination   0100 1100 0101 0100 1100 1101 0000 0111    MQ register ← MQ shifter

NAND
Logical NAND (R NAND S)
Instruction code (I7-I0): * C (Hex)

FUNCTION
Evaluates the logical expression R NAND S.

DESCRIPTION
Data on the R bus is NANDed with data on the S bus. The result appears at the ALU
and MQ shifters.
*The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the
upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions
are listed in Table 15.

Available R Bus Source Operands

  RF (A5-A0)    A3-A0 Immed    DA-Port    C3-C0::A3-A0 Mask
  Yes           No             Yes        No

Available S Bus Source Operands

  RF (B5-B0)    DB-Port    MQ Register
  Yes           Yes        Yes

Available Destination Operands

  RF (C5-C0)    RF (B5-B0)    Y-Port    ALU Shifter    MQ Shifter
  Yes           No            Yes       Yes            Yes

Control/Data Signals

  Signal   User Programmable   Use
  SSF      No                  Affect shift instructions programmed in bits I7-I4 of
  SIO0     No                  instruction field.
  SIO1     No
  SIO2     No
  SIO3     No
  Cn       No                  Inactive


Status Signals†

  ZERO = 1 if result = 0
  N    = 1 if MSB = 1
  OVR  = 0
  C    = 0

†C is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated
after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

EXAMPLE (assumes a 32-bit configuration)
Logically NAND the contents of register 3 and register 5, and store the result
in register 5.

  Field                        Value
  Instr Code I7-I0             1111 1100
  Oprd Addr A5-A0              00 0011
  Oprd Addr B5-B0              00 0101
  Oprd Sel EA                  0
  Oprd Sel EB1-EB0             00
  Dest Addr C5-C0              00 0101
  SELMQ                        0
  WE3-WE0                      0000
  SELRF1-SELRF0                10
  OEA                          X
  OEB                          X
  OEY3-OEY0                    XXXX
  OES                          0
  Cn                           X
  CF2-CF0                      110

Assume register file 3 holds 60F6D840 (Hex) and register file 5 holds 13F6D377 (Hex).

  Source        0110 0000 1111 0110 1101 1000 0100 0000    R ← RF(3)
  Source        0001 0011 1111 0110 1101 0011 0111 0111    S ← RF(5)
  Destination   1111 1111 0000 1001 0010 1111 1011 1111    RF(5) ← R NAND S
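
The destination value in this example can be reproduced with ordinary bitwise operators. The Python sketch below assumes a 32-bit word and returns the status bits as a plain dictionary; the function name and the dictionary layout are illustrative assumptions, and only the unshifted (pass) case is covered.

```python
MASK32 = 0xFFFFFFFF


def nand32(r: int, s: int):
    """Return (R NAND S, status) for 32-bit operands with no shift applied."""
    result = ~(r & s) & MASK32          # mask keeps Python's ~ within 32 bits
    status = {
        "ZERO": int(result == 0),
        "N": result >> 31,              # MSB of the result
        "OVR": 0,                       # always 0 for this logical operation
        "C": 0,
    }
    return result, status


result, status = nand32(0x60F6D840, 0x13F6D377)
print(f"{result:08X}", status)  # FF092FBF, N = 1, ZERO = 0
```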

NOP
No Operation
Instruction code (I7-I0): F F (Hex)

FUNCTION
Forces ALU output to zero.

DESCRIPTION
This instruction forces the ALU output to zero. The BCD flip-flops retain their old value.
Note that the clear instruction (CLR) forces the ALU output to zero and clears the BCD
flip-flops.
Available R Bus Source Operands

  RF (A5-A0)    A3-A0 Immed    DA-Port    C3-C0::A3-A0 Mask
  No            No             No         No

Available S Bus Source Operands

  RF (B5-B0)    DB-Port    MQ Register
  No            No         No

Available Destination Operands

  RF (C5-C0)    RF (B5-B0)    Y-Port
  Yes           No            Yes

Shift Operations

  ALU     MQ
  None    None

Status Signals

IZER~

OVR

C

o

o
o


EXAMPLE (assumes a 32-bit configuration)
Clear register 12.

  Field                        Value
  Instr Code I7-I0             1111 1111
  Oprd Addr A5-A0              XX XXXX
  Oprd Addr B5-B0              XX XXXX
  Oprd Sel EA                  X
  Oprd Sel EB1-EB0             XX
  Dest Addr C5-C0              00 1100
  SELMQ                        0
  WE3-WE0                      0000
  SELRF1-SELRF0                10
  OEA                          X
  OEB                          X
  OEY3-OEY0                    XXXX
  OES                          0
  Cn                           X
  CF2-CF0                      110

  Destination   0000 0000 0000 0000 0000 0000 0000 0000    RF(12) ← 0

NOR
Logical NOR (R NOR S)
Instruction code (I7-I0): * D (Hex)

FUNCTION
Evaluates the logical expression R NOR S.

DESCRIPTION
Data on the R bus is NORed with data on the S bus. The result appears at the ALU
and MQ shifters.
*The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the
upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions
are listed in Table 15.

Available R Bus Source Operands

  RF (A5-A0)    A3-A0 Immed    DA-Port    C3-C0::A3-A0 Mask
  Yes           No             Yes        No

Available S Bus Source Operands

  RF (B5-B0)    DB-Port    MQ Register
  Yes           Yes        Yes

Available Destination Operands

  RF (C5-C0)    RF (B5-B0)    Y-Port    ALU Shifter    MQ Shifter
  Yes           No            Yes       Yes            Yes

Control/Data Signals

  Signal   User Programmable   Use
  SSF      No                  Affect shift instructions programmed in bits I7-I4 of
  SIO0     No                  instruction field.
  SIO1     No
  SIO2     No
  SIO3     No
  Cn       No                  Inactive


Status Signals†

  ZERO = 1 if result = 0
  N    = 1 if MSB = 1
  OVR  = 0
  C    = 0

†C is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated
after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

EXAMPLE (assumes a 32-bit configuration)
Logically NOR the contents of register 3 and register 5, and store the result
in register 5.

  Field                        Value
  Instr Code I7-I0             1111 1101
  Oprd Addr A5-A0              00 0011
  Oprd Addr B5-B0              00 0101
  Oprd Sel EA                  0
  Oprd Sel EB1-EB0             00
  Dest Addr C5-C0              00 0101
  SELMQ                        0
  WE3-WE0                      0000
  SELRF1-SELRF0                10
  OEA                          X
  OEB                          X
  OEY3-OEY0                    XXXX
  OES                          0
  Cn                           X
  CF2-CF0                      110

Assume register file 3 holds 60F6D840 (Hex) and register file 5 holds 13F6D377 (Hex).

  Source        0110 0000 1111 0110 1101 1000 0100 0000    R ← RF(3)
  Source        0001 0011 1111 0110 1101 0011 0111 0111    S ← RF(5)
  Destination   1000 1100 0000 1001 0010 0100 1000 1000    RF(5) ← R NOR S

OR
Logical OR (R OR S)
Instruction code (I7-I0): * B (Hex)

FUNCTION
Evaluates the logical expression R OR S.

DESCRIPTION
Data on the R bus is ORed with data on the S bus. The result appears at the ALU
and MQ shifters.
*The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the
upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions
are listed in Table 15.

Available R Bus Source Operands

  RF (A5-A0)    A3-A0 Immed    DA-Port    C3-C0::A3-A0 Mask
  Yes           No             Yes        No

Available S Bus Source Operands

  RF (B5-B0)    DB-Port    MQ Register
  Yes           Yes        Yes

Available Destination Operands

  RF (C5-C0)    RF (B5-B0)    Y-Port    ALU Shifter    MQ Shifter
  Yes           No            Yes       Yes            Yes

Control/Data Signals

  Signal   User Programmable   Use
  SSF      No                  Affect shift instructions programmed in bits I7-I4 of
  SIO0     No                  instruction field.
  SIO1     No
  SIO2     No
  SIO3     No
  Cn       No                  Inactive


Status Signals†

  ZERO = 1 if result = 0
  N    = 1 if MSB = 1
  OVR  = 0
  C    = 0

†C is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated
after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

EXAMPLE (assumes a 32-bit configuration)
Logically OR the contents of register 5 and register 3, and store the result in
register 3.

  Field                        Value
  Instr Code I7-I0             1111 1011
  Oprd Addr A5-A0              00 0101
  Oprd Addr B5-B0              00 0011
  Oprd Sel EA                  0
  Oprd Sel EB1-EB0             00
  Dest Addr C5-C0              00 0011
  SELMQ                        0
  WE3-WE0                      0000
  SELRF1-SELRF0                10
  OEA                          X
  OEB                          X
  OEY3-OEY0                    XXXX
  OES                          0
  Cn                           X
  CF2-CF0                      110

Assume register file 5 holds 60F6D840 (Hex) and register file 3 holds 13F6D377 (Hex).

  Source        0110 0000 1111 0110 1101 1000 0100 0000    R ← RF(5)
  Source        0001 0011 1111 0110 1101 0011 0111 0111    S ← RF(3)
  Destination   0111 0011 1111 0110 1101 1011 0111 0111    RF(3) ← R OR S

PASS
Pass (Y ← F)
Instruction code (I7-I0): F * (Hex)

FUNCTION
Passes the result of the ALU instruction specified in the lower nibble of the instruction
field to Y MUX.

DESCRIPTION
The result of the arithmetic or logical operation specified in the lower nibble of the
instruction field (I3-I0) is passed unshifted to Y MUX.
* A list of ALU operations that can be used with this instruction is given in Table 15.

Available Destination Operands

  RF (C5-C0)    RF (B5-B0)    Y-Port    ALU Shifter    MQ Shifter
  Yes           No            Yes       None           None

Control/Data Signals

  Signal   User Programmable   Use
  SSF      No                  Inactive
  SIO0     No                  Inactive
  SIO1     No                  Inactive
  SIO2     No                  Inactive
  SIO3     No                  Inactive
  Cn       No                  Affects arithmetic operation specified in bits I3-I0 of
                               instruction field.

Status Signals†

  ZERO = 1 if result = 0
  N    = 1 if MSB of result = 1
       = 0 if MSB of result = 0
  OVR  = 1 if signed arithmetic overflow
  C    = 1 if carry-out condition

†C is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated
after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.


EXAMPLE (assumes a 32-bit configuration)
Add data in register 1 to data on the DB bus with carry-in and store the unshifted
result in register 10.

  Field                        Value
  Instr Code I7-I0             1111 0001
  Oprd Addr A5-A0              00 0001
  Oprd Addr B5-B0              XX XXXX
  Oprd Sel EA                  0
  Oprd Sel EB1-EB0             10
  Dest Addr C5-C0              00 1010
  SELMQ                        0
  WE3-WE0                      0000
  SELRF1-SELRF0                10
  OEA                          X
  OEB                          X
  OEY3-OEY0                    XXXX
  OES                          0
  Cn                           1
  CF2-CF0                      110

Assume register file 1 holds 9308C618 (Hex) and DB bus holds 24007530 (Hex).

  Source        1001 0011 0000 1000 1100 0110 0001 1000    R ← RF(1)
  Source        0010 0100 0000 0000 0111 0101 0011 0000    S ← DB bus
  Destination   1011 0111 0000 1001 0011 1011 0100 1001    RF(10) ← R + S + Cn
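
The status outputs for an unshifted add such as this one can be worked out in software. The Python sketch below assumes a 32-bit word and the conventional two's complement definitions of carry and signed overflow; the function name and the returned dictionary are illustrative assumptions, not a description of the device's internal logic.

```python
def add_with_status(r: int, s: int, cn: int, width: int = 32):
    """Return (F, status) for F = R + S + Cn on width-bit operands."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    total = r + s + cn
    f = total & mask
    status = {
        "ZERO": int(f == 0),
        "N": int(bool(f & sign)),                 # MSB of the result
        # Signed overflow: both operands share a sign that the result lacks.
        "OVR": int(bool((r ^ f) & (s ^ f) & sign)),
        "C": total >> width,                      # carry out of the MSB
    }
    return f, status


f, status = add_with_status(0x9308C618, 0x24007530, 1)
print(f"{f:08X}", status)  # B7093B49 with N = 1 and no carry or overflow
```

Operands of opposite sign, as in this example, can never produce a signed overflow; adding 7FFFFFFF (Hex) and 1 would set OVR.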

SDIVI
Signed Divide Iterate
Instruction code (I7-I0): A 0 (Hex)

FUNCTION
Performs one of N-2 iterations of nonrestoring signed division by a test subtraction
of the N-bit divisor from the 2N-bit dividend. An algorithm using this instruction is
given in the "Other Arithmetic Instructions" section.

DESCRIPTION
SDIVI performs a test subtraction of the divisor from the dividend to generate a quotient
bit. The test subtraction passes if the remainder is positive and fails if negative. If
it fails, the remainder will be corrected during the next instruction.
SDIVI checks the pass/fail result of the test subtraction from the previous instruction,
and evaluates
  F ← R + S         if the test fails
  F ← R' + S + Cn   if the test passes

A double precision left shift is performed; bit 7 of the most significant byte of the MQ
shifter is transferred to bit 0 of the least significant byte of the ALU shifter. Bit 7 of
the most significant byte of the ALU shifter is lost. The unfixed quotient bit is circulated
into the least significant bit of the MQ shifter.
The R bus must be loaded with the divisor, the S bus with the most significant half
of the result of the previous instruction (SDIVI during iteration or SDIVIS at the beginning
of iteration). The least significant half of the previous result is in the MQ register.
Carry-in should be programmed high. Overflow occurring during SDIVI is reported to OVR
at the end of the signed divide routine (after SDIVQF).
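
The add-or-subtract choice described above is the core of nonrestoring division. The Python sketch below shows the textbook unsigned form of the technique as an illustration only: it is not the device's signed microcode sequence, it handles no sign or quotient fixes beyond the final remainder correction, and the function name and register-like variables are assumptions.

```python
def nonrestoring_divide(dividend: int, divisor: int, n: int = 32):
    """Unsigned nonrestoring division of an n-bit dividend; returns (quotient, remainder)."""
    if divisor <= 0:
        raise ValueError("divisor must be positive in this sketch")
    mask = (1 << n) - 1
    partial = 0           # partial remainder, allowed to go negative
    q = dividend & mask   # dividend bits shift out as quotient bits shift in
    for _ in range(n):
        # Double-length left shift: MSB of q moves into the LSB of the partial remainder.
        partial = 2 * partial + ((q >> (n - 1)) & 1)
        q = (q << 1) & mask
        # Previous test passed: subtract the divisor. Previous test failed: add it back.
        if partial >= 0:
            partial -= divisor
        else:
            partial += divisor
        if partial >= 0:
            q |= 1        # test subtraction passed: quotient bit is 1
    if partial < 0:
        partial += divisor  # final remainder fix
    return q, partial


print(nonrestoring_divide(100, 7, n=8))
```

A failed test subtraction is never undone immediately; the next iteration compensates by adding the divisor instead of subtracting it, which is why only the final remainder may need a separate fix.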
Available R Bus Source Operands

  RF (A5-A0)    A3-A0 Immed    DA-Port    C3-C0::A3-A0 Mask
  Yes           No             Yes        No

Recommended S Bus Source Operands

  RF (B5-B0)    DB-Port    MQ Register
  Yes           Yes        No

Recommended Destination Operands

  RF (C5-C0)    RF (B5-B0)    Y-Port
  Yes           No            Yes

Shift Operations

  ALU     MQ
  Left    Left


SDIVIS Status Signals

  ZERO = 1 if intermediate result = 0
  N    = 0
  OVR  = 0
  C    = 1 if carry-out

SDIVIT
Signed Divide Terminate
Instruction code (I7-I0): E 0 (Hex)

FUNCTION
Solves the final quotient bit during nonrestoring signed division. An
algorithm using this instruction is given in the "Other Arithmetic Instructions" section.

DESCRIPTION
SDIVIT performs the final subtraction of the divisor from the remainder during
nonrestoring signed division. SDIVIT is preceded by N-2 iterations of SDIVI, where
N is the number of bits in the dividend.
The R bus must be loaded with the divisor, and the S bus must be loaded with the
most significant half of the result of the last SDIVI instruction. The least significant
half lies in the MQ register. The Y bus result must be loaded back into the register
file for use in the subsequent DIVRF instruction. Carry-in should be programmed high.
SDIVIT checks the pass/fail result of the previous instruction's test subtraction and
evaluates:
  Y ← R + S         if the test fails
  Y ← R' + S + Cn   if the test passes

The contents of the MQ register are shifted one bit to the left; the unfixed quotient
bit is circulated into the least significant bit.
Overflow during this instruction is reported to OVR at the end of the signed division
routine (after SDIVQF).
Available R Bus Source Operands

  RF (A5-A0)    A3-A0 Immed    DA-Port    C3-C0::A3-A0 Mask
  Yes           No             Yes        No

Recommended S Bus Source Operands

  RF (B5-B0)    DB-Port    MQ Register
  Yes           Yes        No

Recommended Destination Operands

  RF (C5-C0)    RF (B5-B0)    Y-Port
  Yes           No            Yes

Shift Operations

  ALU     MQ
  Left    Left


Control/Data Signals

  Signal   User Programmable   Use
  SSF      No                  Inactive
  SIO0     No                  Pass internally generated end-fill bits.
  SIO1     No
  SIO2     No
  SIO3     No
  Cn       Yes                 Should be programmed high

Status Signals

  ZERO = 1 if intermediate result = 0
  N    = 0
  OVR  = 0
  C    = 1 if carry-out

SDIVO
Signed Divide Overflow Test
Instruction code (I7-I0): A F (Hex)

FUNCTION
Tests for overflow during nonrestoring signed division. An algorithm using this
instruction is given in the "Other Arithmetic Instructions" section.

DESCRIPTION
This instruction performs an initial test subtraction of the divisor from the dividend.
If overflow is detected, it is preserved internally and reported at the end of the divide
routine (after SDIVQF). If overflow status is ignored, the SDIVO instruction may be
omitted.
The divisor must be loaded onto the R bus; the most significant half of the previous
SDIVIN result must be loaded onto the S bus. The least significant half is in the MQ
register.
The result on the Y bus should not be stored back into the register file; WE' should
be programmed high.
Carry-in should also be programmed high.

Status Signals

  ZERO = 1 if divisor = 0
  N    = 0
  OVR  = 0
  C    = 1 if carry-out

SDIVQF
Signed Divide Quotient Fix
Instruction code (I7-I0): 5 0 (Hex)

FUNCTION
Tests the quotient result after nonrestoring signed division and corrects it if necessary.
An algorithm using this instruction is given in the "Other Arithmetic Instructions"
section.

DESCRIPTION
SDIVQF is the final instruction required to compute the quotient of a 2N-bit dividend
by an N-bit divisor. It corrects the quotient if the signs of the divisor and dividend are
different and the remainder is nonzero.
The fix is implemented by incrementing S:
  Y ← S + 1   if a fix is required
  Y ← S + 0   if no fix is required

The R bus must be loaded with the divisor, and the S bus with the most significant
half of the result of the preceding DIVRF instruction. The least significant half is in
the MQ register.

  Source        0000 1111 0000 1111 0000 1111 0000 1111    Rn ← C3-C0::A3-A0
  Source        1010 0000 1000 0011 1011 1110 1011 1110    Sn ← RF(1)n
  ALU           1010 0000 1000 0011 1011 1111 1011 1110    Fn ← Sn OR Rn
  Destination   1010 0000 1000 0011 1011 1111 1011 1110    RF(1)n ← Fn or Sn†

†F = ALU result
 n = nth byte
 Register file 1 gets F if byte selected, S if byte not selected.

Si03- iE'Si03SiOo iEsiOo

116 HOl

0000

SLA
Arithmetic Left Single Precision Shift

FUNCTION
Performs arithmetic left shift on result of ALU operation specified in lower nibble of
instruction field.

DESCRIPTION
The result of the ALU operation specified in instruction bits I3-I0 is shifted one bit
to the left. A zero is filled into bit 0 of the least significant byte of each word unless
the SIO input is programmed low; this will force bit 0 to one. Bit 7 is dropped from
the most significant byte in each word, which may be 1, 2, or 4 bytes long, depending
on the configuration selected.
The shift may be made conditional on SSF. If SSF is high or floating, the shift result
will be sent to the MQ register. If SSF is low, the MQ register will not be altered.
* A list of ALU operations that can be used with this instruction is given in Table 15.

Shift Operations

  ALU Shifter
  Arithmetic Left

Available Destination Operands (ALU Shifter)

  RF (C5-C0)    RF (B5-B0)    Y-Port
  Yes           No            Yes

Control/Data Signals

  Signal   User Programmable   Use
  SSF      Yes                 Passes shift result if high; passes ALU result if low.
  SIO0     Yes                 Fills a zero in LSB of each word if high; fills a
  SIO1     Yes                 one in LSB if low.
  SIO2     Yes
  SIO3     Yes
  Cn       No                  Affects arithmetic operation programmed in bits
                               I3-I0 of instruction field.

Status Signals†

  ZERO = 1 if result = 0
  N    = 1 if MSB of result = 1
       = 0 if MSB of result = 0
  OVR  = 1 if signed arithmetic overflow or if MSB XOR MSB-1 = 1 before shift
  C    = 1 if carry-out condition

†C is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated
after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

EXAMPLE (assumes a 32-bit configuration)
Perform the computation A = 2(A + B), where A and B are single-precision, two's
complement numbers. Let A be stored in register 1 and B be input via the DB bus.

  Field                  Value
  Instr Code I7-I0       0100 0001
  Oprd Addr A5-A0        00 0001
  Oprd Addr B5-B0        XX XXXX
  Oprd Sel EA            0
  Oprd Sel EB1-EB0       10
  Dest Addr C5-C0        00 0001
  SELMQ                  0
  WE3-WE0                0000
  SELRF1-SELRF0          10
  OEA                    X
  OEB                    X
  OEY3-OEY0              XXXX
  OES                    0
  Cn                     0
  CF2-CF0                110
  SIO3-SIO0              1110
  IESIO3-IESIO0          0000
  SSF                    1

Assume register file 1 holds 1308C618 (Hex), DB bus holds 44007530 (Hex).

  Source               0001 0011 0000 1000 1100 0110 0001 1000   R ← RF(1)
  Source               0100 0100 0000 0000 0111 0101 0011 0000   S ← DB bus
  Intermediate Result  0101 0111 0000 1001 0011 1011 0100 1000   ALU Shifter ← R + S + Cn
  Destination          1010 1110 0001 0010 0111 0110 1001 0001   RF(1) ← ALU shift result
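The arithmetic of the example above can be checked with a short behavioral sketch. It is not a model of the device: the function name `sla` and its parameters are illustrative, a 32-bit word is assumed, and only the ADD case (F = R + S + Cn) is covered. A low SIO0 fills a one into the LSB, as stated in the Control/Data Signals table.

```python
MASK32 = 0xFFFFFFFF  # 32-bit configuration assumed

def sla(r, s, cn=0, sio0=1):
    """Hypothetical model: ADD in the lower nibble, arithmetic left shift of F."""
    f = (r + s + cn) & MASK32          # ALU result F = R + S + Cn
    fill = 0 if sio0 else 1            # SIO0 high fills a zero, low fills a one
    return ((f << 1) & MASK32) | fill  # MSB is dropped, fill bit enters bit 0

# Values from the example: RF(1) = 1308C618, DB bus = 44007530, SIO0 low
print(f"{sla(0x1308C618, 0x44007530, cn=0, sio0=0):08X}")  # AE127691
```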

SLAD

Arithmetic Left Double Precision Shift

5 *

FUNCTION
Performs arithmetic left shift on MQ register (LSH) and result of ALU operation (MSH)
specified in lower nibble of instruction field.

DESCRIPTION
The result of the ALU operation specified in instruction bits I3-I0 is used as the upper
half of a double-precision word, the contents of the MQ register as the lower half.
The contents of the MQ register are shifted one bit to the left. A zero is filled into
bit 0 of the least significant byte of each word unless the SIO input for the word is
set to zero; this will force bit 0 to one. Bit 7 of the most significant byte in the MQ
shifter is passed to bit 0 of the least significant byte of the ALU shifter. Bit 7 of the
most significant byte in the ALU shifter is dropped.
The shift may be made conditional on SSF. If SSF is high or floating, the shift result
will be sent to the Y MUX and MQ register. If SSF is low, the ALU output and MQ
register will not be altered.
* A list of ALU operations that can be used with this instruction is given in Table 15.
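As a rough illustration of the bit movement described above, the following sketch treats the ALU result F and the MQ register as a 64-bit pair. The name `slad` is hypothetical, a 32-bit configuration is assumed, and the SSF condition is not modeled (the shift is always taken).

```python
MASK32 = 0xFFFFFFFF  # 32-bit configuration assumed

def slad(f, mq, sio0=1):
    """Hypothetical model: one-bit arithmetic left shift of the pair (F, MQ)."""
    f &= MASK32
    mq &= MASK32
    link = mq >> 31                       # MSB of MQ shifter goes to bit 0 of ALU shifter
    fill = 0 if sio0 else 1               # SIO0 low forces bit 0 of MQ to one
    new_f = ((f << 1) & MASK32) | link    # MSB of ALU shifter is dropped
    new_mq = ((mq << 1) & MASK32) | fill
    return new_f, new_mq

# F = 4A093B48 (an ALU sum), MQ = 50A99A0E, SIO0 low
hi, lo = slad(0x4A093B48, 0x50A99A0E, sio0=0)
print(f"{hi:08X} {lo:08X}")  # 94127690 A153341D
```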

Shift Operations
  ALU Shifter        MQ Shifter
  Arithmetic Left    Arithmetic Left

Available Destination Operands (ALU Shifter)
  RF (C5-C0)   RF (B5-B0)   Y-Port
  Yes          No           Yes

Control/Data Signals
  Signal      User Programmable   Use
  SSF         Yes                 Passes shift result if high; passes ALU result if low.
  SIO0-SIO3   Yes                 Fills a zero in LSB of each word if high; fills a one in LSB if low.
  Cn          No                  Affects arithmetic operation specified in bits I3-I0 of instruction field.

Status Signals†
  ZERO = 1 if result = 0
  N    = 1 if MSB of result = 1
       = 0 if MSB of result = 0
  OVR  = 1 if signed arithmetic overflow or if MSB XOR MSB-1 = 1 before shift
  C    = 1 if carry-out condition

† C is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated
after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

EXAMPLE (assumes a 32-bit configuration)
Perform the computation A = 2(A + B), where A and B are two's complement numbers.
Let A be a double precision number residing in register 1 (MSH) and the MQ register
(LSH). Let B be a single precision number which is input through the DB bus.

  Field                  Value
  Instr Code I7-I0       0101 0001
  Oprd Addr A5-A0        00 0001
  Oprd Addr B5-B0        XX XXXX
  Oprd Sel EA            0
  Oprd Sel EB1-EB0       10
  Dest Addr C5-C0        00 0001
  SELMQ                  0
  WE3-WE0                0000
  SELRF1-SELRF0          10
  OEA                    X
  OEB                    X
  OEY3-OEY0              XXXX
  OES                    0
  Cn                     0
  CF2-CF0                110
  SIO3-SIO0              1110
  IESIO3-IESIO0          0000
  SSF                    1

Assume register file 1 holds 2408C618 (Hex), DB bus holds 26007530 (Hex), and
MQ register holds 50A99A0E (Hex).

MSH
  Source               0010 0100 0000 1000 1100 0110 0001 1000   R ← RF(1)
  Source               0010 0110 0000 0000 0111 0101 0011 0000   S ← DB bus
  Intermediate Result  0100 1010 0000 1001 0011 1011 0100 1000   ALU Shifter ← R + S + Cn
  Destination          1001 0100 0001 0010 0111 0110 1001 0000   RF(1) ← ALU shift result

LSH
  Source               0101 0000 1010 1001 1001 1010 0000 1110   MQ shifter ← MQ register
  Destination          1010 0001 0101 0011 0011 0100 0001 1101   MQ register ← MQ shift result

SLC

Circular Left Single Precision Shift

FUNCTION
Performs circular left shift on result of ALU operation specified in lower nibble of
instruction field.

DESCRIPTION
The result of the ALU operation specified in instruction bits I3-I0 is rotated one bit
to the left. Bit 7 of the most significant byte in each word is passed to bit 0 of the
least significant byte in the word, which may be 1, 2, or 4 bytes long.
The shift may be made conditional on SSF. If SSF is high or floating, the shift result
will be sent to Y MUX. If SSF is low, F is passed unaltered.
* A list of ALU operations that can be used with this instruction is given in Table 15.
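The rotation described above is easy to state as a function of the word length. In this minimal sketch the name `slc` and the `nbytes` parameter are illustrative; `nbytes` stands in for the 1-, 2-, or 4-byte word size selected by the configuration.

```python
def slc(f, nbytes=4):
    """Hypothetical model: rotate a word of nbytes bytes one bit to the left."""
    bits = 8 * nbytes
    mask = (1 << bits) - 1
    f &= mask
    # The MSB of the word wraps around into bit 0 of the same word
    return ((f << 1) & mask) | (f >> (bits - 1))

print(f"{slc(0x3788C618):08X}")     # 6F118C30
print(f"{slc(0x80, nbytes=1):02X}")  # 01
```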

Shift Operations
  ALU Shifter      MQ Shifter
  Circular Left    None

EXAMPLE (assumes a 32-bit configuration)

  Field                  Value
  Instr Code I7-I0       0110 0110
  Oprd Addr A5-A0        00 0110
  Oprd Addr B5-B0        XX XXXX
  Oprd Sel EA            0
  Oprd Sel EB1-EB0       00
  Dest Addr C5-C0        00 0001
  SELMQ                  0
  WE3-WE0                0000
  SELRF1-SELRF0          10
  OEA                    X
  OEB                    X
  OEY3-OEY0              XXXX
  OES                    0
  Cn                     0
  CF2-CF0                110
  SSF                    1

Assume register file 6 holds 3788C618 (Hex).

  Source               0011 0111 1000 1000 1100 0110 0001 1000   R ← RF(6)
  Intermediate Result  0011 0111 1000 1000 1100 0110 0001 1000   ALU Shifter ← R + Cn
  Destination          0110 1111 0001 0001 1000 1100 0011 0000   RF(1) ← ALU shifter result

SLCD

Circular Left Double Precision Shift

7

FUNCTION
Performs circular left shift on MQ register (LSH) and result of ALU operation specified
in lower nibble of instruction field (MSH).

DESCRIPTION
The result of the ALU operation specified in instruction bits I3-I0 is used as the upper
half of a double-precision word, the contents of the MQ register as the lower half.
The contents of the MQ and ALU shifters are rotated one bit to the left. Bit 7 of the
most significant byte in the MQ shifter is passed to bit 0 of the least significant byte
of the ALU shifter. Bit 7 of the most significant byte in the ALU shifter is passed to
bit 0 of the least significant byte in the MQ shifter.
The shift may be made conditional on SSF. If SSF is high or floating, the shift result
will be sent to Y MUX. If SSF is low, F is passed unaltered and the MQ register is
not changed.

* A list of ALU operations that can be used with this instruction is given in Table 15.
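A double-precision rotate exchanges the two MSBs across the halves. The sketch below assumes a 32-bit configuration and uses a hypothetical function `slcd`; because the operation is a pure 64-bit rotation, applying it 64 times must reproduce the starting pair, which makes a convenient sanity check.

```python
MASK32 = 0xFFFFFFFF  # 32-bit configuration assumed

def slcd(f, mq):
    """Hypothetical model: rotate the 64-bit pair (F, MQ) one bit to the left."""
    f &= MASK32
    mq &= MASK32
    new_f = ((f << 1) & MASK32) | (mq >> 31)   # MSB of MQ enters bit 0 of ALU shifter
    new_mq = ((mq << 1) & MASK32) | (f >> 31)  # MSB of ALU shifter enters bit 0 of MQ
    return new_f, new_mq

pair = start = (0x3708C618, 0x50A99A0E)
for _ in range(64):
    pair = slcd(*pair)
print(pair == start)  # True
```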

EXAMPLE (assumes a 32-bit configuration)
Perform a circular left double precision shift of data in register 6 (MSH) and MQ (LSH),
and store the result back in register 6 and the MQ register.

  Field                  Value
  Instr Code I7-I0       0111 0110
  Oprd Addr A5-A0        00 0110
  Oprd Addr B5-B0        XX XXXX
  Oprd Sel EA            0
  Oprd Sel EB1-EB0       00
  Dest Addr C5-C0        00 0110
  SELMQ                  0
  WE3-WE0                0000
  SELRF1-SELRF0          10
  OEA                    X
  OEB                    X
  OEY3-OEY0              XXXX
  OES                    0
  Cn                     0
  CF2-CF0                110
  SSF                    1

Assume register file 6 holds 3708C618 (Hex) and MQ register holds 50A99A0E (Hex).

MSH
  Source               0011 0111 0000 1000 1100 0110 0001 1000   R ← RF(6)
  Intermediate Result  0011 0111 0000 1000 1100 0110 0001 1000   ALU Shifter ← R + Cn
  Destination          0110 1110 0001 0001 1000 1100 0011 0000   RF(6) ← ALU shifter result

LSH
  Source               0101 0000 1010 1001 1001 1010 0000 1110   MQ shifter ← MQ register
  Destination          1010 0001 0101 0011 0011 0100 0001 1100   MQ register ← MQ shift result

SMTC

Sign Magnitude/Two's Complement

5 8

FUNCTION
Converts data on the S bus from sign magnitude to two's complement or vice versa.

DESCRIPTION
The S bus provides the source word for this instruction. The number is converted by
inverting S and adding the result to the carry-in, which should be programmed high
for proper conversion; the sign bit of the result is then inverted. An error condition
will occur if the source word is a negative zero (negative sign and zero magnitude).
In this case, SMTC generates a positive zero, and the OVR pin is set high to reflect
an illegal conversion.
The sign bit of the selected operand in the most significant byte is tested; if it is high,
the converted number is passed to the destination. Otherwise the operand is passed
unaltered.
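The conversion rule above (invert S, add the carry-in, then invert the sign bit, but only when the sign bit is set) can be sketched as follows. The function name `smtc` and the returned `(result, ovr)` tuple are illustrative choices, and a 32-bit word is assumed.

```python
MASK32 = 0xFFFFFFFF
SIGN = 0x80000000  # sign bit of a 32-bit word

def smtc(s, cn=1):
    """Hypothetical model of SMTC; returns (result, ovr)."""
    s &= MASK32
    if not (s & SIGN):
        return s, 0                       # positive operands pass unaltered
    result = ((~s & MASK32) + cn) & MASK32  # S' + Cn
    result ^= SIGN                        # invert the sign bit of the result
    ovr = 1 if s == SIGN else 0           # negative zero: illegal conversion
    return result, ovr

print(f"{smtc(0xC3F6D840)[0]:08X}")  # BC0927C0
print(smtc(0x80000000))              # (0, 1)
```

Applying the function twice to a negative, non-zero operand returns the original word, which reflects the "or vice versa" in the FUNCTION statement.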
Available R Bus Source Operands
  RF (A5-A0)   A3-A0 Immed   DA-Port   C3-C0::A3-A0 Mask
  No           No            No        No

Available S Bus Source Operands
  RF (B5-B0)   DB-Port   MQ Register
  Yes          Yes       Yes

Available Destination Operands
  RF (C5-C0)   RF (B5-B0)   Y-Port
  Yes          No           Yes

Shift Operations
  ALU     MQ
  None    None

Control/Data Signals
  Signal      User Programmable   Use
  SSF         No                  Inactive
  SIO0        No                  Inactive
  SIO1        No                  Inactive
  SIO2        No                  Inactive
  SIO3        No                  Inactive
  Cn          Yes                 Should be programmed high for proper conversion

Status Signals
  ZERO = 1 if result = 0
  N    = 1 if MSB = 1
  OVR  = 1 if input of most significant byte is 80 (Hex) and all other
         bytes are 00 (Hex)
  C    = 1 if S = 0

EXAMPLES (assumes a 32-bit configuration)
Convert the two's complement number in register 1 to sign magnitude representation
and store the result in register 4.

  Field                  Value
  Instr Code I7-I0       0101 1000
  Oprd Addr A5-A0        XX XXXX
  Oprd Addr B5-B0        00 0001
  Oprd Sel EA            X
  Oprd Sel EB1-EB0       00
  Dest Addr C5-C0        00 0100
  SELMQ                  0
  WE3-WE0                0000
  SELRF1-SELRF0          10
  OEA                    X
  OEB                    X
  OEY3-OEY0              XXXX
  OES                    0
  Cn                     1
  CF2-CF0                110

Example 1: Assume register file 1 holds C3F6D840 (Hex).

  Source       1100 0011 1111 0110 1101 1000 0100 0000   S ← RF(1)
  Destination  1011 1100 0000 1001 0010 0111 1100 0000   RF(4) ← S' + Cn

Example 2: Assume register file 1 holds 550927C0 (Hex).

  Source       0101 0101 0000 1001 0010 0111 1100 0000   S ← RF(1)
  Destination  0101 0101 0000 1001 0010 0111 1100 0000   RF(4) ← S
SMULI

Signed Multiply Iterate

6 0

FUNCTION
Computes one of N-1 signed or N mixed multiplication iterations for computing an
N-bit by N-bit product. Algorithms for signed and mixed multiplication using this
instruction are given in the "Other Arithmetic Instructions" section.

DESCRIPTION
SMULI checks the present multiplier bit (the least significant bit of the MQ register)
to determine whether the multiplicand should be added with the present partial product.
The instruction evaluates:

  F ← R + S + Cn    if the addition is required
  F ← S             if no addition is required

A double precision right shift is performed. Bit 0 of the least significant byte of the
ALU shifter is passed to bit 7 of the most significant byte of the MQ shifter; carry-out
is passed to the most significant bit of the ALU shifter.
The S bus should be loaded with the contents of an accumulator and the R bus with
the multiplicand. The Y bus result should be written back to the accumulator after
each iteration of SMULI. The accumulator should be cleared and the MQ register loaded
with the multiplier before the first iteration.

Available R Bus Source Operands
  RF (A5-A0)   A3-A0 Immed   DA-Port   C3-C0::A3-A0 Mask
  Yes          No            Yes       No

Recommended S Bus Source Operands
  RF (B5-B0)   DB-Port   MQ Register
  Yes          Yes       No

Recommended Destination Operands
  RF (C5-C0)   RF (B5-B0)   Y-Port
  Yes          No           No

Shift Operations
  ALU      MQ
  Right    Right

Control/Data Signals
  Signal      User Programmable   Use
  SSF         No                  Inactive
  SIO0        No                  Passes LSB from ALU shifter to MSB of MQ shifter.
  SIO1        No
  SIO2        No
  SIO3        No
  Cn          Yes                 Should be programmed low

Status Signals
  ZERO = 1 if result = 0
  N    = 1 if MSB = 1
  OVR  = 0
  C    = 1 if carry-out

SMULT

Signed Multiply Terminate

7 0

FUNCTION
Performs the final iteration for computing an N-bit by N-bit signed product. An algorithm
for signed multiplication using this instruction is given in the "Other Arithmetic
Instructions" section.

DESCRIPTION
SMULT checks the present multiplier bit (the least significant bit of the MQ register)
to determine whether the multiplicand should be added with the present partial product.
The instruction evaluates:

  F ← R' + S + Cn    if the addition is required
  F ← S              if no addition is required

with the correct sign in the product.
A double precision right shift is performed. Bit 0 of the least significant byte of the
ALU shifter is passed to bit 7 of the most significant byte of the MQ shifter.
The S bus should be loaded with the contents of a register file holding the previous
iteration result; the R bus must be loaded with the multiplicand. After executing SMULT,
the Y bus contains the most significant half of the product, and MQ contains the least
significant half.
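The iterate/terminate split can be illustrated with the conventional two's complement shift-and-add scheme: N-1 add-and-shift steps, then a final subtract-and-shift step because the multiplier's sign bit carries negative weight. This is a sketch under that assumption, not the algorithm from the "Other Arithmetic Instructions" section; the function names are hypothetical, the partial product is held in an unbounded Python integer so its true sign survives each shift, and the subtraction in the last step is modeled as an exact two's complement subtract.

```python
def to_signed(x, n):
    """Interpret the low n bits of x as a two's complement number."""
    x &= (1 << n) - 1
    return x - (1 << n) if x >> (n - 1) else x

def signed_multiply(multiplicand, multiplier, n=32):
    """Hypothetical model: n-1 iterate steps plus one terminate step."""
    mask = (1 << n) - 1
    r = to_signed(multiplicand, n)
    acc = 0                    # accumulator cleared before the first iteration
    mq = multiplier & mask     # MQ register loaded with the multiplier
    for step in range(n):
        last = step == n - 1   # the final step plays the role of SMULT
        if mq & 1:             # present multiplier bit = LSB of MQ
            acc = acc - r if last else acc + r
        # Double precision right shift: LSB of acc moves into MSB of MQ
        mq = (mq >> 1) | ((acc & 1) << (n - 1))
        acc >>= 1              # arithmetic shift keeps the true sign
    return ((acc & mask) << n) | mq   # MSH in acc, LSH in MQ

product = signed_multiply(0xFFFFFFFD, 7)   # (-3) * 7
print(to_signed(product, 64))              # -21
```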
Available R Bus Source Operands
  RF (A5-A0)   A3-A0 Immed   DA-Port   C3-C0::A3-A0 Mask
  Yes          No            Yes       No

Recommended S Bus Source Operands
  RF (B5-B0)   DB-Port   MQ Register
  Yes          Yes       No

Available Destination Operands
  RF (C5-C0)   RF (B5-B0)   Y-Port
  Yes          No           No

Shift Operations
  ALU      MQ
  Right    Right

Control/Data Signals
  Signal      User Programmable   Use
  SSF         No                  Inactive
  SIO0        No                  Passes LSB from ALU shifter to MSB of MQ shifter.
  SIO1        No
  SIO2        No
  SIO3        No
  Cn          Yes                 Should be programmed low

Status Signals
  ZERO = 1 if result = 0
  N    = 1 if MSB = 1
  OVR  = 0
  C    = 1 if carry-out

SNORM

Single-Length Normalize

2 0

FUNCTION
Tests the two most significant bits of the MQ register. If they are the same, shifts
the number to the left.

DESCRIPTION
This instruction is used to normalize a two's complement number in the MQ register
by shifting the number one bit position to the left and filling a zero into the LSB (unless
the SIO input for that word is low). Data on the S bus is added to the carry, permitting
the number of shifts performed to be counted and stored in one of the register files.
The shift and the S bus increment are inhibited whenever normalization is attempted
on a number already normalized. Normalization is complete when overflow occurs.
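A normalization loop built on this instruction might look like the sketch below. The names `snorm_step` and `normalize` are hypothetical, a 32-bit word is assumed, the shift count is kept in a plain variable rather than a register file, and the "already normalized" condition (two MSBs differ) is returned as a flag instead of being read from the OVR pin. The step limit is an added safeguard because a zero operand never normalizes.

```python
MASK32 = 0xFFFFFFFF  # 32-bit configuration assumed

def snorm_step(mq, count, sio0=1):
    """Hypothetical model of one SNORM; returns (mq, count, done)."""
    top_two = (mq >> 30) & 0b11
    if top_two in (0b01, 0b10):        # two MSBs differ: already normalized
        return mq, count, True         # shift and increment are inhibited
    fill = 0 if sio0 else 1            # zero fill unless SIO0 is low
    return ((mq << 1) & MASK32) | fill, count + 1, False

def normalize(mq, max_steps=32):
    """Repeat SNORM until the number is normalized or max_steps is reached."""
    count = 0
    for _ in range(max_steps):
        mq, count, done = snorm_step(mq, count)
        if done:
            break
    return mq, count

mq, shifts = normalize(0x00000001)
print(f"{mq:08X} after {shifts} shifts")  # 40000000 after 30 shifts
```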

Perform the computation A = (A + B)/2, where A and B are single-precision numbers.
Let A reside in register 1 and B be input via the DB bus.

  Field                  Value
  Instr Code I7-I0       0000 0001
  Oprd Addr A5-A0        00 0001
  Oprd Addr B5-B0        XX XXXX
  Oprd Sel EA            0
  Oprd Sel EB1-EB0       10
  Dest Addr C5-C0        00 0001
  SELMQ                  0
  WE3-WE0                0000
  SELRF1-SELRF0          10
  OEA                    X
  OEB                    X
  OEY3-OEY0              XXXX
  OES                    0
  Cn                     0
  CF2-CF0                110
  SSF                    1

Assume register file 1 holds 6A08C618 (Hex) and DB bus holds 51007530 (Hex).

  Source                0110 1010 0000 1000 1100 0110 0001 1000   R ← RF(1)
  Source                0101 0001 0000 0000 0111 0101 0011 0000   S ← DB bus
  Intermediate Result†  1011 1011 0000 1001 0011 1011 0100 1000   ALU Shifter ← R + S + Cn
  Destination           0101 1101 1000 0100 1001 1101 1010 0100   RF(1) ← ALU shift result

† After the intermediate operation (ADD), overflow has occurred and OVR status signal is set high. When the
arithmetic right shift is executed, the sign bit is corrected (see Table 16 for shift definition notes).
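The sign correction mentioned in the note can be reproduced by shifting in the true sign of the sum, taken here to be N XOR OVR. That reading is an assumption inferred from the example data, not a quotation of Table 16; the function name `add_then_sra` is likewise illustrative.

```python
MASK32 = 0xFFFFFFFF  # 32-bit configuration assumed

def add_then_sra(r, s, cn=0):
    """Hypothetical model: F = R + S + Cn, then arithmetic right shift.

    Returns (result, ovr), where ovr is the overflow of the ADD step.
    """
    r &= MASK32
    s &= MASK32
    f = (r + s + cn) & MASK32
    n = f >> 31
    # Signed overflow: operands share a sign that differs from the sum's sign
    ovr = (((r ^ f) & (s ^ f)) >> 31) & 1
    sign = n ^ ovr                      # true sign of the sum (assumed rule)
    return (sign << 31) | (f >> 1), ovr

result, ovr = add_then_sra(0x6A08C618, 0x51007530)
print(f"{result:08X} OVR={ovr}")  # 5D849DA4 OVR=1
```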


SRAD

Arithmetic Right Double Precision Shift

1 *

FUNCTION
Performs arithmetic right shift on MQ register (LSH) and result of ALU operation (MSH)
specified in lower nibble of instruction field.

DESCRIPTION
The result of the ALU operation specified in instruction bits I3-I0 is used as the upper
half of a double precision word, the contents of the MQ register as the lower half.
The contents of the ALU are shifted one bit to the right. The sign bit of the most
significant byte is retained unless the sign bit is inverted as a result of overflow. Bit 0
of the least significant byte in the ALU shifter is passed to bit 7 of the most significant
byte of the MQ register. Bit 0 of the MQ register's least significant byte is dropped.
The shift may be made conditional on SSF. If SSF is high or floating, the shift result
will be sent to the Y MUX. If SSF is low, the ALU result will be passed unshifted to
the Y MUX.
* A list of ALU operations that can be used with this instruction is given in Table 15.
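The double-precision form can be sketched as a function of the ALU result F, the MQ register, and the overflow status of the ALU step. The name `srad` and the explicit `ovr` argument are illustrative; treating "sign bit inverted as a result of overflow" as an XOR with OVR is an assumption.

```python
MASK32 = 0xFFFFFFFF  # 32-bit configuration assumed

def srad(f, mq, ovr=0):
    """Hypothetical model: arithmetic right shift of the 64-bit pair (F, MQ)."""
    f &= MASK32
    mq &= MASK32
    sign = (f >> 31) ^ ovr                 # retained sign, inverted on overflow
    new_mq = ((f & 1) << 31) | (mq >> 1)   # LSB of F enters MSB of MQ; MQ LSB dropped
    new_f = (sign << 31) | (f >> 1)
    return new_f, new_mq

# F = 9B093B48 is an ADD result that overflowed; MQ = 17299A0F
hi, lo = srad(0x9B093B48, 0x17299A0F, ovr=1)
print(f"{hi:08X} {lo:08X}")  # 4D849DA4 0B94CD07
```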
Shift Operations
  ALU Shifter         MQ Shifter
  Arithmetic Right    Arithmetic Right

Available Destination Operands (ALU Shifter)
  RF (C5-C0)   RF (B5-B0)   Y-Port
  Yes          No           Yes

Control/Data Signals
  Signal      User Programmable   Use
  SSF         Yes                 Passes shifted output if high; passes ALU result if low.
  SIO0        No                  LSB of ALU shifter is passed to MSB of MQ shifter,
                                  and LSB of MQ shifter is dropped.
  SIO1        No
  SIO2        No
  SIO3        No
  Cn          No                  Affects arithmetic operation specified in bits I3-I0 of instruction field.

Status Signals†
  ZERO = 1 if result = 0
  N    = 1 if MSB of result = 1
       = 0 if MSB of result = 0
  OVR  = 0
  C    = 1 if carry-out condition

† C is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated
after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

EXAMPLE (assumes a 32-bit configuration)
Perform the computation A = (A + B)/2, where A and B are two's complement numbers.
Let A be a double precision number residing in register 1 (MSH) and MQ (LSH). Let
B be a single precision number which is input through the DB bus.

  Field                  Value
  Instr Code I7-I0       0001 0001
  Oprd Addr A5-A0        00 0001
  Oprd Addr B5-B0        XX XXXX
  Oprd Sel EA            0
  Oprd Sel EB1-EB0       10
  Dest Addr C5-C0        00 0001
  SELMQ                  0
  WE3-WE0                0000
  SELRF1-SELRF0          10
  OEA                    X
  OEB                    X
  OEY3-OEY0              XXXX
  OES                    0
  Cn                     0
  CF2-CF0                110
  SSF                    1

Assume register file 1 holds 4A08C618 (Hex), DB bus holds 51007530 (Hex),
and MQ register holds 17299A0F (Hex).

MSH
  Source                0100 1010 0000 1000 1100 0110 0001 1000   R ← RF(1)
  Source                0101 0001 0000 0000 0111 0101 0011 0000   S ← DB bus
  Intermediate Result‡  1001 1011 0000 1001 0011 1011 0100 1000   ALU Shifter ← R + S + Cn
  Destination           0100 1101 1000 0100 1001 1101 1010 0100   RF(1) ← ALU shift result

LSH
  Source                0001 0111 0010 1001 1001 1010 0000 1111   MQ shifter ← MQ register
  Destination           0000 1011 1001 0100 1100 1101 0000 0111   MQ register ← MQ shift result

‡ After the intermediate operation (ADD), overflow has occurred and OVR status signal is set high. When the
arithmetic right shift is executed, the sign bit is corrected (see Table 16 for shift definition notes).

SRC

Circular Right Single Precision Shift

8 *

FUNCTION
Performs circular right shift on result of ALU operation specified in lower nibble of
instruction field.

DESCRIPTION
The result of the ALU operation specified in instruction bits I3-I0 is shifted one bit
to the right. Bit 0 of the least significant byte is passed to bit 7 of the most significant
byte in the same word, which may be 1, 2, or 4 bytes long depending on the selected
configuration.
The shift may be made conditional on SSF. If SSF is high or floating, the shift result
will be sent to the Y MUX. If SSF is low, the ALU result will be passed unshifted to
the Y MUX.
* A list of ALU operations that can be used with this instruction is given in Table 15.

Shift Operations
  ALU Shifter       MQ Shifter
  Circular Right    None

Available Destination Operands (ALU Shifter)
  RF (C5-C0)   RF (B5-B0)   Y-Port
  Yes          No           Yes

Control/Data Signals
  Signal      User Programmable   Use
  SSF         Yes                 Passes shift result if high; passes ALU result if low.
  SIO0-SIO3   No                  Rotates LSB to MSB of the same word, which may
                                  be 1, 2, or 4 bytes long depending on configuration.
  Cn          No                  Affects arithmetic operation specified in bits I3-I0 of instruction field.

Status Signals†
  ZERO = 1 if result = 0
  N    = 1 if MSB of result = 1
       = 0 if MSB of result = 0
  OVR  = 1 if signed arithmetic overflow
  C    = 1 if carry-out condition

† C is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated
after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

EXAMPLE (assumes a 32-bit configuration)
Perform a circular right shift of register 6 and store the result in register 1.

  Field                  Value
  Instr Code I7-I0       1000 0110
  Oprd Addr A5-A0        00 0110
  Oprd Addr B5-B0        XX XXXX
  Oprd Sel EA            0
  Oprd Sel EB1-EB0       XX
  Dest Addr C5-C0        00 0001
  SELMQ                  0
  WE3-WE0                0000
  SELRF1-SELRF0          10
  OEA                    X
  OEB                    X
  OEY3-OEY0              XXXX
  OES                    0
  Cn                     0
  CF2-CF0                110
  SSF                    1

Assume register file 6 holds 3788C618 (Hex).

  Source               0011 0111 1000 1000 1100 0110 0001 1000   R ← RF(6)
  Intermediate Result  0011 0111 1000 1000 1100 0110 0001 1000   ALU Shifter ← R + Cn
  Destination          0001 1011 1100 0100 0110 0011 0000 1100   RF(1) ← ALU shift result

SRCD

Circular Right Double Precision Shift

9 *

FUNCTION
Performs circular right shift on MQ register (LSH) and result of ALU operation (MSH)
specified in lower nibble of instruction field.

DESCRIPTION
The result of the ALU operation specified in instruction bits I3-I0 is used as the upper
half of a double precision word, the contents of the MQ register as the lower half.
The contents of the ALU and MQ shifters are rotated one bit to the right. Bit 0 of the
least significant byte in the ALU shifter is passed to bit 7 of the most significant byte
of the MQ shifter. Bit 0 of the least significant byte in the MQ shifter is passed to
bit 7 of the most significant byte of the ALU shifter.
The shift may be made conditional on SSF. If SSF is high or floating, the shift result
will be sent to the Y MUX and MQ register. If SSF is low, the Y MUX and MQ register
will not be altered.


* A list of ALU operations that can be used with this instruction is given in Table 15.
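The cross-coupling of the two LSBs can be sketched as below. The name `srcd` is hypothetical and a 32-bit configuration is assumed; the operands in the usage line are the source bit patterns printed in the binary rows of this section's example (0011 0111 0000 1000 ... for the MSH, 50A99A0F for the LSH).

```python
MASK32 = 0xFFFFFFFF  # 32-bit configuration assumed

def srcd(f, mq):
    """Hypothetical model: rotate the 64-bit pair (F, MQ) one bit to the right."""
    f &= MASK32
    mq &= MASK32
    new_f = ((mq & 1) << 31) | (f >> 1)    # LSB of MQ enters MSB of ALU shifter
    new_mq = ((f & 1) << 31) | (mq >> 1)   # LSB of ALU shifter enters MSB of MQ
    return new_f, new_mq

hi, lo = srcd(0x3708C618, 0x50A99A0F)
print(f"{hi:08X} {lo:08X}")  # 9B84630C 2854CD07
```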

Shift Operations
  ALU Shifter       MQ Shifter
  Circular Right    Circular Right

Available Destination Operands (ALU Shifter)
  RF (C5-C0)   RF (B5-B0)   Y-Port
  Yes          No           Yes

Control/Data Signals
  Signal      User Programmable   Use
  SSF         Yes                 Passes shift result if high; passes ALU result and
                                  retains MQ register if low.
  SIO0-SIO3   No                  Rotates LSB of ALU shifter to MSB of MQ shifter,
                                  and LSB of MQ shifter to MSB of ALU shifter.
  Cn          No                  Affects arithmetic operation specified in bits I3-I0 of instruction field.

Status Signals†
  ZERO = 1 if result = 0
  N    = 1 if MSB of result = 1
       = 0 if MSB of result = 0
  OVR  = 1 if signed arithmetic overflow
  C    = 1 if carry-out condition

† C is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated
after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

EXAMPLE (assumes a 32-bit configuration)
Perform a circular right double precision shift of the data in register 6 (MSH) and MQ
(LSH), and store the result back in register 6 and the MQ register.

  Field                  Value
  Instr Code I7-I0       1001 0110
  Oprd Addr A5-A0        00 0110
  Oprd Addr B5-B0        XX XXXX
  Oprd Sel EA            0
  Oprd Sel EB1-EB0       XX
  Dest Addr C5-C0        00 0110
  SELMQ                  0
  WE3-WE0                0000
  SELRF1-SELRF0          10
  OEA                    X
  OEB                    X
  OEY3-OEY0              XXXX
  OES                    0
  Cn                     0
  CF2-CF0                110

Assume register file 6 holds 3708C618 (Hex) and MQ register holds 50A99A0F (Hex).

MSH
  Source               0011 0111 0000 1000 1100 0110 0001 1000   R ← RF(6)
  Intermediate Result  0011 0111 0000 1000 1100 0110 0001 1000   ALU shifter ← R + Cn
  Destination          1001 1011 1000 0100 0110 0011 0000 1100   RF(6) ← ALU shift result

LSH
  Source               0101 0000 1010 1001 1001 1010 0000 1111   MQ shifter ← MQ register
  Destination          0010 1000 0101 0100 1100 1101 0000 0111   MQ register ← MQ shift result

SRL

Logical Right Single Precision Shift

FUNCTION
Performs logical right shift on result of ALU operation specified in lower nibble of
instruction field.

DESCRIPTION
The result of the ALU operation specified in instruction bits I3-I0 is shifted one bit
to the right. A zero is placed in bit 7 of the most significant byte of each word
unless the SIO input for the word is programmed low; this will force the sign bit to
one. The LSB is dropped from the word, which may be 1, 2, or 4 bytes long depending
on selected configuration.
The shift may be made conditional on SSF. If SSF is high or floating, the shift result
will be sent to the Y MUX. If SSF is low, the ALU result will be passed unshifted to
the Y MUX.
* A list of ALU operations that can be used with this instruction is given in Table 15.

Shift Operations

  R ← DA bus
  ALU Shifter ← R + Cn
  RF(1) ← ALU shift result

SRLD

Logical Right Double Precision Shift

FUNCTION
Performs logical right shift on MQ register (LSH) and result of ALU operation (MSH)
specified in lower nibble of instruction field.

DESCRIPTION
The result of the ALU operation specified in instruction bits I3-I0 is used as the upper
half of a double precision word, the contents of the MQ register as the lower half.
The ALU result is shifted one bit to the right. A zero is placed in the sign bit of the
most significant byte unless the SIO input for that word is programmed low; this will
force the sign bit to one. Bit 0 of the least significant byte is passed to bit 7 of the
most significant byte of the MQ shifter. Bit 0 of the least significant byte of the MQ
shifter is dropped.
The shift may be made conditional on SSF. If SSF is high or floating, the shift result
will be sent to the Y MUX and MQ register. If SSF is low, the ALU result and MQ
register will not be altered.
* A list of ALU operations that can be used with this instruction is given in Table 15.
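The fill behavior at the MSB is the only difference from the circular form, as the following sketch shows. The function name `srld` is illustrative, a 32-bit word is assumed, and `sio0` stands for the SIO input of the word (low fills a one).

```python
MASK32 = 0xFFFFFFFF  # 32-bit configuration assumed

def srld(f, mq, sio0=1):
    """Hypothetical model: logical right shift of the 64-bit pair (F, MQ)."""
    f &= MASK32
    mq &= MASK32
    fill = 0 if sio0 else 1                # SIO low forces the sign bit to one
    new_mq = ((f & 1) << 31) | (mq >> 1)   # LSB of F enters MSB of MQ; MQ LSB dropped
    new_f = (fill << 31) | (f >> 1)
    return new_f, new_mq

hi, lo = srld(0x2DA8C615, 0x50A99A0E, sio0=0)
print(f"{hi:08X} {lo:08X}")  # 96D4630A A854CD07
```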
Shift Operations
  ALU Shifter      MQ Shifter
  Logical Right    Logical Right

Available Destination Operands (ALU Shifter)
  RF (C5-C0)   RF (B5-B0)   Y-Port
  Yes          No           Yes

Control/Data Signals
  Signal      User Programmable   Use
  SSF         Yes                 Passes shift result if high; passes ALU result and
                                  retains MQ if low.
  SIO0-SIO3   Yes                 Fills a zero in MSB if high or floating;
                                  fills a one in MSB if low.
  Cn          No                  Affects arithmetic operation specified in bits I3-I0 of instruction field.

Status Signals†
  ZERO = 1 if result = 0
  N    = 1 if MSB of result = 1
       = 0 if MSB of result = 0
  OVR  = 1 if signed arithmetic overflow
  C    = 1 if carry-out condition

† C is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated
after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

EXAMPLE (assumes a 32-bit configuration)
Perform a logical right double precision shift of the data in register 1 (MSH) and MQ
(LSH), filling a one into the most significant bit, and store the result back in register 1
and the MQ register.

  Field                  Value
  Instr Code I7-I0       0011 0110
  Oprd Addr A5-A0        XX XXXX
  Oprd Addr B5-B0        00 0001
  Oprd Sel EA            X
  Oprd Sel EB1-EB0       00
  Dest Addr C5-C0        00 0001
  SELMQ                  0
  WE3-WE0                0000
  SELRF1-SELRF0          10
  OEA                    X
  OEB                    X
  OEY3-OEY0              XXXX
  OES                    0
  Cn                     0
  CF2-CF0                110
  SIO3-SIO0              1110
  IESIO3-IESIO0          0000

Assume register file 1 holds 2DA8C615 (Hex) and MQ register holds 50A99A0E (Hex).

MSH
  Source               0010 1101 1010 1000 1100 0110 0001 0101   R ← RF(1)
  Intermediate Result  0010 1101 1010 1000 1100 0110 0001 0101   ALU Shifter ← S + Cn
  Destination          1001 0110 1101 0100 0110 0011 0000 1010   RF(1) ← ALU shift result

LSH
  Source               0101 0000 1010 1001 1001 1010 0000 1110   MQ shifter ← MQ register
  Destination          1010 1000 0101 0100 1100 1101 0000 0111   MQ register ← MQ shift result

SUBI

Subtract Immediate

7 8

FUNCTION
Subtracts four-bit immediate data on A3-AO with carry from S-bus data.

DESCRIPTION
Immediate data in the range 0 to 15, supplied by the user at A3-AO, is inverted and
added with carry to S.
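Inverting the immediate and adding it with carry gives a true two's complement subtraction only when Cn is high; with Cn low the result is one less. A minimal sketch, assuming a 32-bit word and a hypothetical function name `subi`:

```python
MASK32 = 0xFFFFFFFF  # 32-bit configuration assumed

def subi(s, imm4, cn=1):
    """Hypothetical model: F = R' + S + Cn with R = zero-extended A3-A0."""
    r = imm4 & 0xF                       # immediate data in the range 0 to 15
    return ((~r & MASK32) + (s & MASK32) + cn) & MASK32

print(f"{subi(0x24000100, 0xC):08X}")        # 240000F4  (S - 12)
print(f"{subi(0x24000100, 0xC, cn=0):08X}")  # 240000F3  (S - 12 - 1)
```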
Available R Bus Source Operands (Constant)
  RF (A5-A0)   A3-A0 Immed   DA-Port   C3-C0::A3-A0 Mask
  No           Yes           No        No

Available S Bus Source Operands
  RF (B5-B0)   DB-Port   MQ Register
  Yes          Yes       Yes

Available Destination Operands
  RF (C5-C0)   RF (B5-B0)   Y-Port
  Yes          No           Yes

Shift Operations
  ALU     MQ
  None    None

Control/Data Signals
  Signal      User Programmable   Use
  SSF         No                  Inactive
  SIO0        No                  Inactive
  SIO1        No                  Inactive
  SIO2        No                  Inactive
  SIO3        No                  Inactive
  Cn          Yes                 Two's complement subtraction if programmed high.

Status Signals
  ZERO = 1 if result = 0
  N    = 1 if MSB = 1
  OVR  = 1 if signed arithmetic overflow
  C    = 1 if carry-out

EXAMPLE (assumes a 32-bit configuration)
Subtract the value 12 from data on the DB bus, and store the result into register file 1.

  Field                  Value
  Instr Code I7-I0       0111 1000
  Oprd Addr A5-A0        00 1100
  Oprd Addr B5-B0        XX XXXX
  Oprd Sel EA            X
  Oprd Sel EB1-EB0       10
  Dest Addr C5-C0        00 0001
  SELMQ                  0
  WE3-WE0                0000
  SELRF1-SELRF0          10
  OEA                    X
  OEB                    X
  OEY3-OEY0              XXXX
  OES                    0
  Cn                     1
  CF2-CF0                110

Assume bits A3-A0 hold C (Hex) and DB bus holds 24000100 (Hex).

  Source       0000 0000 0000 0000 0000 0000 0000 1100   R ← A3-A0
  Source       0010 0100 0000 0000 0000 0001 0000 0000   S ← DB bus
  Destination  0010 0100 0000 0000 0000 0000 1111 0100   RF(1) ← R' + S + Cn

SUBR

Subtract R with Carry (R' + S + Cn)

FUNCTION
Subtracts data on the R bus from S with carry.

DESCRIPTION
Data on the R bus is subtracted with carry from data on the S bus. The result appears
at the ALU and MQ shifters.
* The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the
upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions
are listed in Table 15.
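The status outputs of the subtraction follow from the adder form R' + S + Cn. The sketch below is a behavioral illustration only: `subr` and its return tuple `(f, c, n, ovr, zero)` are hypothetical, a 32-bit word is assumed, and no shift is applied. The operands in the usage line are the bit patterns 0001 0101 0000 0000 1000 0100 1101 0000 (150084D0 Hex) and 4900C350 (Hex).

```python
MASK32 = 0xFFFFFFFF  # 32-bit configuration assumed

def subr(r, s, cn=1):
    """Hypothetical model: F = R' + S + Cn; returns (f, c, n, ovr, zero)."""
    s &= MASK32
    not_r = ~r & MASK32                  # one's complement of R
    total = not_r + s + cn
    f = total & MASK32
    c = total >> 32                      # carry-out of the 32-bit adder
    n = f >> 31                          # MSB of the result
    # Overflow: both adder inputs share a sign that differs from the result's
    ovr = (((not_r ^ f) & (s ^ f)) >> 31) & 1
    zero = int(f == 0)
    return f, c, n, ovr, zero

f, c, n, ovr, zero = subr(0x150084D0, 0x4900C350)
print(f"{f:08X} C={c} N={n} OVR={ovr} ZERO={zero}")  # 34003E80 C=1 N=0 OVR=0 ZERO=0
```

Swapping the operands gives the R + S' + Cn form used by SUBS.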

Available R Bus Source Operands
  RF (A5-A0)   A3-A0 Immed   DA-Port   C3-C0::A3-A0 Mask
  Yes          No            Yes       No

Available S Bus Source Operands
  RF (B5-B0)   DB-Port   MQ Register
  Yes          Yes       Yes

Available Destination Operands
  RF (C5-C0)   RF (B5-B0)   Y-Port   ALU Shifter   MQ Shifter
  Yes          No           Yes      Yes           Yes

Control/Data Signals
  Signal      User Programmable   Use
  SSF         No                  Affect shift instructions programmed in bits I7-I4 of
  SIO0        No                  instruction field.
  SIO1        No
  SIO2        No
  SIO3        No
  Cn          Yes                 Two's complement subtraction if programmed high.

Status Signals†
  ZERO = 1 if result = 0
  N    = 1 if MSB = 1
  OVR  = 1 if signed arithmetic overflow
  C    = 1 if carry-out

† C is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated after shift
operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

EXAMPLE (assumes a 32-bit configuration)
Subtract data in register 1 from data on the DB bus, and store the result in the MQ
register.

  Field                  Value
  Instr Code I7-I0       1110 0010
  Oprd Addr A5-A0        00 0001
  Oprd Addr B5-B0        XX XXXX
  Oprd Sel EA            0
  Oprd Sel EB1-EB0       10
  Dest Addr C5-C0        XX XXXX
  SELMQ                  1
  WE3-WE0                XXXX
  SELRF1-SELRF0          XX
  OEA                    X
  OEB                    X
  OEY3-OEY0              XXXX
  OES                    0
  Cn                     1
  CF2-CF0                110

Assume register file 1 holds 150084D0 (Hex) and DB bus holds 4900C350 (Hex).

  Source       0001 0101 0000 0000 1000 0100 1101 0000   R ← RF(1)
  Source       0100 1001 0000 0000 1100 0011 0101 0000   S ← DB bus
  Destination  0011 0100 0000 0000 0011 1110 1000 0000   MQ register ← R' + S + Cn

SUBS

Subtract S with Carry (R + S' + Cn)

FUNCTION
Subtracts data on the S bus from R with carry.

DESCRIPTION
Data on the S bus is subtracted with carry from data on the R bus. The result appears
at the ALU and MQ shifters.
* The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the
upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions
are listed in Table 15.

Available R Bus Source Operands
  RF (A5-A0)   A3-A0 Immed   DA-Port   C3-C0::A3-A0 Mask
  Yes          No            Yes       No

Available S Bus Source Operands
  RF (B5-B0)   DB-Port   MQ Register
  Yes          Yes       Yes

Available Destination Operands
  RF (C5-C0)   RF (B5-B0)   Y-Port   ALU Shifter   MQ Shifter
  Yes          No           Yes      Yes           Yes

Control/Data Signals
  Signal      User Programmable   Use
  SSF         No                  Affect shift instructions programmed in bits I7-I4 of
  SIO0        No                  instruction field.
  SIO1        No
  SIO2        No
  SIO3        No
  Cn          Yes                 Two's complement subtraction if programmed high.

Status Signals†
  ZERO = 1 if result = 0
  N    = 1 if MSB = 1
  OVR  = 1 if signed arithmetic overflow
  C    = 1 if carry-out

† C is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated
after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

EXAMPLE (assumes a 32-bit configuration)
Subtract data on the DB bus from data in register 1, and store the result in the MQ
register.

Instr Code I7-I0        11100011
Oprd Addr A5-A0         000001
Oprd Addr B5-B0         XX XXXX
Oprd Sel EA             0
Oprd Sel EB1-EB0        10
Dest Addr C5-C0         XX XXXX
Destination Selects
  SELMQ                 1
  WE3-WE0               XXXX
  SELRF1-SELRF0         XX
  OEA                   X
  OEB                   X
  OEY3-OEY0             XXXX
  OES                   0
Cn                      1
CF2-CF0                 110

Assume register file 1 holds 150084D0 (Hex) and DB bus holds 4900C350 (Hex).

Source        0001 0101 0000 0000 1000 0100 1101 0000    R ← RF(1)
Source        0100 1001 0000 0000 1100 0011 0101 0000    S ← DB bus
Destination   1100 1011 1111 1111 1100 0001 1000 0000    MQ register ← R + S' + Cn
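The arithmetic of this example can be checked with a short sketch. It is only an illustration of F = R + S' + Cn on a 32-bit word, using the operand values given by the binary rows of the example (150084D0 and 4900C350 hex). The function name `subs` and the flag dictionary are hypothetical, and the sketch ignores the shifter, so ZERO and N are computed on the unshifted result.

```python
def subs(r, s, cn=1, width=32):
    """Model F = R + S' + Cn on a width-bit ALU and derive the status flags."""
    mask = (1 << width) - 1
    total = (r & mask) + (~s & mask) + cn      # S' is the one's complement of S
    f = total & mask
    sign = 1 << (width - 1)
    r_neg, s_neg, f_neg = bool(r & sign), bool(s & sign), bool(f & sign)
    return {
        "F": f,
        "C": total >> width,                   # ALU carry-out
        "N": int(f_neg),                       # MSB of the result
        "ZERO": int(f == 0),
        # signed overflow: operands differ in sign and the result sign differs from R
        "OVR": int(r_neg != s_neg and f_neg != r_neg),
    }


flags = subs(0x150084D0, 0x4900C350)
print(f"{flags['F']:08X}")
```

With Cn programmed high the operation is a two's complement subtraction R - S; the printed word can be compared against the Destination row of the example.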

TB0

Test Bit (Zero)

3 8

FUNCTION
Tests bits in selected bytes of S-bus data for zeros using mask in C3-C0::A3-A0.

DESCRIPTION
The S bus is the source word for this instruction. The source word is passed to the
ALU, where it is compared to an 8-bit mask, consisting of a concatenation of the C3-C0
and A3-A0 address ports (C3-C0::A3-A0). The mask is input via the R bus. The test
will pass if the selected byte has zeros at all bit locations specified by the ones of
the mask. Bytes are selected by programming the SIO inputs low. Test results are
indicated on the ZERO output, which goes to one if the test passes. Register write
is internally disabled during this instruction.
Available R Bus Source Operands

RF (A5-A0)    A3-A0 Immed    DA-Port    C3-C0::A3-A0 Mask
No            No             No         Yes

Available S Bus Source Operands

RF (B5-B0)    DB-Port    MQ Register
Yes           Yes        Yes

Control/Data Signals

Signal    User Programmable    Use
SSF       No                   Inactive
SIO0      Yes                  Byte Select
SIO1      Yes                  Byte Select
SIO2      Yes                  Byte Select
SIO3      Yes                  Byte Select
Cn        No                   Inactive

Status Signals

ZERO    1 if result (selected bytes) = Pass
N       0
OVR     0
C       0
EXAMPLE (assumes a 32-bit configuration)
Test bits 7, 6 and 5 of bytes 0 and 2 of data in register 3 for zeroes.

Instr Code I7-I0        0011 1000
Mask (LSH) A3-A0        0000
Oprd Addr B5-B0         000011
Oprd Sel EA             X
Oprd Sel EB1-EB0        00
Mask (MSH) C3-C0        1110
Destination Selects
  SELMQ                 X
  WE3-WE0               XXXX
  SELRF1-SELRF0         XX
  OEA                   X
  OEB                   X
  OEY3-OEY0             XXXX
  OES                   0
Cn                      X
CF2-CF0                 110
SIO3-SIO0               1010

Assume register file 3 holds 881CD003 (Hex).

Source    1110 0000 1110 0000 1110 0000 1110 0000    Rn ← Mask (C3-C0::A3-A0)
Source    1000 1000 0001 1100 1101 0000 0000 0011    Sn ← RF(3)n†
Output    ZERO ← 1

† n = nth byte

Test Bit (One)

TB1

2 8

FUNCTION
Tests bits in selected bytes of S-bus data for ones using mask in C3-C0::A3-A0.

DESCRIPTION
The S bus is the source word for this instruction. The source word is passed to the
ALU, where it is compared to an 8-bit mask, consisting of a concatenation of the C3-C0
and A3-A0 address ports (C3-C0::A3-A0). The mask is input via the R bus. The test
will pass if the selected byte has ones at all bit locations specified by the ones of the
mask. Bytes are selected by programming the SIO inputs low. Test results are indicated
on the ZERO output, which goes to one if the test passes. Register write is internally
disabled for this instruction.
Available R Bus Source Operands

RF (A5-A0)    A3-A0 Immed    DA-Port    C3-C0::A3-A0 Mask
No            No             No         Yes

Available S Bus Source Operands

RF (B5-B0)    DB-Port    MQ Register
Yes           Yes        Yes

Control/Data Signals

Signal    User Programmable    Use
SSF       No                   Inactive
SIO0      Yes                  Byte Select
SIO1      Yes                  Byte Select
SIO2      Yes                  Byte Select
SIO3      Yes                  Byte Select
Cn        No                   Inactive

Status Signals

ZERO    1 if result (selected bytes) = Pass
N       0
OVR     0
C       0
EXAMPLE (assumes a 32-bit configuration)
Test bits 7, 6 and 5 of bytes 1 and 2 of data in register 3 for ones.

Instr Code I7-I0        0010 1000
Mask (LSH) A3-A0        0000
Oprd Addr B5-B0         000011
Oprd Sel EA             X
Oprd Sel EB1-EB0        00
Mask (MSH) C3-C0        1110
Destination Selects
  SELMQ                 X
  WE3-WE0               XXXX
  SELRF1-SELRF0         XX
  OEA                   X
  OEB                   X
  OEY3-OEY0             XXXX
  OES                   0
Cn                      X
CF2-CF0                 110
SIO3-SIO0               1001

Assume register file 3 holds 881CD003 (Hex).

Mask      1110 0000 1110 0000 1110 0000 1110 0000    Rn ← Mask (C3-C0::A3-A0)
Source    1000 1000 0001 1100 1101 0000 0000 0011    Sn ← RF(3)n†
Output    ZERO ← 0

† n = nth byte

UDIVI

Unsigned Divide Iterate

IcI0 I

FUNCTION
Performs one of N-2 iterations of nonrestoring unsigned division by a test subtraction
of the N-bit divisor from the 2N-bit dividend. An algorithm using this instruction can
be found in the "Other Arithmetic Instructions" section.

DESCRIPTION
UDIVI performs a test subtraction of the divisor from the dividend to generate a quotient
bit. The test subtraction may pass or fail and is corrected in the subsequent instruction
if it fails. Similarly a failed test from the previous instruction is corrected during
evaluation of the current UDIVI instruction (see the "Other Arithmetic
Instructions" section for more details).
The R bus must be loaded with the divisor, the S bus with the most significant half
of the result of the previous instruction (UDIVI during iteration or UDIVIS at the
beginning of iteration). The least significant half of the previous result is in the MQ
register.

UDIVI checks the result of the previous pass/fail test and then evaluates:

F ← R + S           if the test is failed
F ← R' + S + Cn     if the test is passed

A double precision left shift is performed; bit 7 of the most significant byte of the
MQ shifter is transferred to bit 0 of the least significant byte of the ALU shifter. Bit 7
of the most significant byte of the ALU shifter is lost. The unfixed quotient bit is
circulated into the least significant bit of the MQ shifter.
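The add-or-subtract decision and the double precision left shift described above follow the usual nonrestoring division scheme. The sketch below is a generic textbook version of that scheme, not the exact UDIVIS/UDIVI microcode sequence of the device; the function name `nonrestoring_udiv` is hypothetical, and Python's unbounded signed integers stand in for the remainder and its sign.

```python
def nonrestoring_udiv(dividend, divisor, n=32):
    """Divide a 2n-bit unsigned dividend by an n-bit unsigned divisor.

    Returns (quotient, remainder). Assumes the quotient fits in n bits,
    that is (dividend >> n) < divisor.
    """
    mask = (1 << n) - 1
    rem = dividend >> n            # most significant half (partial remainder)
    mq = dividend & mask           # least significant half; quotient bits shift in here
    if divisor == 0 or rem >= divisor:
        raise ValueError("quotient does not fit in n bits")
    for _ in range(n):
        # double precision left shift: MSB of the low half moves into the remainder
        shifted = (rem << 1) | (mq >> (n - 1))
        mq = (mq << 1) & mask
        # previous test passed (rem >= 0): subtract; previous test failed: add back
        rem = shifted - divisor if rem >= 0 else shifted + divisor
        if rem >= 0:
            mq |= 1                # quotient bit is 1 when the test subtraction passes
    if rem < 0:
        rem += divisor             # final correction after a failed last test
    return mq, rem


print(nonrestoring_udiv(13, 3, n=4))
```

A failed test subtraction is never undone immediately; it is compensated by the addition in the next iteration, which is why only the final remainder needs an explicit correction.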
Available R Bus Source Operands

RF (A5-A0)    A3-A0 Immed    DA-Port    C3-C0::A3-A0 Mask
Yes           No             Yes        No

Recommended S Bus Source Operands

RF (B5-B0)    DB-Port    MQ Register
Yes           Yes        No

Recommended Destination Operands            Shift Operations

RF (C5-C0)    RF (B5-B0)    Y-Port          ALU     MQ
Yes           No            Yes             Left    Left

Control/Data Signals

Signal    User Programmable    Use
SSF       No                   Inactive
SIO0      No                   Passes internally generated end-fill bit.
SIO1      No
SIO2      No
SIO3      No
Cn        Yes                  Should be programmed high.

Status Signals

ZERO    1 if result = 0
N       0
OVR     0
C       1 if carry-out

UDIVIS

Unsigned Divide Start

IBI0 I

FUNCTION
Computes the first quotient bit of nonrestoring unsigned division. An
algorithm using this instruction is given in the "Other Arithmetic Instructions" section.

DESCRIPTION
UDIVIS computes the first quotient bit during nonrestoring unsigned division by
subtracting the divisor from the dividend. The resulting remainder due to subtraction
may be negative; the subsequent UDIVI instruction may have to restore the remainder
during the next operation.
The R bus must be loaded with the divisor and the S bus with the most significant
half of the remainder. The result on the Y bus should be loaded back into the register
file for use in the next instruction. The least significant half of the remainder is in the
MQ register.

UDIVIS computes:

F ← R' + S + Cn

A double precision left shift is performed; bit 7 of the most significant byte of the
MQ shifter is transferred to bit 0 of the least significant byte of the ALU shifter. Bit 7
of the most significant byte of the ALU shifter is lost. The unfixed quotient bit is
circulated into the least significant bit of the MQ shifter.
Available R Bus Source Operands

RF (A5-A0)    A3-A0 Immed    DA-Port    C3-C0::A3-A0 Mask
Yes           No             Yes        No

Recommended S Bus Source Operands

RF (B5-B0)    DB-Port    MQ Register
Yes           Yes        No

Recommended Destination Operands            Shift Operations

RF (C5-C0)    RF (B5-B0)    Y-Port          ALU     MQ
Yes           No            Yes             Left    Left


UDIVIT

ZERO    1 if intermediate result = 0
N       0
OVR     0
C       1 if carry-out

UMULI

Unsigned Multiply Iterate

o o

FUNCTION
Performs one of N unsigned multiplication iterations for computing an N-bit by N-bit
product. An algorithm for unsigned multiplication using this instruction is given in the
"Other Arithmetic Instructions" section.

DESCRIPTION
UMULI checks to determine whether the multiplicand should be added with the present
partial product. The instruction evaluates:
F ← R + S + Cn    if the addition is required
F ← S             if no addition is required

A double precision right shift is performed. Bit 0 of the least significant byte of the
ALU shifter is passed to bit 7 of the most significant byte of the MQ shifter; carry-out
is passed to the most significant bit of the ALU shifter.
The S bus should be loaded with the contents of an accumulator and the R bus with
the multiplicand. The Y bus result should be written back to the accumulator after
each iteration of UMULI. The accumulator should be cleared and the MQ register loaded
with the multiplier before the first iteration.
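The iteration described above is the classic shift-and-add multiply. The following sketch models N iterations under two assumptions that the text does not state explicitly: the add decision is taken from the least significant bit of the MQ register, and Cn is held at zero. The name `umul` and its variables are hypothetical.

```python
def umul(multiplicand, multiplier, n=32):
    """Model n shift-and-add iterations producing a 2n-bit unsigned product."""
    mask = (1 << n) - 1
    acc = 0                        # accumulator, cleared before the first iteration
    mq = multiplier & mask         # MQ register, loaded with the multiplier
    r = multiplicand & mask        # R bus operand
    for _ in range(n):
        f = acc + r if mq & 1 else acc         # F = R + S (Cn = 0) or F = S
        carry = f >> n                         # ALU carry-out
        f &= mask
        # double precision right shift: ALU bit 0 enters the MQ MSB,
        # carry-out enters the ALU MSB
        mq = (mq >> 1) | ((f & 1) << (n - 1))
        acc = (f >> 1) | (carry << (n - 1))
    return (acc << n) | mq         # accumulator holds the upper half, MQ the lower


print(hex(umul(0xFFFFFFFF, 0xFFFFFFFF)))
```

After the last iteration the multiplier bits have been shifted out of MQ entirely and replaced by the low half of the product.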
R Bus Source Operands

RF (A5-A0)    A3-A0 Immed    DA-Port    C3-C0::A3-A0 Mask
Yes           No             Yes        No

Recommended S Bus Source Operands

RF (B5-B0)    DB-Port    MQ Register
Yes           Yes        No

Recommended Destination Operands            Shift Operations

RF (C5-C0)    RF (B5-B0)    Y-Port          ALU      MQ
Yes           No            Yes             Right    Right

SN74ACT8836 32-Bit by 32-Bit
Multiplier/Accumulator
The SN74ACT8836 is a 32-bit integer multiplier/accumulator (MAC) that accepts
two 32-bit inputs and computes a 64-bit product. An on-board adder is provided
to add or subtract the product or the complement of the product from the
accumulator.
To speed up calculations, many modern systems off-load frequently performed
multiply/accumulate operations to a dedicated single-cycle MAC. In such an
arrangement, the 'ACT8836 MAC can accelerate 32-bit microprocessors,
building block processors, or custom CPUs. The 'ACT8836 is well-suited for
digital signal processing applications, including fast Fourier transforms, digital
filtering, power series expansion, and correlation.

SN74ACT8836
32-BIT BY 32-BIT MULTIPLIER/ACCUMULATOR
D3046, JANUARY 1988

• Performs Full 32-Bit by 32-Bit Multiply/Accumulate in Flow-Through Mode in 60 ns (Max)
• Can be Pipelined for 36 ns (Max) Operation
• Performs 64-Bit by 64-Bit Multiplication in Five Cycles
• Supports Division Using Newton-Raphson Approximation
• Signed, Unsigned, or Mixed-Mode Multiply Operations
• EPIC™ (Enhanced-Performance Implanted CMOS) 1-µm Process
• Multiplier, Multiplicand, and Product Can be Complemented
• Accumulator Bypass Option
• TTL I/O Voltage Compatibility
• Three Independent 32-Bit Buses for Multiplicand, Multiplier, and Product
• Parity Generation/Checking
• Master/Slave Fault Detection
• Single 5-V Power Supply
• Integer or Fractional Rounding
description
The 'ACT8836 is a 32-bit by 32-bit parallel multiplier/accumulator suitable for low-power, high-speed
operations in applications such as digital signal processing, array processing, and numeric data processing.
High speed is achieved through the use of a Booth and Wallace Tree architecture.
Data is input to the chip through two registered 32-bit DA and DB input ports and output through a registered
32-bit Y output port. These registers have independent clock enable signals and can be made transparent
for flowthrough operations.
The device can perform two's complement, unsigned, and mixed-data arithmetic. It can also operate as
a 64-bit by 64-bit multiplier. Five clock cycles are required to perform a 64-bit by 64-bit multiplication
and multiplex the 128-bit result. Division is supported using Newton-Raphson approximation.
A multiply/accumulate mode is provided to add or subtract the accumulator from the product or the
complement of the product. The accumulator is 67 bits wide to accommodate possible overflow. A warning
flag (ETPERR) indicates whether overflow has occurred.
A rounding feature in the 'ACT8836 allows the result to be truncated or rounded to the nearest 32 bits.
To ensure data integrity, byte parity checking is provided at the input ports, and a parity generator and
master/slave error detection comparator are provided at the output port.
The SN74ACT8836 is characterized for operation from 0°C to 70°C.

EPIC is a trademark of Texas Instruments Incorporated

ADVANCE INFORMATION documents contain information on new products in the sampling or
preproduction phase of development. Characteristic data and other specifications are subject
to change without notice.

TEXAS INSTRUMENTS
POST OFFICE BOX 655012 • DALLAS, TEXAS 75265

Copyright © 1988, Texas Instruments Incorporated

logic symbol
[Logic symbol for the 74ACT8836 32 × 32 multiplier/accumulator not reproduced. Pin numbers are listed in the GB package pin assignments.]

functional block diagram (positive logic)
[Block diagram not reproduced. Labeled blocks include MULTIPLIER/ADDER STAGE 1, PIPELINE REGISTER, and MULTIPLIER/ADDER STAGE 2.]

GB PIN-GRID-ARRAY PACKAGE
(TOP VIEW)
[Package outline not reproduced: pin grid with rows A through R and columns 1 through 15.]
GB PACKAGE PIN ASSIGNMENTS

NO.   NAME      NO.   NAME      NO.   NAME      NO.   NAME      NO.   NAME      NO.   NAME
A1    Y8        B12   YETP1     D14   PY0       H12   COMPL     M3    DB18      P5    DB25
A2    Y10       B13   YETP0     D15   ETPERR    H13   FT0       M7    PB1       P6    DB29
A3    Y11       B14   YETP2     E1    SELREG    H14   EA        M8    PA0       P7    DB31
A4    Y13       B15   PY3       E2    Y3        H15   CKEA      M10   DA6       P8    PERRA
A5    Y14       C1    Y0        E3    GND       J1    DB2       M13   DA16      P9    PA2
A6    Y16       C2    Y4        E13   GND       J2    DB3       M14   DA17      P10   DA2
A7    Y18       C3    EB        E14   PY2       J3    DB5       M15   DA25      P11   DA8
A8    Y19       C4    Y5        E15   RND1      J4    DB7       N1    DB10      P12   DA12
A9    Y21       C5    VCC       F1    SFT0      J12   DA26      N2    DB19      P13   DA14
A10   Y23       C6    GND       F2    Y1        J13   DA24      N3    DB20      P14   DA11
A11   Y25       C7    Y15       F3    GND       J14   DA30      N4    DB21      P15   DA21
A12   Y27       C8    GND       F13   GND       J15   DA31      N5    DB23      R1    DB14
A13   Y28       C9    Y22       F14   MSERR     K1    DB4       N6    DB27      R2    DB26
A14   Y30       C10   GND       F15   DASGN     K2    DB9       N7    VCC       R3    DB28
A15   PY1       C11   VCC       G1    SELD      K3    DB11      N8    GND       R4    DB30
B1    Y2        C12   CKEY      G2    SGNEXT    K13   DA22      N9    DA0       R5    PB0
B2    Y6        C13   OEY       G3    WELS      K14   DA28      N10   DA4       R6    PB2
B3    SELY      C14   ACC0      G4    SFT1      K15   DA29      N11   DA10      R7    PB3
B4    Y7        C15   PERRY     G12   RND0      L1    DB6       N12   DA13      R8    PERRB
B5    Y9        D1    WEMS      G13   DBSGN     L2    DB15      N13   DA15      R9    PA1
B6    Y12       D2    TP1       G14   CKEI      L3    DB13      N14   DA19      R10   PA3
B7    Y17       D3    TP0       G15   FT1       L13   DA18      N15   DA23      R11   DA1
B8    Y20       D7    GND       H1    CLK       L14   DA20      P1    DB12      R12   DA3
B9    Y26       D8    VCC       H2    CKEB      L15   DA27      P2    DB16      R13   DA5
B10   Y29       D9    Y24       H3    DB0       M1    DB8       P3    DB24      R14   DA7
B11   Y31       D13   ACC1      H4    DB1       M2    DB17      P4    DB22      R15   DA9

PIN
NAME     NO.    I/O    DESCRIPTION
ACC0     C14    I      Accumulate mode opcode (see Table 2)
ACC1     D13    I
CLK      H1     I      System clock
CKEA     H15    I      Clock enable for A register, active low
CKEB     H2     I      Clock enable for B register, active low
CKEI     G14    I      Clock enable for I register, active low
CKEY     C12    I      Clock enable for Y register, active low
COMPL    H12    I      Product complement control; high complements multiplier result, low passes
                       multiplier unaltered to accumulator.
DA0-DA31        I      DA port input data bits 0 through 31

         DA0   N9      DA8   P11     DA16  M13     DA24  J13
         DA1   R11     DA9   R15     DA17  M14     DA25  M15
         DA2   P10     DA10  N11     DA18  L13     DA26  J12
         DA3   R12     DA11  P14     DA19  N14     DA27  L15
         DA4   N10     DA12  P12     DA20  L14     DA28  K14
         DA5   R13     DA13  N12     DA21  P15     DA29  K15
         DA6   M10     DA14  P13     DA22  K13     DA30  J14
         DA7   R14     DA15  N13     DA23  N15     DA31  J15

DASGN    F15

PIN
NAME     NO.    I/O    DESCRIPTION
Y0-Y31          I/O    Y port data bus. Outputs data from Y register (OEY = L); inputs data to
                       master/slave comparator (OEY = H).

         Y0    C1      Y8    A1      Y16   A6      Y24   D9
         Y1    F2      Y9    B5      Y17   B7      Y25   A11
         Y2    B1      Y10   A2      Y18   A7      Y26   B9
         Y3    E2      Y11   A3      Y19   A8      Y27   A12
         Y4    C2      Y12   B6      Y20   B8      Y28   A13
         Y5    C4      Y13   A4      Y21   A9      Y29   B10
         Y6    B2      Y14   A5      Y22   C9      Y30   A14
         Y7    B4      Y15   C7      Y23   A10     Y31   B11

YETP0    B13    I/O    Data bus for extended precision product. Outputs three most significant bits
YETP1    B12           of the 67-bit multiplier core result; inputs external data to master/slave
YETP2    B14           comparator.

TABLE 1. INSTRUCTION INPUTS

Signal    High                                            Low
DASGN     Identifies DA input data as two's complement    Identifies DA input data as unsigned
DBSGN     Identifies DB input data as two's complement    Identifies DB input data as unsigned
RND0      Rounds integer result                           Leaves integer result unaltered
RND1      Rounds fractional result                        Leaves fractional result unaltered
COMPL     Complements the product from the multiplier     Passes the product from the multiplier to the
          before passing it to the accumulator            accumulator unaltered
ACC0      See Table 2                                     See Table 2
ACC1      See Table 2                                     See Table 2

TABLE 2. MULTIPLIER/ADDER CONTROL INPUTS

ACC1    ACC0    EA    EB    Operation
0       0       X     X     ±(R × S) + 0
0       1       X     X     ±(R × S) + ACC
1       0       X     X     ±(R × S) − ACC
1       1       0     0     ±1 × 1 + 0
1       1       0     1     ±1 × DB + 0
1       1       1     0     ±DA × 1 + 0
1       1       1     1     ±DA × DB + 0

ACC is the data stored in the accumulator

TABLE 3. SHIFTER CONTROL INPUTS

SFT1    SFT0    Shifter Operation
L       L       Pass data without shift
L       H       Shift one bit left; fill with zero
H       L       Swap upper and lower halves of temporary register
H       H       Shift 32 bits right; fill with sign bit

TABLE 4. FLOWTHROUGH CONTROL INPUTS
Control Inputs

FTl
l
l
H
H

FTO
l
H
l
H

Registers Bypassed

Pipeline

Y

I

B

A

Yes

Yes Yes Yes Yes

Yes

No

Yes

Yes

No

No

No

No

No

No

No

No

No

No

No

TABLE 5. TEST PIN CONTROL INPUTS

TP1    TP0    Operation
L      L      All outputs and I/Os forced low
L      H      All outputs and I/Os forced high
H      L      All outputs placed in a high impedance state
H      H      Normal operation (default state)

data flow
Two 32-bit input data ports, DA and DB, are provided for input of the multiplicand and multiplier to registers
A and B and the multiplier/adder. Input data can be clocked to the A and B registers before being passed
to the multiplier/adder if desired. Two multiplexers, R and S, in conjunction with a flowthrough decoder
select the multiplier operands from DA and DB inputs, A and B registers, or the temporary register. Data
is supplied to the temporary register from a shifter that operates on external DA/DB data or a previous
multiplier/adder result. The 67-bit multiplier/adder result can be output through the Y port or passed through
the shifter to the accumulator.
External DA and DB data is also available to the accumulator via the shifter. This 64-bit data can be extended
with zeros or the sign bit. The 64 least significant bits from the shifter may also be latched in the 64-bit
temporary register and input to the multiplier through the R and S multiplexers. A swap option allows the
most significant and least significant 32-bit halves of temporary register data to be swapped before being
made available to the R and S multiplexers. This allows either 32-bit half of the temporary register to be
used as a multiplier.

[Figure not reproduced.]
FIGURE 1. TEMPORARY REGISTER AND ACCUMULATOR
shifter
The shifter can be used to multiply by two for Newton-Raphson operations or perform a 32-bit shift for
double-precision multiplication. The shifter is controlled by two SFT inputs, as shown in Table 3.

Y register
Final or intermediate multiplier/adder results will be clocked into Y register when CKEY is low.
Results can be passed directly to the Y output multiplexer using flowthrough decoder signals to bypass
the register (see Table 4).

Y multiplexer and Y output multiplexer
The Y multiplexer allows the 64-bit result or the contents of the Y register to be switched to the Y bus,
depending upon the state of the flowthrough control outputs. The upper 32 bits are selected for output
when the Y output multiplexer control SELY is high; the lower 32 bits are selected for output when SELY
is low. Note that the Y output multiplexer can be switched at twice the clock rate so that the 64-bit result
can be output in one clock cycle.

flowthrough decoder
To enable the device to operate in pipelined or flowthrough modes, on-chip registers can be bypassed using
flowthrough control signals FT1 and FTO. Up to three levels of pipeline can be supported, as shown in
Table 4.

[Figure not reproduced.]
FIGURE 3. OUTPUT ERROR CONTROL

extended precision check
Three extended product outputs, YETP2-YETPO, are provided to recover three bits of precision during
overflow. An extended precision check error signal (ETPERR) goes high whenever overflow occurs. If sign
controls DASGN and DBSGN are both low, indicating an unsigned operation, the extended precision bits
66-64 are compared for equality. Under all other sign control conditions, bits 66-63 are compared for
equality.
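A literal reading of this paragraph can be sketched as follows. The sketch assumes that ETPERR is raised whenever the compared bits are not all equal, and that the 67-bit result is held as an unsigned Python integer with bit 66 as the most significant bit. The function name `etperr` is hypothetical.

```python
RESULT_WIDTH = 67                  # width of the multiplier/adder result


def etperr(result67, dasgn, dbsgn):
    """Return 1 when the compared extended-precision bits are not all equal."""
    unsigned = dasgn == 0 and dbsgn == 0
    low_bit = 64 if unsigned else 63           # compare bits 66-64 or bits 66-63
    width = RESULT_WIDTH - low_bit
    field = (result67 >> low_bit) & ((1 << width) - 1)
    all_ones = (1 << width) - 1
    return 0 if field in (0, all_ones) else 1


print(etperr(3 * 5, dasgn=0, dbsgn=0))
print(etperr(1 << 64, dasgn=0, dbsgn=0))
```

For signed and mixed operations the comparison includes bit 63, so a result is accepted only when the extended bits are a pure sign extension of the 64-bit product.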

master/slave comparator

A master/slave comparator is provided to compare data bytes from the Y output multiplexer with data
bytes on the external Y port when OEY is high. A comparison of the three extended precision bits of the
multiplier/adder result or Y register output with external data in the YETP2-YETP0 port is performed
simultaneously. If the data is not equal, a high signal is generated on the master/slave error output pin
(MSERR). A similar comparison is performed for parity using the PY3-PY0 inputs. This feature is useful
in fault-tolerant design where several devices vote to ensure hardware integrity.

test pins
Two pins, TP1-TPO, support system testing. These may be used, for example, to place all outputs in a
high-impedance state, isolating the chip from the rest of the system (see Table 5).

data formats
The 'ACT8836 performs single-precision and double-precision multiplication in two's complement, unsigned
magnitude, and mixed formats for both integer and fractional numbers.
Input formats for the multiplicand (R) and multiplier (S) are given below, followed by output formats for
the fully extended product. The fully extended product (PRDT) is 67 bits wide. It includes the extended
product (XTP) bits YETP2-YETP0, the most significant product (MSP) bits Y63-Y32, and the least significant
product (LSP) bits Y31-Y0.

This can be represented in notational form as follows:

PRDT = XTP :: MSP :: LSP

or

PRDT = YETP2-YETP0 :: Y63-Y0

Table 6 shows the output formats generated by two's complement, unsigned and mixed-mode
multiplications.
TABLE 6. GENERATED OUTPUT FORMATS

                      Two's Complement    Unsigned Magnitude
Two's Complement      Two's Complement    Two's Complement
Unsigned Magnitude    Two's Complement    Unsigned Magnitude

examples
Representative examples of single-precision multiplication, double-precision multiplication, and division using
Newton-Raphson binary division algorithm are given below.

single-precision multiplication
Microcode for the multiplication of two signed numbers is shown in Figure 1. In this example, the result
is rounded and the 32 most significant bits are output on the Y bus. A second instruction (SELY = 0)
would be required to output the least significant half if rounding were not used.
Unsigned and mixed mode single-precision multiplication are executed using the same code. (The sign
controls must be modified accordingly.) Following are the input and output formats for signed, unsigned,
and mixed mode operations.

Two's Complement Integer Inputs (Input Operand A, Input Operand B)

Bit       31              30      29      ...    2      1      0
Weight    -2^31 (Sign)    2^30    2^29    ...    2^2    2^1    2^0

Unsigned Integer Inputs (Input Operand A, Input Operand B)

Bit       31      30      29      ...    2      1      0
Weight    2^31    2^30    2^29    ...    2^2    2^1    2^0

Two's Complement Fractional Inputs (Input Operand A, Input Operand B)

Bit       31             30      29      ...    2        1        0
Weight    -2^0 (Sign)    2^-1    2^-2    ...    2^-29    2^-30    2^-31

[Integer output format diagram: only bits 34 through 29, weighted 2^34 through 2^29, are legible.]

Two's Complement Fractional Outputs

          Extended Product           Most Significant Product                Least Significant Product
          (YETP2-YETP0)              (Y63-Y32)                               (Y31-Y0)
Bit       66             65   64     63   62   61    ... 34    33    32      31    30    29    ... 2     1     0
Weight    -2^4 (Sign)    2^3  2^2    2^1  2^0  2^-1  ... 2^-28 2^-29 2^-30   2^-31 2^-32 2^-33 ... 2^-60 2^-61 2^-62

Unsigned Fractional Outputs

          Extended Product    Most Significant Product                Least Significant Product
          (YETP2-YETP0)       (Y63-Y32)                               (Y31-Y0)
Bit       66   65   64        63   62   61    ... 34    33    32      31    30    29    ... 2     1     0
Weight    2^2  2^1  2^0       2^-1 2^-2 2^-3  ... 2^-30 2^-31 2^-32   2^-33 2^-34 2^-35 ... 2^-62 2^-63 2^-64

double-precision multiplication
To simplify discussion of double-precision multiplication, the following example implements an algorithm
using one 'ACT8836 device. It should be noted that even higher speeds can be achieved through the use
of two 'ACT8836s to implement a parallel multiplier.
The example is based on the following algorithm where A and B are 64-bit signed numbers.
Let
\[ A_m = a_{63}, a_{62}, a_{61}, \ldots, a_{32} \]
and
\[ A_l = a_{31}, a_{30}, a_{29}, \ldots, a_0 \quad (a_0 = \text{LSB}) \]
Therefore:
\[ A = (A_m \times 2^{32}) + A_l \]
Likewise:
\[ B = (B_m \times 2^{32}) + B_l \]
Thus:
\[ A \times B = [(A_m \times 2^{32}) + A_l] \times [(B_m \times 2^{32}) + B_l] \]
\[ = (A_m \times B_m)\,2^{64} + (A_m \times B_l + A_l \times B_m)\,2^{32} + A_l \times B_l \]

Therefore, four products and three summations with rank adjustments are required.
Basic implementation of this algorithm uses a single 'ACT8836. The result is a two's complement 128-bit
product. Microcode signals to implement the algorithm are shown in Figure 4.
The first instruction cycle computes the first product, $A_l \times B_l$. The least significant half of the result is
output through the Y port for storage in an external RAM or some other 32-bit register; this will be the
least significant 32-bit portion of the final result.
The instruction also uses the shifter to shift the $A_l \times B_l$ product 32 bits to the right in order to adjust
for ranking in the next multiplication-addition sequence. The least significant half of the shift result is stored
in the lower 32-bit portion of the accumulator; the upper 32 bits contain the zero fill.
The second instruction produces the second product, $A_l \times B_m$, adds it to the contents of the accumulator,
and stores the result in the accumulator for use in the third instruction.
Instruction 3 computes $A_m \times B_l$, adds the result to the accumulator, and outputs the least significant
32 bits of the addition for use as bits 63-32 of the final product.
This instruction also shifts the result 32 bits to the right to provide the necessary rank adjustment and
stores the shift result (the most significant half of the addition result) in the lower 32 bits of the accumulator.
Bits ACC63-ACC32 are filled with zeros; the sign is extended into the three upper bits (ACC66-ACC64).
Instruction 4 computes the fourth product ($A_m \times B_m$), adds it to the accumulator, and outputs the least
significant half at the Y port for use as bits 95-64 of the final product.
This example assumes that the chip is operating in feed-through mode. A fifth instruction is therefore required
to perform the fourth iteration again so that bits 127-96 of the final product can be output.
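The five-instruction sequence can be followed numerically with the sketch below. It models only the arithmetic (four partial products, two 32-bit rank adjustments, four 32-bit output words) using Python integers; it does not model the control signals or the microcode of Figure 4. The names `split64` and `mul64x64` are hypothetical, and the upper halves are treated as signed and the lower halves as unsigned, which corresponds to mixed-mode multiplication for the cross products.

```python
MASK32 = 0xFFFFFFFF


def split64(x):
    """Split a signed 64-bit integer into (signed upper half, unsigned lower half)."""
    return x >> 32, x & MASK32     # Python's >> is an arithmetic shift


def mul64x64(a, b):
    """Multiply two signed 64-bit integers via four 32-bit partial products."""
    am, al = split64(a)
    bm, bl = split64(b)
    words = []

    acc = al * bl                  # instruction 1: unsigned x unsigned
    words.append(acc & MASK32)     # bits 31-0 of the final product
    acc >>= 32                     # rank adjustment, result kept in the accumulator

    acc += al * bm                 # instruction 2: mixed-mode product, accumulated
    acc += am * bl                 # instruction 3: mixed-mode product, accumulated
    words.append(acc & MASK32)     # bits 63-32
    acc >>= 32                     # second rank adjustment (sign preserved)

    acc += am * bm                 # instruction 4: signed x signed, accumulated
    words.append(acc & MASK32)     # bits 95-64
    words.append((acc >> 32) & MASK32)   # instruction 5: bits 127-96

    product = sum(w << (32 * i) for i, w in enumerate(words))
    if product >> 127:             # reinterpret the 128-bit pattern as two's complement
        product -= 1 << 128
    return product


print(mul64x64(-3, 5))
```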

Newton-Raphson binary division algorithm
The following explanation illustrates how to implement the Newton-Raphson binary division algorithm using
the 'ACT8836 multiplier/accumulator. The Newton-Raphson algorithm is an iterative procedure that
generates the reciprocal of the divisor through a convergence method.
Consider the equation Q = A/B. This equation can be rewritten as Q = A x (1/B). Therefore, the quotient
Q can be computed by simply multiplying the dividend A by the reciprocal of the divisor (B). Finding the
divisor reciprocal 1/B is the objective of the Newton-Raphson algorithm.
To calculate 1/B the Newton-Raphson equation, Xi + 1 = Xi(2'BXi) is calculated in an iterative process.
In the equation, B represents the divisor and X represents successively closer approximations to the
reciprocaI1/B. The following sequence of computation illustrates the iterative nature of the Newton-Raphson
algorithm.
Step 1
Step 2
Step 3

X 1 = XO(2-BXO)
X2 = X 1(2-BX 1)
X3 = X2(2-BX2)

Step n

Xn = Xn-1 (2-BXn-1 )

The successive approximations Xi approach the reciprocal 1/B as the number of iterations
increases; that is,
lim Xi = 1/B as i → ∞
The iterative operation is executed until the desired tolerance or error is reached. The accuracy reached
for 1/B can be checked by subtracting each Xi from its successor Xi+1. If the difference |Xi+1 - Xi|
is less than or equal to a predetermined round-off error, the process is terminated. The desired
tolerance can also be achieved by executing a fixed number of iterations based on the accuracy of the
initial guess of 1/B stored in RAM or PROM.
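
The recurrence and its stopping rule can be tried out in software before committing to microcode. The following is a minimal sketch in Python using exact rational arithmetic; the function names, the default tolerance and the iteration limit are illustrative assumptions and do not model the 'ACT8836 data path or its word length.

```python
from fractions import Fraction

def reciprocal(b, seed, tolerance=Fraction(1, 2**32), max_iterations=16):
    """Approximate 1/b with the recurrence X[i+1] = X[i] * (2 - b * X[i]).

    Returns the final approximation and the number of iterations executed.
    """
    if not (Fraction(1, 2) <= b < 1):
        raise ValueError("divisor must satisfy 1/2 <= B < 1")
    if not (0 < seed < 2 / b):
        raise ValueError("seed must satisfy 0 < X0 < 2/B")
    x = Fraction(seed)
    for iteration in range(1, max_iterations + 1):
        x_next = x * (2 - b * x)
        # Stop when successive approximations differ by no more than the tolerance.
        if abs(x_next - x) <= tolerance:
            return x_next, iteration
        x = x_next
    return x, max_iterations

def divide(a, b, seed):
    """Compute Q = A/B as A multiplied by the approximated reciprocal of B."""
    inverse, _ = reciprocal(b, seed)
    return a * inverse
```

For example, `divide(Fraction(3, 8), Fraction(3, 4), 1)` approaches 1/2. Because the error term squares on every pass, a better seed reduces the number of iterations needed, which is why a fixed iteration count can be chosen from the accuracy of the stored seed.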


The initial guess, X0, is called the seed approximation. The seed must be supplied to the Newton-Raphson
process externally and must fall within the range 0 < X0 < 2/B if B is greater than 0, or 2/B < X0 < 0
if B is less than 0.
To perform the Newton-Raphson binary division algorithm using the 'ACT8836, the divisor, B, must be
a positive fraction. As a positive fraction, B is limited to the range 1/2 ≤ B < 1.
Since Xi from Newton-Raphson must lie in the range 0 < Xi < 2/B and since the range of the positive fraction
B is 1/2 ≤ B < 1, the limits of Xi become 1 ≤ Xi < 2.
The range of -BXi will therefore be -2 ≤ -BXi ≤ -1/2.

The limits of -BXi are shown in Table 7 as they would appear in the 'ACT8836 extended bit, binary fraction
format.

TABLE 7. LIMITS OF -BXi IN 'ACT8836 EXTENDED BIT FORMAT

          EXTENDED BITS
VALUE     66   65   64   63   62   61   ...   2    1    0
 -2        1    1    1    0    0    0   ...   0    0    0
-1/2       1    1    1    1    1    0   ...   0    0    0

The table indicates that -BXi is always of the form:
1 1 1 d0. d1 d2 ... dn-2 dn-1
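
The two rows of Table 7 can be reproduced with a short Python sketch. It assumes a 67-bit two's-complement word (bits 66-0) with the binary point between bits 63 and 62, as the form "1 1 1 d0. d1 d2 ..." suggests; the constant and function names are hypothetical.

```python
from fractions import Fraction

FRACTION_BITS = 63   # assumption: bits 62-0 lie to the right of the binary point
TOTAL_BITS = 67      # bits 66-0, including the three extended bits 66-64

def to_extended(value):
    """Encode a value as a 67-bit two's-complement binary fraction."""
    scaled = Fraction(value) * (1 << FRACTION_BITS)
    if scaled.denominator != 1:
        raise ValueError("value is not representable in 63 fraction bits")
    # Reducing modulo 2**67 yields the two's-complement bit pattern.
    return int(scaled) % (1 << TOTAL_BITS)

def top_bits(word, count=3):
    """Return the most significant `count` bits of the 67-bit word."""
    return word >> (TOTAL_BITS - count)
```

Under these assumptions, `to_extended(-2)` has only bits 66-64 set and `to_extended(Fraction(-1, 2))` has bits 66-62 set, matching the table, and any -BXi inside the stated range encodes with 1 1 1 in the extended bits.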




absolute maximum ratings over operating free-air temperature range (unless otherwise noted)†
Supply voltage, VCC ................................................ -0.5 V to 6 V
Input clamp current, IIK (VI < 0 or VI > VCC) ...................... ±20 mA
Output clamp current, IOK (VO < 0 or VO > VCC) ..................... ±50 mA
Continuous output current, IO (VO = 0 to VCC) ...................... ±50 mA
Continuous current through VCC or GND pins ......................... ±100 mA
Operating free-air temperature range ............................... 0°C to 70°C
Storage temperature range .......................................... -65°C to 150°C
† Stresses beyond those listed under "absolute maximum ratings" may cause permanent damage to the device. These are stress ratings
only and functional operation of the device at these or any other conditions beyond those indicated under "recommended operating
conditions" is not implied. Exposure to absolute-maximum-rated conditions for extended periods may affect device reliability.

recommended operating conditions
                                               MIN    NOM    MAX    UNIT
VCC     Supply voltage                         4.5    5      5.5    V
VIH     High-level input voltage               2             VCC    V
VIL     Low-level input voltage                0             0.8    V
IOH     High-level output current                            -8     mA
IOL     Low-level output current                             8      mA
VI      Input voltage                          0             VCC    V
VO      Output voltage                         0             VCC    V
dt/dv   Input transition rise or fall rate     0             15     ns/V
TA      Operating free-air temperature         0             70     °C

electrical characteristics over recommended operating free-air temperature range (unless otherwise
noted)
PARAMETER

TEST CONDITIONS
10H

.-

~20 ~A

VOH
IOH

~

~8

rnA

10l ~ 20 ~A
VOL
10l ~ 8 rnA
~

Vee or 0

II

VI

lee

VI ~ Vee or 0.10

ei

VI ~ Vee or 0

Alee t

One input at 3.4 V.
other inputs at 0 or Vee

10ZH

VI ~ Vee or 0

10Zl

VI

~

Vee or 0

Vee

TA - 25°C
TYP
MAX
MIN

TA MIN

4.5 V

4.4

4.4

5.5 V

5.4

5.4

4.5 V

3.8

3.7

5.5 V

4.8

4.7

oDe

to 70 DC

UNIT

MAX

V

0.1

5.5 V

0.1

0.1

4.5 V

0.32

0.4

5.5 V

0.32

0.4

5.5 V

0.1

5.5 V

50

100

~A

10

10

pF
rnA

5

V
V
~A

± 1.0

1

1

5V

0.5

5

~0.5

setup and hold times
PARAMETER                            MIN    MAX    UNIT
tsu1    Instruction before CLK↑      14            ns
tsu2    Data before CLK↑             12            ns
tsu3    CKEA before CLK↑             14            ns
tsu4    CKEB before CLK↑             14            ns
tsu5    CKEI before CLK↑             10            ns
tsu6    CKEY before CLK↑             19            ns
tsu7    SELREG before CLK↑           12            ns
tsu8    WEMS before CLK↑             11            ns
tsu9    WELS before CLK↑             11            ns
th1     Instruction after CLK↑       0             ns
th2     Data after CLK↑              0             ns
th3     CKEA after CLK↑              0             ns
th4     CKEB after CLK↑              0             ns
th5     CKEI after CLK↑              0             ns
th6     CKEY after CLK↑              0             ns
th7     SELREG after CLK↑            0             ns
th8     WEMS after CLK↑              0             ns
th9     WELS after CLK↑              0             ns

switching characteristics over recommended ranges of supply voltage and free-air temperature (see
Figure 2 for load circuit and voltage waveforms)

PARAMETER   FROM (INPUT)   TO (OUTPUT)   FT MODE (FT1-FT0)   MIN   TYP   MAX   UNIT
tpd1†       CLK            PIPE          11                              36    ns
tpd2†       PIPE           Y REG         11                              36    ns
tpd3†       PIPE           ACCUM         11                              36    ns
tpd4†       Y REG          Y             All modes                       18    ns
tpd5        SELY           Y             All modes                       18    ns
tpd6†       CLK            Y REG         01                              54    ns
tpd7†       CLK            ACCUM         10 or 01                        67    ns
tpd8        CLK            Y             10                              67    ns
tpd9        DATA           Y             00                              60    ns
tpd10†      DATA           ACCUM         00                              56    ns
tpd11       CLK            YETP          11 or 10                        18    ns
tpd12       CLK            ETPERR        11 or 10                        18    ns
tpd13       CLK            YETP          00                              67    ns
tpd14       CLK            ETPERR        01                              67    ns
tpd15       DATA           YETP          00                              60    ns
tpd16       DATA           ETPERR        00                              60    ns
tpd17       PA             PERRA         All modes                       20    ns
tpd18       DA             PERRA         All modes                       20    ns
tpd19       PB             PERRB         All modes                       20    ns
tpd20       DB             PERRB         All modes                       20    ns
tpd21       PY             PERRY         All modes                       20    ns
tpd22       Y              MSERR         All modes                       22    ns
tpd23       YETP           MSERR         All modes                       22    ns
ten2        OEY            YETP          All modes                       20    ns
ten1        OEY            Y             All modes                       20    ns
tdis1       OEY            YETP          All modes                       15    ns
tdis2       OEY            Y             All modes                       15    ns

FIGURE 5. FULL FLOWTHROUGH MODE (FT = 00)
[Timing waveforms. Recoverable labels: Y31-Y0 (LSP, MSP); tpd5, tpd9, ten2, tdis2.]

FIGURE 6. FULL FLOWTHROUGH MODE, ACCUMULATOR MODE (FT = 00)
[Timing waveforms. Recoverable labels: CLK, CKEA, CKEB, CKEI, CKEY, INSTR, DATA, SELREG, WEMS, WELS,
SUM-OF-PRODUCT, ACCUM., SELY, OEY, Y31-Y0 (LSP, MSP); tsu7-tsu9, th7-th9, tpd5, tpd9, tpd10, ten2, tdis2.]

PARAMETER MEASUREMENT INFORMATION

FIGURE 7. FLOWTHROUGH PIPE ONLY VOLTAGE WAVEFORMS (FT = 01)
[Timing waveforms. Recoverable labels: CLK, CKEA, CKEB, CKEI, CKEY, INSTR, DATA (A, B), PRODUCT, Y REG,
SELY, OEY, Y31-Y0 (LSP, MSP); tsu1, tsu2, tsu3-tsu6, th1, th2, th3-th6, tpd4, tpd5, tpd6, ten2, tdis2.]

FIGURE 8. FLOWTHROUGH PIPE ONLY, ACCUMULATOR MODE (FT = 01)
[Timing waveforms. Recoverable labels: CLK, CKEA, CKEB, CKEI, CKEY, INSTR, DATA (A, B), SELREG, WEMS, WELS,
SUM-OF-PRODUCT, ACCUM., PRODUCT, Y REG, SELY, OEY, Y31-Y0 (LSP, MSP); tsu1, tsu2, tsu3-tsu6, tsu7-tsu9,
th1, th2, th7-th9, tpd4, tpd5, tpd7, ten2, tdis2.]


FIGURE 9. FLOWTHROUGH PIPE AND Y ONLY (FT = 10)
[Timing waveforms. Recoverable labels: CLK, CKEA, CKEB, CKEI, INSTR, DATA (A, B), SELY, OEY,
Y31-Y0 (LSP, MSP); tsu1, tsu2, tsu3-tsu5, th1, th2, th3-th5, tpd5, tpd8, ten2, tdis2.]

FIGURE 11. ALL REGISTERS ENABLED (FT = 11)
[Timing waveforms. Recoverable labels: PIPE, Y REG, SELY, Y31-Y0 (LSP1, MSP1, LSP2, MSP2);
tpd1, tpd4, tpd5, ten2, tdis2.]

FIGURE 12. ALL REGISTERS ENABLED, ACCUMULATOR MODE (FT = 11)
[Timing waveforms. Recoverable labels: CLK, CKEA, CKEB, CKEI, CKEY, INSTR, DATA, SELREG, WEMS, WELS,
PIPE, SUM-OF-PRODUCT, ACCUM., Y REG, SELY, OEY, Y31-Y0 (LSP1, MSP1, LSP2, MSP2); tsu1, tsu2, tsu7-tsu9,
th1, th2, th3-th6, th7-th9, tpd1, tpd2, tpd3, tpd4, tpd5, ten2, tdis2.]

[Load circuit diagram. Recoverable labels: VCC, S1, RL, TEST POINT, FROM OUTPUT UNDER TEST, CL, S2.]

PARAMETER        RL      CL†      S1       S2
ten     tPZH     1 kΩ    50 pF    OPEN     CLOSED
        tPZL                      CLOSED   OPEN
tdis    tPHZ     1 kΩ    50 pF    OPEN     CLOSED
        tPLZ                      CLOSED   OPEN
tpd              -       50 pF    OPEN     OPEN


NAME      NO.    I/O    DESCRIPTION
DENORM    B16    I/O    Status pin indicating a denormal output from the ALU or a wrapped
                        output from the multiplier. In FAST mode, causes the result to go
                        to zero when DENORM is high.
ENRA      M2     I      When high, enables loading of RA register on a rising clock edge
                        if the RA register is not disabled (see PIPES0 below).
ENRB      M1     I      When high, enables loading of RB register on a rising clock edge
                        if the RB register is not disabled (see PIPES0 below).
FAST      E3     I      When low, selects gradual underflow (IEEE mode). When high, selects
                        sudden underflow, forcing all denormalized inputs and outputs to zero.
GND       D4, D6, D7, D9, D10, D12, D13, E4, E14, F4, F14, H4, H14, K4, K14, L14, M4
                        Ground pins. NOTE: All ground pins should be used and connected.
HALT      R2     I      Stalls operation without altering contents of instruction or data
                        registers. Active low.
I0        E2     I      Instruction inputs
I1        D1
I2        E1
I3        F2
I4        G3
I5        F1
I6        G2
I7        G1
I8        H3
I9        H1
INEX      C14    I/O    Status pin indicating an inexact output


Table 2. 'ACT8837 Pin Functional Description (Continued)
NAME      NO.    I/O    DESCRIPTION
IVAL      A15    I/O    Status pin indicating that an invalid operation or a nonnumber (NaN)
                        has been input to the multiplier or ALU.
MSERR     E17    O      Master/Slave error output pin
NC        A1, A2, A16, A17, B1, B17, H2, J15, P1, S1, T1, T16, T17
                        No internal connection. Pins should be left floating.
OEC       G15    I      Comparison status output enable. Active low.
OES       F17    I      Exception status and other status output enable. Active low.
OEY       F16    I      Y bus output enable. Active low.
OVER      B14    I/O    Status pin indicating that the result is greater than the largest
                        allowable value for specified format (exponent overflow).
PA0       L17    I      Parity inputs for DA data
PA1       K15
PA2       K16
PA3       K17
PB0       S2     I      Parity inputs for DB data
PB1       P4
PB2       R3
PB3       T2
PERRA     F15    O      DA data parity error output. When high, signals a byte or word has
                        failed an even parity check.
PERRB     C1     O      DB data parity error output. When high, signals a byte or word has
                        failed an even parity check.
PIPES0    P2     I      When low, enables instruction register, RA and RB input registers.
                        When high, puts instruction register, RA and RB registers in
                        flowthrough mode.
PIPES1    R1     I      When low, enables pipeline registers in ALU and multiplier. When
                        high, puts pipeline registers in flowthrough mode.


Table 2. 'ACT8837 Pin Functional Description (Continued)
NAME        NO.    I/O    DESCRIPTION
PIPES2      N4     I      When low, enables status register, product (P) and sum (S)
                          registers. When high, puts status register, P and S registers in
                          flowthrough mode.
PY0         A13    I/O    Y port parity data
PY1         C12
PY2         B13
PY3         A14
RESET       P3     I      Clears internal states and status with no effect to data
                          registers. Active low.
RND0        F3     I      Rounding mode control pins. Select four IEEE rounding modes
RND1        D2            (see Table 18).
RNDCO       B15    I      When high, indicates the mantissa of a wrapped number has been
                          increased in magnitude by rounding.
SELMS/LS    G16    I      When low, selects LSH of 64-bit result to be output on the Y bus.
                          When high, selects MSH of 64-bit result.
SELOP0      J3     I      Select operand sources for multiplier and ALU
SELOP1      J2            (see Tables 6 and 7)
SELOP2      J1
SELOP3      K1
SELOP4      K2
SELOP5      K3
SELOP6      L1
SELOP7      L2
SELST0      H17    I      Select status source during chained operation (see Table 16)
SELST1      H16
SRCC        J16    I      When low, selects ALU as data source for C register. When high,
                          selects multiplier as data source for C register.
SRCEX       C16    I/O    Status pin indicating source of status, either ALU (SRCEX = L)
                          or multiplier (SRCEX = H)
STEX0       D16    I/O    Status pins indicating that a nonnumber (NaN) or denormal number
STEX1       D15           has been input on A port (STEX1) or B port (STEX0).
TP0         H15    I      Test pins (see Table 19)
TP1         G17
UNDER       C13    I/O    Status pin indicating that a result is inexact and less than
                          minimum allowable value for format (exponent underflow).
UNORD       D17    I/O    Comparison status pin indicating that the two inputs are unordered
                          because at least one of them is a nonnumber (NaN).


Table 2. 'ACT8837 Pin Functional Description (Concluded)
NAME      NO.    I/O    DESCRIPTION
VCC       D5, D8, D11, D14, G4, G14, J4, J14, L4, M14
                        5-V power supply
Y0        C2     I/O    32-bit Y output data bus
Y1        D3
Y2        B2
Y3        C3
Y4        B3
Y5        A3
Y6        C4
Y7        B4
Y8        A4
Y9        C5
Y10       B5
Y11       A5
Y12       C6
Y13       B6
Y14       A6
Y15       C7
Y16       B7
Y17       A7
Y18       C8
Y19       B8
Y20       A8
Y21       A9
Y22       B9
Y23       C9
Y24       A10
Y25       B10
Y26       C10
Y27       A11
Y28       B11
Y29       A12
Y30       C11
Y31       B12


'ACT8837 Specification Tables
absolute maximum ratings over operating free-air temperature range
(unless otherwise noted)†
Supply voltage, VCC ................................... -0.5 V to 6 V
Input clamp current, IIK (VI < 0 or VI > VCC) ......... ±20 mA
Output clamp current, IOK (VO < 0 or VO > VCC) ........ ±50 mA
Continuous output current, IO (VO = 0 to VCC) ......... ±50 mA
Continuous current through VCC or GND pins ............ ±100 mA
Operating free-air temperature range .................. 0°C to 70°C
Storage temperature range ............................. -65°C to 150°C
†Stresses beyond those listed under "absolute maximum ratings" may cause permanent damage
to the device. These are stress ratings only and functional operation of the device at these or
any other conditions beyond those indicated under "recommended operating conditions" is
not implied. Exposure to absolute-maximum-rated conditions for extended periods may affect
device reliability.

recommended operating conditions
                                                     SN74ACT8837
PARAMETER                                        MIN     NOM     MAX     UNIT
VCC     Supply voltage                           4.75    5.0     5.25    V
VIH     High-level input voltage                 2               VCC     V
VIL     Low-level input voltage                  0               0.8     V
IOH     High-level output current                                -8      mA
IOL     Low-level output current                                 8       mA
VI      Input voltage                            0               VCC     V
VO      Output voltage                           0               VCC     V
dt/dv   Input transition rise or fall rate       0               15      ns/V
TA      Operating free-air temperature           0               70      °C

electrical characteristics over recommended operating free-air
temperature range (unless otherwise noted)
PARAMETER

TEST CONDITIONS
10H

=

10L

=

-8 mA

II
ICC
Ci

VI
VI
Vi

3.76

5.5 V

4.76

TYP MAX

UNIT

5.5 V
10

V
.<

...•'0',\'\;>.

5.5 V
4.5 V

mA

= VCC or 0
= VCC or 0,
= VCC or 0

4.5 V
4.5 V

= 20 p.A
=8

MIN

5.5 V

VOL
10L

SN74ACT8837

TA - 25°C
MIN TYP MAX

4.5 V

-20 p.A

VOH
10H

VCC

J'

\

';:"

.O(\i)~i

'

0.45

.

V

0.45

5.5 V

±1

p.A

5.5 V

200

p.A

5V

pF

switching characteristics (see Note)
PARAMETER

SN74ACT8837-65
MIN

MAX

Propagation delay from DAIDB/I inputs
tpd1

to Y output
Propagation delay from input register to

tpd2

output buffer

tpd5

output

buff~r

Propagation delay from SELMS/LS to Y output

<$.,F;,:§i'
':;

.

Propagation delay from input register to
td1

output register
Delay time', input register to pipeline register or

td2

pipeline register to output register

ns

118

ns

~.~..\\'l.(.,.'~
70

ns

30

ns

32

ns

95

ns

,....'<-\;.

output buffer
Propagatipn delay from output register to

tpd4

125

,\

Propagation delay from pipeline register to
tpd3

UNIT

65

ns

Note: Switching data must be used with timing diagrams for different operating modes.


setup and hold times
SN74ACT8837-65
MIN
MAX

PARAMETER
tsu1

Setup time. Instruction before ClK!

18

tsu2

Setup time. data operand before ClK!

18

tsu3

for double-precision operation (input register

UNIT

.,~' ~. ns
£>1).,,,,"Vi ,.
ns

f'\
;O'\) .J

Setup time. data operand before second ClK I

~O

ns

0

ns

not enabled)
th1

Hold time. Instruction input after ClK I

clock requirements
SN74ACT8837-65

PARAMETER
tw

Pulse duration
Clock period


I ClK high
I ClK low

MIN
15

~t.
~,tc1 1(T"

~Jv

~J.lNIT
ns
ns

SN74ACT8837 FLOATING POINT UNIT
The SN74ACT8837 is a high-speed floating point unit implemented in TI's
advanced 1-µm CMOS technology. The device is fully compatible with IEEE
Standard 754-1985 for addition, subtraction and multiplication operations.
The 'ACT8837 input buses can be configured to operate as two 32-bit data buses
or a single 64-bit bus, providing a number of system interface options. Registers are
provided at the inputs, outputs, and inside the ALU and multiplier to support multilevel
pipelining. These registers can be bypassed for nonpipelined operation.
A clock mode control allows the temporary register to be clocked on the rising edge
or the falling edge of the clock to support double precision operations (except
multiplication) at the same rate as single precision operations. A feedback register with
a separate clock is provided for temporary storage of a multiplier result, ALU result
or constant.
To ensure data integrity, parity checking is performed on input data, and parity is
generated for output data. A master/slave comparator supports fault-tolerant system
design. Two test pin control inputs allow all I/Os and outputs to be forced high, low,
or placed in a high-impedance state to facilitate system testing.
Floating point division using a Newton-Raphson algorithm can be performed in a
sum-of-products operating mode, one of two modes in which the multiplier and ALU operate
in parallel. Absolute value conversions, floating point to integer and integer to floating
point conversions, and a compare instruction are also available.

Data Flow

Data enters the 'ACT8837 through two 32-bit input data buses, DA and DB. The buses
can be configured to operate as a single 64-bit data bus for double precision operations
(see Table 7). Data can be latched in a 64-bit temporary register or loaded directly

Figure 1. 'ACT8837 Floating Point Unit
[Block diagram. Recoverable labels: SELMS/LS, SELST1-SELST0, FROM INSTRUCTION REGISTER, PY3-PY0, Y31-Y0,
MSERR, UNORD, AGTB, AEQB, IVAL, INEX, OVER, UNDER, DENORM, DENIN, RNDCO, SRCEX, CHEX, STEX1-STEX0.]

A parity check can also be performed on the entire input data word by setting BYTEP
low. In this mode, PA0 is the parity input for DA data and PB0 is the parity input for
DB data.

Temporary Input Register
A temporary input register is provided to enable double precision numbers on a single
32-bit input bus to be loaded in one clock cycle. The contents of the DA bus are loaded
into the upper 32 bits of the temporary register; the contents of DB are loaded into
the lower 32 bits. A clock mode signal (CLKMODE) determines the clock edge on which
the data will be stored in the temporary register. When CLKMODE is low, data is loaded
on the rising edge of the clock; when CLKMODE is high, data is loaded on the falling
edge.
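
As an illustration of the upper/lower split described above, the sketch below divides an IEEE double precision value into the 32-bit word that would be presented on DA and the word that would be presented on DB. It assumes that the upper 32 bits of the register hold the most significant half of the operand; the function names are hypothetical and the bus timing is not modeled.

```python
import struct

def split_double(value):
    """Return (da, db): the upper and lower 32 bits of a 64-bit IEEE double."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    da = bits >> 32           # upper 32 bits, as loaded from the DA bus
    db = bits & 0xFFFFFFFF    # lower 32 bits, as loaded from the DB bus
    return da, db

def join_double(da, db):
    """Rebuild the double from the two 32-bit halves."""
    (value,) = struct.unpack(">d", struct.pack(">Q", (da << 32) | db))
    return value
```

For example, `split_double(1.0)` yields `(0x3FF00000, 0x00000000)`: the sign, exponent and upper mantissa bits travel on DA, the remaining mantissa bits on DB.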

RA and RB Input Registers
Two 64-bit registers, RA and RB, are provided to hold input data for the multiplier
and ALU. Data is taken from the DA bus, DB bus and the temporary input register,
according to configuration mode controls CONFIG1-CONFIG0 (see Tables 3 and 5).
The registers are loaded on the rising edge of clock CLK. For single-precision operations,
CONFIG1-CONFIG0 should ordinarily be set to 0 1 (see Table 4).
Table 3. Double-Precision Input Data Configuration Modes
LOADING SEQUENCE
DATA LOADED INTO TEMP REGISTER ON FIRST CLOCK AND RA/RB REGISTERS ON SECOND CLOCK†
DATA LOADED INTO RA/RB REGISTERS ON SECOND CLOCK

0


Table 8. Independent ALU Operations, Single Operand (I9 = 0, I6 = 0)

FIELD                BIT(S)    SETTING
CHAINED OPERATION    I9        0 = Not chained
PRECISION RA         I8        0 = A(SP), 1 = A(DP)
PRECISION RB         I7        0 = B(SP), 1 = B(DP)
OUTPUT SOURCE        I6        0 = ALU result
OPERAND TYPE         I5        1 = Single operand
ABSOLUTE VALUE A     I4        0 = A, 1 = |A|
ALU OPERATION        I3-I0     See below

I3-I0    RESULT
0000     Pass A operand
0001     Negate A operand
0010     Integer to floating point conversion†
0011     Floating point to integer conversion
0100     Undefined
0101     Undefined
0110     Floating point to floating point conversion‡
0111     Undefined
1000     Wrap (denormal) input operand
1001     Undefined
1010     Undefined
1011     Undefined
1100     Unwrap exact number
1101     Unwrap inexact number
1110     Unwrap rounded input
1111     Undefined

†The precision of the integer to floating point conversion is set by I8.
‡This converts single precision floating point to double precision floating point and vice versa. If the I8 pin is low to indicate a single-precision input, the result
of the conversion will be double precision. If the I8 pin is high, indicating a double-precision input, the result of the conversion will be single precision.
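
The bit assignments of Table 8 can be collected into a small helper that assembles a 10-bit instruction word (I9-I0). This is a sketch for checking microcode by hand; the operation names and the function signature are hypothetical, and only the defined opcodes are listed.

```python
# I3-I0 opcodes for the defined single-operand ALU operations (Table 8).
SINGLE_OPERAND_OPS = {
    "pass_a": 0b0000,
    "negate_a": 0b0001,
    "int_to_float": 0b0010,
    "float_to_int": 0b0011,
    "float_to_float": 0b0110,
    "wrap": 0b1000,
    "unwrap_exact": 0b1100,
    "unwrap_inexact": 0b1101,
    "unwrap_rounded": 0b1110,
}

def encode_single_operand(op, double_a=False, double_b=False, abs_a=False):
    """Build the instruction word for an independent, single-operand ALU operation."""
    word = SINGLE_OPERAND_OPS[op]   # I3-I0: ALU operation
    word |= int(abs_a) << 4         # I4: 1 = use |A|
    word |= 1 << 5                  # I5: 1 = single operand
    # I6 = 0 (ALU result) and I9 = 0 (not chained) are left clear.
    word |= int(double_b) << 7      # I7: 1 = B is double precision
    word |= int(double_a) << 8      # I8: 1 = A is double precision
    return word
```

For example, `encode_single_operand("negate_a")` gives binary 0000100001, a single-precision negate of the A operand.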

Table 9. Independent ALU Operations, Two Operands (I9 = 0, I5 = 0)

FIELD                BIT(S)    SETTING
CHAINED OPERATION    I9        0 = Not chained
PRECISION RA         I8        0 = A(SP), 1 = A(DP)
PRECISION RB         I7        0 = B(SP), 1 = B(DP)
OUTPUT SOURCE        I6        0 = ALU result
OPERAND TYPE         I5        0 = Two operands
ABSOLUTE VALUE A     I4        0 = A, 1 = |A|
ABSOLUTE VALUE B     I3        0 = B, 1 = |B|
ABSOLUTE VALUE Y     I2        0 = Y, 1 = |Y|
ALU OPERATION        I1-I0     See below

I1-I0    RESULT
00       A + B
01       A - B
10       Compare A, B
11       B - A


Table 10. Independent Multiplier Operations (I9 = 0, I6 = 1)

FIELD                BIT(S)    SETTING
CHAINED OPERATION    I9        0 = Not chained
PRECISION RA         I8        0 = A(SP), 1 = A(DP)
PRECISION RB         I7        0 = B(SP), 1 = B(DP)
OUTPUT SOURCE        I6        1 = Multiplier result
                     I5        0
ABSOLUTE VALUE A     I4†       0 = A, 1 = |A|
ABSOLUTE VALUE B     I3†       0 = B, 1 = |B|
NEGATE RESULT        I2†       0 = Y, 1 = -Y
WRAP A               I1        0 = Normal format, 1 = A is a wrapped number
WRAP B               I0        0 = Normal format, 1 = B is a wrapped number

†See Table 11.


Table 11. Independent Multiplier Operations Selected by I4-I2 (I9 = 0, I6 = 1)

ABSOLUTE VALUE A (I4):  0 = A, 1 = |A|
ABSOLUTE VALUE B (I3):  0 = B, 1 = |B|
NEGATE RESULT (I2):     0 = Y, 1 = -Y

I4-I2    RESULT
000      A * B
001      -(A * B)
010      A * |B|
011      -(A * |B|)
100      |A| * B
101      -(|A| * B)
110      |A| * |B|
111      -(|A| * |B|)
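
The eight rows of Table 11 follow directly from applying the three bits independently. A minimal Python model, using ordinary floats in place of the device's arithmetic and a hypothetical function name, is:

```python
def multiplier_result(a, b, i4, i3, i2):
    """Model the product selected by instruction bits I4-I2 (Table 11)."""
    if i4:
        a = abs(a)      # I4 = 1: use |A|
    if i3:
        b = abs(b)      # I3 = 1: use |B|
    product = a * b
    return -product if i2 else product   # I2 = 1: negate the result
```

For example, with A = -3 and B = 2, the setting I4-I2 = 101 returns -(|A| * B) = -6.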

Table 12. Operations Selected by I8-I7 (I9 = 0, I6 = 1)

PRECISION    PRECISION OF                  PRECISION    PRECISION OF                  PRECISION
SELECT RA    RA INPUT                      SELECT RB    RB INPUT                      OF RESULT
I8                                         I7
0            Single                        0            Single                        Single
0            Single converted to double    1            Double                        Double
1            Double                        0            Single converted to double    Double
1            Double                        1            Double                        Double


Master/Slave Comparator
A master/slave comparator is provided to compare data bytes from the Y output
multiplexer and the status outputs with data bytes on the external Y and status ports
when OEY, OES and OEC are high. If the data bytes are not equal, a high signal is
generated on the master/slave error output pin (MSERR).

Status and Exception Generator/Register
A status and exception generator produces several output signals to indicate invalid
operations as well as overflow, underflow, nonnumerical and inexact results, in
conformance with IEEE Standard 754-1985. If output registers are enabled
(PIPES2 = 0), status and exception results are latched in a status register on the rising
edge of the clock. Status results are valid at the same time that associated data results
are valid. Status outputs are enabled by two signals, OEC for comparison status and
OES for other status and exception outputs. Status outputs are summarized in
Tables 14 and 15.
During a compare operation in the ALU, the AEQB output goes high when the A and
B operands are equal. When any operation other than a compare is performed, either
by the ALU or the multiplier, the AEQB signal is used as a zero detect.

Table 13. Chained Multiplier/ALU Operations (I9 = 1)

FIELD                       BIT(S)    SETTING
CHAINED OPERATION           I9        1 = Chained
PRECISION RA                I8        0 = A(SP), 1 = A(DP)
PRECISION RB                I7        0 = B(SP), 1 = B(DP)
OUTPUT SOURCE               I6        0 = ALU result, 1 = Multiplier result
ADD ZERO                    I5        0 = Normal operation, 1 = Forces B2 input of ALU to zero
MULTIPLY BY ONE             I4        0 = Normal operation, 1 = Forces B1 input of multiplier to one
NEGATE ALU RESULT           I3        0 = Normal operation, 1 = Negate ALU result
NEGATE MULTIPLIER RESULT    I2        0 = Normal operation, 1 = Negate multiplier result
ALU OPERATIONS              I1-I0     See below

I1-I0    RESULT
00       A + B
01       A - B
10       2 - A
11       B - A

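
To check a chained instruction word against Table 13, the fields can be unpacked with a short decoder. The sketch below is illustrative only; the dictionary keys and the function name are hypothetical labels for the table columns.

```python
ALU_OPERATIONS = {0b00: "A + B", 0b01: "A - B", 0b10: "2 - A", 0b11: "B - A"}

def decode_chained(word):
    """Split a 10-bit instruction word (I9-I0) into the Table 13 fields."""
    def bit(n):
        return (word >> n) & 1

    if not bit(9):
        raise ValueError("I9 = 0: not a chained operation")
    return {
        "a_precision": "DP" if bit(8) else "SP",
        "b_precision": "DP" if bit(7) else "SP",
        "output_source": "multiplier" if bit(6) else "ALU",
        "add_zero": bool(bit(5)),
        "multiply_by_one": bool(bit(4)),
        "negate_alu_result": bool(bit(3)),
        "negate_multiplier_result": bool(bit(2)),
        "alu_operation": ALU_OPERATIONS[word & 0b11],
    }
```

The "2 - A" ALU operation is the term that appears in the Newton-Raphson recurrence X(2 - BX), which is consistent with the use of chained mode for division mentioned in the device overview.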

Table 14. Comparison Status Outputs
SIGNAL    RESULT OF COMPARISON (ACTIVE HIGH)
AEQB      The A and B operands are equal. (A high signal on the AEQB output indicates a
          zero result from the selected source except during a compare operation in the
          ALU.)
AGTB      The A operand is greater than the B operand. (Only during a compare operation
          in the ALU)
UNORD     The two inputs of a comparison operation are unordered, i.e., one or both of
          the inputs is a NaN.

Table 15. Status Outputs
SIGNAL    STATUS RESULT
CHEX      If I6 is low, indicates the multiplier is the source of an exception during a
          chained function. If I6 is high, indicates the ALU is the source of an exception
          during a chained function.
DENIN     Input to the multiplier is a denorm. When DENIN goes high, the STEX pins
          indicate which port had a denormal input.
DENORM    The multiplier output is a wrapped number or the ALU output is a denorm. In
          the FAST mode, this condition causes the result to go to zero.
INEX      The result of an operation is not exact.
IVAL      A NaN has been input to the multiplier or the ALU, or an invalid operation
          (0 * ∞ or ±∞ ∓ ∞) has been requested. When IVAL goes high, the STEX
          pins indicate which port had a NaN.
OVER      The result is greater than the largest allowable value for the specified format.
RNDCO     The mantissa of a wrapped number has been increased in magnitude by
          rounding and the unwrap round instruction can be used to unwrap properly
          the wrapped number (see Table 8).
SRCEX     The status was generated by the multiplier. (When SRCEX is low, the status
          was generated by the ALU.)
STEX0     A NaN or a denorm has been input on the B port.
STEX1     A NaN or a denorm has been input on the A port.
UNDER     The result is inexact and less than the minimum allowable value for the
          specified format. In the FAST mode, this condition causes the result to go to
          zero.


In chained mode, status results to be output are selected based on the state of the
I6 (source output) pin (if I6 is low, ALU status will be selected; if I6 is high, multiplier
status will be selected). If the nonselected output source generates an exception, CHEX
is set high. Status of the nonselected output source can be forced using the SELST
pins, as shown in Table 16.
Table 16. Status Output Selection (Chain Mode)
SELST1-SELST0    STATUS SELECTED
0 0              Invalid
0 1              Selects multiplier status
1 0              Selects ALU status
1 1              Normal operation (selection based on result source specified by I6 input)
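
The interaction between I6, the SELST pins and CHEX can be summarized in a few lines of Python. This is a behavioral sketch under the description above, with hypothetical function names, and it ignores register timing.

```python
def selected_status(selst, i6):
    """Return which unit's status is output for a SELST1-SELST0 code (Table 16)."""
    if selst == 0b00:
        raise ValueError("SELST1-SELST0 = 00 is invalid")
    if selst == 0b01:
        return "multiplier"
    if selst == 0b10:
        return "ALU"
    # 11: normal operation, follow the result source chosen by I6.
    return "multiplier" if i6 else "ALU"

def chex(i6, alu_exception, multiplier_exception):
    """CHEX is high when the unit NOT selected by I6 raised an exception."""
    return alu_exception if i6 else multiplier_exception
```

When CHEX goes high with SELST = 11, driving SELST to 01 or 10 brings out the status of the other unit so that the exception can be identified.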

Flowthrough Mode
To enable the device to operate in pipelined or flowthrough modes, registers can be
bypassed using pipeline control signals PIPES2-PIPESO (see Table 17).
Table 17. Pipeline Controls (PIPES2-PIPES0)
PIPES2-PIPES0    REGISTER OPERATION SELECTED
X X 0            Enables input registers (RA, RB)
X X 1            Disables input registers (RA, RB)
X 0 X            Enables pipeline registers
X 1 X            Disables pipeline registers
0 X X            Enables output registers (P, S, Status)
1 X X            Disables output registers (P, S, Status)
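
Because each PIPES bit controls one register level independently, the table reduces to three bit tests. A minimal sketch, with a hypothetical function name and result keys:

```python
def enabled_registers(pipes):
    """Decode a 3-bit PIPES2-PIPES0 value; a 0 bit enables the register level."""
    return {
        "input": not (pipes & 0b001),      # PIPES0: RA, RB
        "pipeline": not (pipes & 0b010),   # PIPES1: ALU and multiplier pipeline
        "output": not (pipes & 0b100),     # PIPES2: P, S, Status
    }
```

Thus `enabled_registers(0b000)` describes the fully pipelined configuration and `enabled_registers(0b111)` the fully flowthrough configuration.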


FAST and IEEE Modes
The device can be programmed to operate in FAST mode by asserting the FAST pin.
In the FAST mode, all denormalized inputs and outputs are forced to zero.
Placing a zero on the FAST pin causes the chip to operate in IEEE mode. In this mode,
the ALU can operate on denormalized inputs and return denormals. If a denorm is input
to the multiplier, the DENIN flag will be asserted, and the result will be invalid. If the
multiplier result underflows, a wrapped number will be output.
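
The effect of FAST mode on a denormalized single-precision value can be sketched as follows. In the IEEE single-precision format a denormalized number has an all-zero exponent field and a nonzero fraction. The function name is hypothetical, and keeping the sign bit of the flushed value is an assumption: the text above says only that denormals are forced to zero.

```python
def flush_denormal_single(bits: int) -> int:
    """Return a 32-bit pattern with denormalized values replaced by zero."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0 and fraction != 0:
        # Assumption: the sign bit survives; the manual does not state this.
        return bits & 0x80000000
    return bits
```

Normalized values, zeros, infinities, and NaNs pass through this model unchanged.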


Rounding Mode
The 'ACT8837 supports the four IEEE standard rounding modes: round to nearest,
round towards zero (truncate), round towards infinity (round up), and round towards
minus infinity (round down). The rounding function is selected by control pins RND1
and RND0, as shown in Table 18.
Table 18. Rounding Modes

RND1-RND0 | ROUNDING MODE SELECTED
0 0       | Round towards nearest
0 1       | Round towards zero (truncate)
1 0       | Round towards infinity (round up)
1 1       | Round towards negative infinity (round down)
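
The four modes can be tried out with Python's `decimal` module, which offers equivalent rounding rules. This is only an analogy in decimal arithmetic, not a model of the device's binary datapath. The table `ROUNDING_BY_RND` and the helper `round_to_integer` are hypothetical names, and the code assignment assumes RND1-RND0 = 00, 01, 10, 11 select nearest, zero, infinity, and negative infinity respectively.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_DOWN, ROUND_CEILING, ROUND_FLOOR

# Assumed mapping from RND1-RND0 to an equivalent decimal rounding rule.
ROUNDING_BY_RND = {
    0b00: ROUND_HALF_EVEN,  # round towards nearest (ties to even)
    0b01: ROUND_DOWN,       # round towards zero (truncate)
    0b10: ROUND_CEILING,    # round towards infinity (round up)
    0b11: ROUND_FLOOR,      # round towards negative infinity (round down)
}

def round_to_integer(value: str, rnd: int) -> Decimal:
    """Round a decimal string to an integer using the mode selected by rnd."""
    return Decimal(value).quantize(Decimal("1"), rounding=ROUNDING_BY_RND[rnd])
```

Rounding "2.5" and "-2.5" under each of the four codes shows how the directed modes differ from round to nearest.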

Test Pins
Two pins, TP1-TPO, support system testing. These may be used, for example, to place
all outputs in a high-impedance state, isolating the chip from the rest of the system
(see Table 19).
Table 19. Test Pin Control Inputs

TP1-TP0 | OPERATION
0 0     | All outputs and I/Os are forced low
0 1     | All outputs and I/Os are forced high
1 0     | All outputs are placed in a high impedance state
1 1     | Normal operation


Summary of Control Inputs
Control input signals for the 'ACT8837 are summarized in Table 20.


Table 20. Control Inputs

SIGNAL          | HIGH | LOW
BYTEP           | Selects byte parity generation and test | Selects single bit parity generation and test
CLK             | Clocks all registers except C | No effect
CLKC            | Clocks C register | No effect
CLKMODE         | Enables temporary input register load on falling clock edge | Enables temporary input register load on rising clock edge
CONFIG1-CONFIG0 | See Table 3 (RA and RB register data source selects) | See Table 3 (RA and RB register data source selects)
ENRA            | If register is not in flowthrough, enables clocking of RA register | If register is not in flowthrough, holds contents of RA register
ENRB            | If register is not in flowthrough, enables clocking of RB register | If register is not in flowthrough, holds contents of RB register
FAST            | Places device in FAST mode | Places device in IEEE mode
HALT            | No effect | Stalls device operation but does not affect registers, internal states, or status
OEC             | Disables compare pins | Enables compare pins
OES             | Disables status outputs | Enables status outputs
OEY             | Disables Y bus | Enables Y bus
PIPES2-PIPES0   | See Table 17 (pipeline mode control) | See Table 17 (pipeline mode control)
RESET           | No effect | Clears internal states and status but does not affect data registers
RND1-RND0       | See Table 18 (rounding mode control) | See Table 18 (rounding mode control)
SELOP7-SELOP0   | See Tables 6 and 7 (multiplier/ALU operand selection) | See Tables 6 and 7 (multiplier/ALU operand selection)
SELMS/LS        | Selects MSH of 64-bit result for output on the Y bus | Selects LSH of 64-bit result for output on the Y bus (no effect during single precision operation)
SELST1-SELST0   | See Table 15 (status output selection) | See Table 15 (status output selection)
SRCC            | Selects multiplier result for input to C register | Selects ALU result for input to C register
TP1-TP0         | See Table 19 (test pin control inputs) | See Table 19 (test pin control inputs)


INSTRUCTION SET
Configuration and operation of the 'ACT8837 can be selected to perform single- or
double-precision floating-point calculations in operating modes ranging from
flowthrough to fully pipelined. Timing and sequences of operations are affected by
settings of clock mode, data and status registers, input data configurations, and
rounding mode, as well as the instruction inputs controlling the ALU and the multiplier.
The ALU and the multiplier of the 'ACT8837 can operate either independently or
simultaneously, depending on the setting of instruction inputs I9-I0 and related controls.
Controls for data flow and status results are discussed separately, prior to the
discussions of ALU and multiplier operations. Then, in Tables 22 through 25, the
instruction inputs to the ALU and the multiplier are summarized according to operating
mode, whether independent or chained (ALU and multiplier in simultaneous operation).

Loading External Data Operands
Patterns of data input to the 'ACT8837 vary depending on the precision of the operands
and whether they are being input as A or B operands. Loading of external data operands
is controlled by the settings of CLKMODE and CONFIG1-CONFIG0, which determine
the clock timing and register destinations for data inputs.

Configuration Controls (CONFIG1-CONFIG0)

Three input registers are provided to handle input of data operands, either single
precision or double precision. The RA, RB, and temporary registers are each 64 bits
wide. The temporary register is only used during input of double-precision operands.

When single-precision or integer operands are loaded, the ordinary setting of
CONFIG1-CONFIG0 is LH, as shown in Table 4. This setting loads each 32-bit operand in the
most significant half (MSH) of its respective register. The operands are loaded into
the MSHs and adjusted to double precision because the data paths internal to the device
are all double precision. It is also possible to load single-precision operands with
CONFIG1-CONFIG0 set to HH, but two clock edges are required to load both the A
and B operands on the DA bus.

Double-precision operands are loaded by using the temporary register to store half
of the operands prior to inputting the other half of the operands on the DA and DB
buses. As shown in Tables 3 and 5, four configuration modes for selecting input sources
are available for loading data operands into the RA and RB registers.

CLKMODE Settings
Timing of double-precision data inputs is determined by the clock mode setting, which
allows the temporary register to be loaded on either the rising edge (CLKMODE = L)
or the falling edge of the clock (CLKMODE = H). Since the temporary register is not
used when single-precision operands are input, clock modes 0 and 1 are functionally
equivalent for single-precision operations.


The setting of CLKMODE can be used to speed up the loading of double-precision
operands. When the CLKMODE input is set high, data on the DA and DB buses are
loaded on the falling edge of the clock into the MSH and LSH, respectively, of the
temporary register. On the next rising edge, contents of the DA bus, DB bus, and
temporary register are loaded into the RA and RB registers, and execution of the current
instruction begins. The setting of CONFIG1-CONFIGO determines the exact pattern
in which operands are loaded, whether as MSH or LSH in RA or RB.
Double-precision operation in clock mode 0 is similar except that the temporary register
loads only on a rising edge. For this reason the RA and RB registers do not load until
the next rising edge, when all operands are available and execution can begin.
A considerable advantage in speed can be realized by performing double-precision ALU
operations with CLKMODE set high. In this clock mode both double-precision operands
can be loaded on successive clock edges, one falling and one rising, and the ALU
operation can be executed in the time from one rising edge of the clock to the next
rising edge. Both halves of a double-precision ALU result must be read out on the Y
bus within one clock cycle when the 'ACT8837 is operated in clock mode 1.
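
The difference between the two clock modes when loading double-precision operands can be summarized as a short schedule of clock edges. This is a descriptive sketch of the sequence given above, not a timing model; the function names and event strings are hypothetical.

```python
def load_schedule(clkmode: int) -> list:
    """List the clock edges used to load one pair of double-precision operands."""
    # CLKMODE = 1 loads the temporary register on a falling edge, CLKMODE = 0 on a rising edge.
    first_edge = "falling" if clkmode else "rising"
    return [
        (first_edge, "temporary register loads first half of operands from DA and DB"),
        ("rising", "RA and RB load from DA, DB, and temporary register; execution begins"),
    ]

def rising_edges_to_start(clkmode: int) -> int:
    """Count the rising edges consumed before execution can begin."""
    return sum(1 for edge, _ in load_schedule(clkmode) if edge == "rising")
```

In this model clock mode 1 needs one rising edge before execution begins, while clock mode 0 needs two, which is the speed advantage described above.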

Internal Register Operations
Six data registers in the 'ACT8837 are arranged in three levels along the data paths
through the multiplier and the ALU. Each level of registers can be enabled or disabled
independently of the other two levels by setting the appropriate PIPES2-PIPES0 inputs.
The RA and RB registers receive data inputs from the temporary register and the DA
and DB buses. Data operands are then multiplexed into the multiplier, ALU, or both.
To support simultaneous pipelined operations, the data paths through the multiplier
and the ALU are both provided with pipeline registers and output registers. The control
settings for the pipeline and output registers (PIPES2-PIPES1) are registered with the
instruction inputs I9-I0.

A seventh register, the constant (C) register, is available for storing a 64-bit constant
or an intermediate result from the multiplier or the ALU. The C register has a separate
clock input (CLKC) and input source select (SRCC). The SRCC input is not registered
with the instruction inputs. Depending on the operation selected and the settings of
PIPES2-PIPES0, an offset of one or more cycles may be necessary to load the desired
result into the C register.

Status results are also registered whenever the output registers are enabled. Duration
and availability of status results are affected by the same timing constraints that apply
to data results on the Y output bus.

Data Register Controls (PIPES2-PIPESO)
Table 17 shows the settings of the registers controlled by PIPES2-PIPES0. Operating
modes range from fully pipelined (PIPES2-PIPES0 = LLL) to flowthrough
(PIPES2-PIPES0 = HHH).

In flowthrough mode all three levels of registers are disabled, a circumstance which
may affect some double-precision operations. Since double-precision operands require
two steps to input, at least half of the data must be clocked into the temporary register
before the remaining data is placed on the DA and DB buses.
When all registers (except the C register) are enabled, timing constraints can become
critical for many double-precision operations. In clock mode 1, the ALU can perform
a double-precision operation and output a result during every clock cycle, and both
halves of the result must be read out before the end of the next cycle. Status outputs
are valid only for the period during which the Y output data is valid.
Similarly, double-precision multiplication is affected by pipelining, clock mode, and
sequence of operations. A double-precision multiply requires two cycles to execute,
depending on the settings of PIPES2-PIPESO. The output may be valid for one or two
cycles, depending on the precision of the next operation.
Duration of valid outputs at the Y multiplexer depends on settings of PIPES2-PIPESO
and CLKMODE, as well as whether all operations and operands are of the same type.
For example, when a double-precision multiply is followed by a single-precision
operation, one open clock cycle must intervene between the dissimilar operations.

C Register Controls (SRCC, CLKC)


The C register loads from the P or the S register output, depending on the setting of
SRCC, the load source select. SRCC = H selects the multiplier as input source.
Otherwise the ALU is selected when SRCC = L. In either case the C register only loads
the selected input on a rising edge of the CLKC signal.
The C register does not load directly from an external data bus. One method for loading
a constant without wasting a cycle is to input the value as an A operand during an
operation which uses only the ALU or multiplier and requires no external data inputs.
Since the B operand can be forced to zero in the ALU or to one in the multiplier, the
A operand can be passed to the C register either by adding zero or multiplying by one,
then selecting the input source with SRCC and causing the CLKC signal to go high.
Otherwise, the C register can be loaded through the ALU with the Pass A Operand
instruction, which requires a separate cycle.
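
The constant-loading method above can be sketched as a small behavioral model. The class `CRegister` and its `clock` method are hypothetical; the model simply captures the selected result when CLKC is pulsed, with SRCC high selecting the multiplier result and SRCC low selecting the ALU result.

```python
class CRegister:
    """Behavioral stand-in for the constant (C) register."""

    def __init__(self):
        self.value = None

    def clock(self, srcc: int, multiplier_result: float, alu_result: float) -> None:
        # Models a rising edge of CLKC: SRCC = 1 selects the multiplier, SRCC = 0 the ALU.
        self.value = multiplier_result if srcc else alu_result


constant = 3.25
via_multiplier = constant * 1.0  # B operand forced to one in the multiplier
via_alu = constant + 0.0         # B operand forced to zero in the ALU

c_register = CRegister()
c_register.clock(srcc=1, multiplier_result=via_multiplier, alu_result=via_alu)
```

Either path delivers the A operand unchanged, so the choice of SRCC only determines which unit is occupied while the constant is loaded.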

Operand Selection (SELOP7-SELOPO)
As shown in Tables 6 and 7, data operands can be selected as five possible sources,
including external inputs from the RA and RB registers, feedback from the P and S
registers, and a stored value in the C register. Contents of the C register may be selected
as either the A or the B operand in the ALU, the multiplier, or both. When an external
input is selected, the RA input always becomes the A operand, and the RB input is
the B operand.


Feedback from the ALU can be selected as the A operand to the multiplier or as the
B operand to the ALU. Similarly, multiplier feedback may be used as the A operand
to the ALU or the B operand to the multiplier.
Selection of operands also interacts with the selected operations in the ALU or the
multiplier. ALU operations with one operand are performed only on the A operand.
Also, depending on the instruction selected, the B operand may optionally be forced
to zero in the ALU or to one in the multiplier.

Rounding Controls (RND1-RNDO)
Because floating-point operations may involve both inherent and procedural errors,
it is important to select appropriate modes for handling rounding errors. To support
the IEEE standard for binary floating-point arithmetic, the 'ACT8837 provides four
rounding modes selected by RND1-RND0.
Table 18 shows the four selectable rounding modes. The usual default rounding mode
is round to nearest (RND1-RND0 = LL). In round-to-nearest mode, the 'ACT8837
supports the IEEE standard by rounding to even (LSB = 0) when two nearest
representable values are equally near. Directed rounding toward zero, infinity, or minus
infinity is also available.
Rounding mode should be selected to minimize procedural errors which may otherwise
accumulate and affect the accuracy of results. Rounding to nearest introduces a
procedural error not exceeding half of the least significant bit for each rounding
operation. Since rounding to nearest may involve rounding either upward or downward
in successive steps, rounding errors tend to cancel each other.

In contrast, directed rounding modes may introduce errors approaching one bit for
each rounding operation. Since successive rounding operations in a procedure may
all be similarly directed, each introducing up to a one-bit error, rounding errors may
accumulate rapidly, especially in single-precision operations.
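
The contrast between the two error behaviors can be checked with exact rational arithmetic. In this sketch the integers stand in for representable values, so one unit plays the role of one least significant bit; the sample values (multiples of 1/8) are an arbitrary illustrative choice. Python's `round` on a `Fraction` rounds ties to even, and `math.floor` acts as rounding toward minus infinity.

```python
from fractions import Fraction
import math

# Sample values spread evenly between 0 and 2, in steps of 1/8 of an "LSB".
values = [Fraction(k, 8) for k in range(16)]

# Signed error (rounded minus exact) for each value under the two modes.
nearest_errors = [round(v) - v for v in values]
floor_errors = [math.floor(v) - v for v in values]

total_nearest_error = sum(nearest_errors)  # errors of both signs cancel
total_floor_error = sum(floor_errors)      # errors all have the same sign
```

Each round-to-nearest error stays within half a unit and the errors cancel over this sample, while the directed errors all point the same way and add up.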

Status Exceptions
Status exceptions can result from one or more error conditions such as overflow,
underflow, operands in illegal formats, invalid operations, or rounding. Exceptions may
be grouped into two classes: input exceptions resulting from invalid operations or
denormal inputs to the multiplier, and output exceptions resulting from illegal formats,
rounding errors, or both.
To simplify the discussion of exception handling, it is useful to summarize the data
formats for representing IEEE floating-point numbers which can be input to or output
from the FPU (see Table 21). Since procedures for handling exceptions vary according
to the requirements of specific applications, this discussion focuses on the conditions
which cause particular status exceptions to be signalled by the FPU.

[Timing diagram: OUTPUT(31,0), STATUS(13,0); interval tpd1]

Figure 2. Single-Precision Operation, All Registers Disabled
(PIPES = 111, CLKMODE = 0)

The second example shows a microinstruction causing the ALU to compare absolute
values of A and B. Only the input registers are enabled (PIPES2-PIPESO = 110) so
the result is output in one clock cycle.
CLKMODE = 0     PIPES = 110     Operation: Compare |A|, |B|

I9-I0 = 00 0001 1010   CLKMODE = 0   CONFIG1-CONFIG0 = 01   PIPES2-PIPES0 = 110
SELOP7-SELOP0 = xxxx 1111   RND1-RND0 = 00   FAST = 0   ENRA = 1   ENRB = 1
SRCC = 0   SELMS/LS = 1   OEY = 0   OEC = 0   OES = 0   BYTEP = x
SELST1-SELST0 = 11   RESET = 1   HALT = 1   TP1-TP0 = 11

[Timing diagram: CLK; INSTRUCTION: FUNC(9,0), RND(1,0), FAST; DATA(31,0) A and B inputs;
OUT(31,0), STATUS(13,0). Events: load first operands and begin first operation; load
second operands and begin second operation. Intervals: tsu1, th1, tsu2, tpd1, tpd2.]

Figure 3. Single-Precision Operation, Input Registers Enabled
(PIPES = 110, CLKMODE = 0)
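
As a rough aid for reading the microinstruction listings in these examples, the leading fields can be packed into a bit string. The field order follows the column order of the listings (I9-I0, CLKMODE, CONFIG1-CONFIG0, PIPES2-PIPES0, SELOP7-SELOP0, RND1-RND0); treating them as one contiguous word is an assumption made for illustration, since the actual microword layout belongs to the system designer. The names `FIELD_WIDTHS` and `pack_fields` are hypothetical, and don't-care bits (x) are entered as 0 here.

```python
# (field name, width in bits), in the column order used by the examples.
FIELD_WIDTHS = [
    ("I", 10),
    ("CLKMODE", 1),
    ("CONFIG", 2),
    ("PIPES", 3),
    ("SELOP", 8),
    ("RND", 2),
]

def pack_fields(**values: int) -> str:
    """Concatenate the named fields into a bit string, most significant field first."""
    bits = []
    for name, width in FIELD_WIDTHS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        bits.append(format(value, f"0{width}b"))
    return "".join(bits)

# Compare |A|, |B| example: input registers enabled, CLKMODE = 0.
word = pack_fields(I=0b0000011010, CLKMODE=0, CONFIG=0b01,
                   PIPES=0b110, SELOP=0b00001111, RND=0b00)
```

The resulting string is 26 bits long, with the PIPES field occupying bit positions 13 through 15 counted from the left.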


Input and output registers are enabled in the third example, which shows the subtraction
B - A. Two clock cycles are required to load the operands, execute the subtraction,
and output the result (see Figure 4).
CLKMODE = 0     PIPES = 010     Operation: Subtract B - A

I9-I0 = 00 0000 0011   CLKMODE = 0   CONFIG1-CONFIG0 = 01   PIPES2-PIPES0 = 010
SELOP7-SELOP0 = xxxx 1111   RND1-RND0 = 00   FAST = 0   ENRA = 0   ENRB = 0
SRCC = 0   SELMS/LS = 1   OEY = 0   OEC = 0   OES = 0   BYTEP = x
SELST1-SELST0 = 11   RESET = 1   HALT = 1   TP1-TP0 = 11

[Timing diagram: CLK; DATA(31,0) A and B inputs. Events: load first operands and begin
first operation; load second operands and begin second operation. Intervals: td1, tsu2, th1.]

Double-Precision ALU Operations with CLKMODE = 0

The first example shows that, even in flowthrough mode, a clock signal is needed
to load the temporary register with half the data operands (see Figure 6). The selected
operation is executed without a clock after the remaining half of the data operands
are input on the DA and DB buses:
CLKMODE = 0     PIPES = 111     Operation: Add A + |B|

I9-I0 = 01 1000 1000   CLKMODE = 0   CONFIG1-CONFIG0 = 11   PIPES2-PIPES0 = 111
SELOP7-SELOP0 = xxxx 1111   RND1-RND0 = 00   FAST = 0   ENRA = 1   ENRB = 1
SRCC = 0   SELMS/LS = x   OEY = 0   OEC = 0   OES = 0   BYTEP = x
SELST1-SELST0 = 11   RESET = 1   HALT = 1   TP1-TP0 = 11

[Timing diagram: CLK; INSTRUCTION: FUNC(9,0), RND(1,0), FAST; DATA(31,0) A and B inputs;
SELMS/LS; OUT(31,0), STATUS(13,0). Events: load half of data, then rest of data.
Intervals: tsu1, tsu2, th1, tpd1.]

Figure 6. Double-Precision ALU Operation, All Registers Disabled
(PIPES = 111, CLKMODE = 0)

In the second example the input register is enabled (PIPES2-PIPES0 = 110). Operands
A and B for the instruction, |B| - |A|, are loaded using CONFIG = 00 so that B is
loaded first into the temporary register with MSH through the DA port and LSH through
the DB port. On the second clock rising edge, the A operand is loaded in the same
order directly to the RA register while B is loaded from the temporary register to the RB
register (see Figure 7).
CLKMODE = 0     PIPES = 110     Operation: |B| - |A|

I9-I0 = 01 1001 1011   CLKMODE = 0   CONFIG1-CONFIG0 = 00   PIPES2-PIPES0 = 110
SELOP7-SELOP0 = xxxx 1111   RND1-RND0 = 00   FAST = 0   ENRA = 1   ENRB = 1
SRCC = 0   SELMS/LS = x   OEY = 0   OEC = 0   OES = 0   BYTEP = x
SELST1-SELST0 = 11   RESET = 1   HALT = 1   TP1-TP0 = 11

[Timing diagram: CLK; INSTRUCTION: FUNC(9,0), RND(1,0), FAST; DATA(31,0) A and B inputs;
SELMS/LS; OUT(31,0), STATUS(13,0). Events: load half of operands, load rest of operands,
and begin operation, for the first, second, and third instructions.
Intervals: tsu1, th1, tsu2, tpd2, tpd5.]

Figure 7. Double-Precision ALU Operation, Input Registers Enabled
(PIPES = 110, CLKMODE = 0)

Both the input and output registers are enabled (PIPES2-PIPES0 = 010) in the third
example. The instruction sets up the ALU to wrap a denormalized number on the DA
input bus. The wrapped output can be fed back from the S register to the multiplier
input multiplexer by a later microinstruction. Timing for this operation is shown in
Figure 8.
CLKMODE = 0     PIPES = 010     Operation: Wrap Denormal Input

I9-I0 = 01 1010 1000   CLKMODE = 0   CONFIG1-CONFIG0 = 01   PIPES2-PIPES0 = 010
SELOP7-SELOP0 = xxxx 11xx   RND1-RND0 = 00   FAST = 0   ENRA = 1   ENRB = 1
SRCC = 0   SELMS/LS = x   OEY = 0   OEC = 0   OES = 0   BYTEP = x
SELST1-SELST0 = 11   RESET = 1   HALT = 1   TP1-TP0 = 11

[Timing diagram: CLK; INSTRUCTION: FUNC(9,0), RND(1,0), FAST; DATA(31,0) A and B inputs;
SELMS/LS; OUT(31,0), STATUS(13,0). Events: load half of operands, load rest of operands,
begin operation, and load output, for successive instructions.
Intervals: td1, tsu1, th1, tsu2, tpd4, tpd5.]

Figure 8. Double-Precision ALU Operation, Input and Output Registers Enabled
(PIPES = 010, CLKMODE = 0)

In the fourth example with CLKMODE = L, all three levels of internal registers are
enabled. The instruction converts a double-precision integer operand to a
double-precision floating-point operand. Figure 9 shows the timing for this operating mode.
CLKMODE = 0     PIPES = 000     Operation: Convert Integer to Floating Point

I9-I0 = 01 1010 0010   CLKMODE = 0   CONFIG1-CONFIG0 = 11   PIPES2-PIPES0 = 000
SELOP7-SELOP0 = xxxx 1100   RND1-RND0 = 00   FAST = 0   ENRA = 1   ENRB = 1
SRCC = 0   SELMS/LS = x   OEY = 0   OEC = 0   OES = 0   BYTEP = x
SELST1-SELST0 = 11   RESET = 1   HALT = 1   TP1-TP0 = 11

[Timing diagram: CLK; INSTRUCTION: FUNC(9,0), RND(1,0), FAST; DATA(31,0) A and B inputs;
SELMS/LS; OUT(31,0), STATUS(13,0). Events: load half of operands, load rest of operands,
begin operation, load pipeline, and load output, for successive instructions.
Intervals: td2, tsu1, th1, tsu2, tpd4, tpd5.]

Figure 9. Double-Precision ALU Operation, All Registers Enabled
(PIPES = 000, CLKMODE = 0)

Double-Precision ALU Operations with CLKMODE = 1
The next four examples are similar to the first four except that CLKMODE = H so that
the temporary register loads on the falling edge of the clock. When the ALU is operating
independently, setting CLKMODE high enables loading of both double-precision
operands on successive falling and rising clock edges.
In this clock mode a double-precision ALU operation requires one clock cycle to load
data inputs and execute, and both halves of the 64-bit result must be read out on
the 32-bit Y bus within one clock cycle. The settings of PIPES2-PIPES0 determine
the number of clock cycles which elapse between data input and result output.
In the first example all registers are disabled (PIPES2-PIPES0 = 111), and the addition
is performed in flowthrough mode. As shown in Figure 10, a falling clock edge is needed
to load half of the operands into the temporary register prior to loading the RA and
RB registers on the next rising clock.
CLKMODE = 1     PIPES = 111     Operation: Add A + |B|

I9-I0 = 01 1000 1000   CLKMODE = 1   CONFIG1-CONFIG0 = 11   PIPES2-PIPES0 = 111
SELOP7-SELOP0 = xxxx 1111   RND1-RND0 = 00   FAST = 0   ENRA = 1   ENRB = 1
SRCC = 0   SELMS/LS = x   OEY = 0   OEC = 0   OES = 0   BYTEP = x
SELST1-SELST0 = xx   RESET = 1   HALT = 1   TP1-TP0 = 11

[Timing diagram: CLK; INSTRUCTION: FUNC(9,0), RND(1,0), FAST; DATA(31,0) A and B inputs;
SELMS/LS; OUT(31,0), STATUS(13,0). Events: load half of operands, then rest of operands.
Intervals: tsu1, tsu2, th1, tpd1, tpd5.]

Figure 10. Double-Precision ALU Operation, All Registers Disabled
(PIPES = 111, CLKMODE = 1)

The second example executes subtraction of absolute values for both operands. Only
the RA and RB registers are enabled (PIPES2-PIPESO = 110). Timing is shown in
Figure 11.
CLKMODE = 1     PIPES = 110     Operation: Subtract |B| - |A|

I9-I0 = 01 1001 1011   CLKMODE, CONFIG1-CONFIG0, PIPES2-PIPES0 = 1 1 11 0
SELOP7-SELOP0 = xxxx 1111   RND1-RND0 = 00   FAST = 0   ENRA = 1   ENRB = 1
SRCC = 0   SELMS/LS = x   OEY = 0   OEC = 0   OES = 0   BYTEP = x
SELST1-SELST0 = xx   RESET = 1   HALT = 1   TP1-TP0 = 11

[Timing diagram: CLK; INSTRUCTION: FUNC(9,0), RND(1,0), FAST; DATA(31,0) A and B inputs;
SELMS/LS; OUT(31,0), STATUS(13,0). Events: load half of operands and load rest of
operands for the first, second, and third instructions. Intervals: tsu1, th1, tsu2, tpd2.]

Figure 11. Double-Precision ALU Operation, Input Registers Enabled
(PIPES = 110, CLKMODE = 1)

The third example shows a single denormalized operand being wrapped so that it can
be input to the multiplier. Both input and output registers are enabled
(PIPES2-PIPESO = 010). Timing is shown in Figure 12.
CLKMODE = 1     PIPES = 010     Operation: Wrap Denormal Input

I9-I0 = 01 1010 1000   CLKMODE = 1   CONFIG1-CONFIG0 = 11   PIPES2-PIPES0 = 010
SELOP7-SELOP0 = xxxx 11xx   RND1-RND0 = 00   FAST = 0   ENRA = 1   ENRB = 0
SRCC = 0   SELMS/LS = x   OEY = 0   OEC = 0   OES = 0   BYTEP = x
SELST1-SELST0 = xx   RESET = 1   HALT = 1   TP1-TP0 = 11

[Timing diagram: CLK; INSTRUCTION: FUNC(9,0), RND(1,0), FAST; DATA(31,0) A and B inputs;
SELMS/LS; OUT(31,0), STATUS(13,0). Events: load half of operands, load rest of operands,
begin operation, and load output, for successive instructions.
Intervals: td3, tsu1, th1, tsu2, tpd4, tpd5.]

Figure 12. Double-Precision ALU Operation, Input and Output Registers Enabled
(PIPES = 010, CLKMODE = 1)

The fourth example shows a conversion from integer to floating-point format. All three
levels of data registers are enabled (PIPES2-PIPES0 = 000) so that the FPU is fully pipelined
in this mode (see Figure 13).
CLKMODE = 1     PIPES = 000     Operation: Convert Integer to Floating Point

I9-I0 = 01 1010 0010   CLKMODE = 1   CONFIG1-CONFIG0 = 11   PIPES2-PIPES0 = 000
SELOP7-SELOP0 = xxxx 1100   RND1-RND0 = 00   FAST = 0   ENRA = 1   ENRB = 1
SRCC = x   SELMS/LS = x   OEY = 0   OEC = 0   OES = 0   BYTEP = x
SELST1-SELST0 = xx   RESET = 1   HALT = 1   TP1-TP0 = 11

[Timing diagram: CLK; INSTRUCTION: FUNC(9,0), RND(1,0), FAST; DATA(31,0) A and B inputs;
SELMS/LS; OUT(31,0), STATUS(13,0). Events: load half of operands, load rest of operands,
begin operation, load pipeline, and load output, for the first through fourth instructions.
Intervals: td2, tsu1, th1, tsu2, tpd4, tpd5.]

Figure 13. Double-Precision ALU Operation, All Registers Enabled
(PIPES = 000, CLKMODE = 1)

Double-Precision Multiplier Operations
Independent multiplier operations may also be performed in either clock mode and with
various registers enabled. As before, examples for the two clock modes are treated
separately. A double-precision multiply operation requires two clock cycles to execute
(except in flowthrough mode) and from one to three other clock cycles to load the
temporary register and to output the results, depending on the setting of
PIPES2-PIPESO.
Even in flowthrough mode (PIPES2-PIPESO = 111) two clock edges are required, the
first to load half of the operands in the temporary register and the second to load the
intermediate product in the multiplier pipeline register. Depending on the setting of
CLKMODE, loading the temporary register may be done on either a rising or a falling
edge.

Double-Precision Multiplication with CLKMODE = 0
In this first example, the A operand is multiplied by the absolute value of B operand.
Timing for the operation is shown in Figure 14:
CLKMODE = 0     PIPES = 111     Operation: Multiply A * |B|

I9-I0 = 01 1100 1000   CLKMODE = 0   CONFIG1-CONFIG0 = 11   PIPES2-PIPES0 = 111
SELOP7-SELOP0 = 1111 xxxx   RND1-RND0 = 00   FAST = 0   ENRA = x   ENRB = x
SRCC = x   SELMS/LS = x   OEY = 0   OEC = 0   OES = 0   BYTEP = x
SELST1-SELST0 = xx   RESET = 1   HALT = 1   TP1-TP0 = 11

[Timing diagram: CLK; INSTRUCTION: FUNC(9,0), RND(1,0), FAST; DATA(31,0) A and B inputs;
SELMS/LS; OUT(31,0), STATUS(13,0). Events: load half of first operands, then rest of
first operands; output half of first result, then rest of first result.
Intervals: tsu1, tsu2, th1, tsu3, tpd2, tpd5.]

Figure 14. Double-Precision Multiplier Operation, All Registers Disabled
(PIPES = 111, CLKMODE = 0)

The second example assumes that the RA and RB input registers are enabled. With
CLKMODE = 0 one clock cycle is required to input both the double-precision operands.
The multiplier is set up to calculate the negative product of the |A| and B operands:

CLKMODE = 0     PIPES = 110     Operation: Multiply -(|A| * B)

I9-I0 = 01 1101 0100   CLKMODE = 0   CONFIG1-CONFIG0 = 11   PIPES2-PIPES0 = 110
SELOP7-SELOP0 = 1111 xxxx   RND1-RND0 = 00   FAST = 0   ENRA = 1   ENRB = 1
SRCC = x   SELMS/LS = x   OEY = 0   OEC = 0   OES = 0   BYTEP = x
SELST1-SELST0 = xx   RESET = 1   HALT = 1   TP1-TP0 = 11

[Timing diagram: CLK; INSTRUCTION: FUNC(9,0), RND(1,0), FAST; DATA(31,0) A and B inputs;
SELMS/LS; OUT(31,0), STATUS(13,0). Events: load half of operands, load rest of operands,
begin operation, and load pipeline, for the first and second instructions.
Intervals: td2, tsu1, th1, tsu2, tpd2, tpd5.]

Figure 15. Double-Precision Multiplier Operation, Input Registers Enabled
(PIPES = 110, CLKMODE = 0)

Enabling both input and output registers in the third example adds a delay
of one clock cycle, as can be seen from Figure 16. The sample instruction sets up
calculation of the product of |A| and |B|:
CLKMODE = 0        PIPES = 010        Operation: Multiply |A| * |B|

I9-I0         CLKMODE  CONFIG1-0  PIPES2-0  SELOP7-0   RND1-0
01 1101 1000  0        10         010       1111 xxxx  00

FAST  ENRA  ENRB  SRCC  SELMS/LS  OEY  OEC  OES  BYTEP  SELST1-0  RESET  HALT  TP1-0
0     1     1     x     x         0    0    0    x      xx        1      1     11

[Timing diagram: CLK; INSTRUCTION: FUNC(9-0), RND(1-0), FAST (first, second, and third instructions); DATA(31-0) A and B inputs; SELMS/LS; OUT(31-0), STATUS(13-0). Labeled events: load half of first operands, load rest of first operands, begin first operation, load half of second operands, load rest of second operands, begin second operation. Labeled intervals: tsu1, tsu2, th1, tpd4.]

Figure 16. Double-Precision Multiplier Operation, Input and Output Registers Enabled
(PIPES = 010, CLKMODE = 0)

With all registers enabled, the fourth example shows a microinstruction to calculate
the negated product of operands A and B:
CLKMODE = 0        PIPES = 000        Operation: Multiply -(A * B)

I9-I0         CLKMODE  CONFIG1-0  PIPES2-0  SELOP7-0   RND1-0
01 1100 0100  0        01         000       1111 xxxx  00

FAST  ENRA  ENRB  SRCC  SELMS/LS  OEY  OEC  OES  BYTEP  SELST1-0  RESET  HALT  TP1-0
0     1     1     x     x         0    0    0    x      xx        1      1     11

PSEUDOCODE

A → RA, B → RB
C → RA, D → RB
A * B → P(AB)
P(AB) + 0 → S(AB)
E → RA, F → RB
C * D → P(CD)
S(AB) + P(CD) → S(AB + CD)
G → RA, H → RB
E * F → P(EF)
S(AB + CD) + P(EF) → S(AB + CD + EF)
G * H → P(GH)
S(AB + CD + EF) + P(GH) → S(AB + CD + EF + GH)

A microcode sequence to generate this sum of products is shown in Table 28. Only
three instructions in chained mode are required, since the multiplier begins the
calculation independently and the ALU completes it independently.
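The overlap of loading, multiplying, and accumulating described by the pseudocode can be sketched in software. The following Python model is only an illustration under simplifying assumptions: it uses one multiplier stage (P) and one adder stage (S), and the function name `pipelined_sum_of_products` is hypothetical rather than anything defined by the 'ACT8837.

```python
def pipelined_sum_of_products(pairs):
    """Accumulate sum(a * b) through input registers, a product register, and a sum register."""
    ra = rb = None   # RA and RB input registers (empty until the first load)
    p_reg = None     # P register: product of the operands loaded one clock earlier
    s_reg = 0.0      # S register: running sum of products
    # Two extra clocks let the last product drain through P and then S.
    stream = list(pairs) + [None, None]
    for operands in stream:
        # All registers load on the same clock edge, so the next values are
        # computed from the current register contents before any update.
        next_s = s_reg + p_reg if p_reg is not None else s_reg
        next_p = ra * rb if ra is not None else None
        ra, rb = operands if operands is not None else (None, None)
        s_reg, p_reg = next_s, next_p
    return s_reg


# A*B + C*D + E*F + G*H for four operand pairs
total = pipelined_sum_of_products([(1, 2), (3, 4), (5, 6), (7, 8)])
```

Each operand pair reaches the P register one clock after it is loaded and the S register one clock after that, which is why only the first multiply and the last add run outside chained mode.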


Table 28. Sample Microinstructions for Single-Precision Sum of Products

I9-I0         CLKMODE  CONFIG1-0  PIPES2-0  SELOP7-0   RND1-0
0001000000    0        01         010       1111 xxxx  00
1001100000    0        01         010       1111 xxxx  00
1000000000    0        01         010       1111 1010  00
1000000000    0        01         010       xxxx 1010  00
0000000000    0        01         010       xxxx 1010  00
xx xxxx xxxx  x        xx         xxx       xxxx xxxx  xx

S S
BEE
FEE S I
Y L L
OOOTSS
ANN R
S R R C LEE E E TT
TABCSYCS P 1-0

R
E H
SA TT
E L PP
T T 1-0

x
x
x
x
ox x x
x x x x

1 1 11
1 1 11
11
11
11
11

0
0
0
0

x
x
x
x
x
x

x
x
x
x
x
0

x
x
x
x
x

o

x
x
x
x
x

x
x
x
x
x
0 x

xx
xx
xx
xx
xx
xx

Fully Pipelined Double-Precision Operations
Performing fully pipelined double-precision operations requires a detailed understanding
of timing constraints imposed by the multiplier. In particular, sum of products and
product of sums operations can be executed very quickly, mostly in chained mode,
assuming that timing relationships between the ALU and the multiplier are coded
properly.
Pseudocode tables for these sequences are provided (Table 29 and Table 30), showing
how data and instructions are input in relation to the system clock. The overall patterns
of calculations for an extended sum of products and an extended product of sums
are presented. These examples assume FPU operation in CLKMODE 0, with the CONFIG
setting HL to load operands by MSH and LSH, all registers enabled
(PIPES2-PIPES0 = LLL), and the C register clock tied to the system clock.

In the sum of products timing table, the two initial products are generated in
independent multiplier mode. Several timing relationships should be noted in the table.
The first chained instruction loads and begins to execute following the sixth rising
edge of the clock, after the first product P1 has already been held in the P register
for one clock. For this reason, P1 is loaded into the C register so that P1 will be stable
for two clocks.
On the seventh clock, the ALU pipeline register loads with an unwanted sum, P1 + P1.
However, because the ALU timing is constrained by the multiplier, the S register will
not load until the rising edge of CLK9, when the ALU pipe contains the desired sum,
P1 + P2. The remaining sequence of chained operations then executes in the desired
manner.

Table 29. Pseudocode for Fully Pipelined Double-Precision Sum of Products
(CLKM = 0, CONFIG = 10, PIPES = 000, CLKC-SYSCLK)
ClK

I1
I2
I3
I4
I5
I6
I7
IS
I9
IlO

I11
S12

DA

DB

TEMP

INS

INS

RA

RB

MUl

P

C

ALU

BUS

BUS

REG

BUS

REG

REG

REG

PIPE

REG

REG

PIPE

A1 *B1

A1

B1

A2 MSH B2 MSH A2.B2MSH A2*B2 A1 *B1

A1

B1

A1 *B1

A2 LSH

A2

B2

A1 *B1

A2

B2

A2*B2

P1

A3

B3

A2*B2

P1

P1

A3

B3

A3*B3

P2

P1

P1 +P1

A4

B4

A3*B3

P2

P1

P1 +P1

A4

B4

A4*B4

P3

P2

S1 +P2

S1

A5

B5

A4*B4

P3

P3

S1 +P3

S1

A5

B5

A5*B5

P4

P3

XXXXX

S2

S

REG BUS

A1 MSH B1 MSH A1.B1MSH A1 *B1
A1 LSH

B1 LSH A1.B1MSH A1 *B1

B2 LSH A2.B2MSH A2*B2 A2*B2
PR+CR

A3 MSH B3 MSH A3.B3MSH

A3 LSH

B3 LSH A3.B3MSH

A4 MSH B4 MSH A4.B4MSH

A4 LSH

B4 LSH A4.B4MSH

A5 MSH B5 MSH A5.B5MSH

A5 LSH

B5.LSH A5.B5MSH

A6 MSH B6 MSH A6.B6(M)

---------- ...

----

A3*B3

A2*B2

PR+CR PR+CR.
A3*B3 A3*B3
PR+SR PR+SR.
A4*B4 A3*B3
PR+SR PR+SR.
A4*B4 A4*B4
PR+SR PR+SR.
A5*B5 A4*B4
PR+SR PR+SR.
A5*B5 A5*B5
PR+SR PR+SR.
A6*B6 A5*B5
-

- _.. _-

---

-

~

--

y

Table 30. Pseudocode for Fully Pipelined Double-Precision Product of Sums
(CLKM = 0, CONFIG = 10, PIPES = 000, CLKC-SYSCLK)
CLK

Sl
S2
S3
S4
S5

DA

DB

TEMP

INS

INS

RA

RB

MUL

P

C

ALU

BUS

BUS

REG

BUS

REG

REG

REG

PIPE

REG

REG

PIPE

Al(M)

Bl(M)

Al,Bl(M)

Al +Bl

Al(L)

Bl (L)

Al,Bl(M)

Al +Bl

Al +Bl

Al

Bl

A2(M)

B2(M)

A2,B2(M)

A2+B2 Al +Bl

Al

Bl

Al +Bl

A2(L)

B2(L)

A2,B2(M)

A2+B2 A2+B2

A2

B2

Al +Bl

51

A2+B2

A2

B2

51

A2+B2

51

CR*SR CR*SR
A3+B3 A3+B3

A3

B3

51

A2+B2

52

A3

B3

51 *52

51

A3+B3

52

PR*SR CR*SR ENRA=L ENRB=L
51 *52
A4+B4 A3+B3
A3
B3

51

A3+B3 XXX

A3(M)

B3(M)

A3,B3(M)

Sa

A3(L)

B3(L)

A3,B3(M)

S7

XXX

XXX

XXX

sa

A4(M)

B4(M)

A4,B4(M)

S9

A4(L)

B4(L)

A4,B4(M)

SlO

XXX

XXX

XXX

S11

A5(M)

B5(M)

A5,B5(M)

S12

A5(L)

B5(L)

A5,B5(M)

CR*SR
A3+B3

SP Add

CR*SR
A3+B3

PR*SR PR*SR

A4

B4

XXX

Pl

51

XXX

53

A4

B4

Pl *53

Pl

51

A4+B4

53

PR*SR PR*SR ENRA=L ENRB=L
Pl *53 XXX
A4
B4
A5+B5 A4+B4

51

A4+B4 XXX

PR*SR PR*SR
A5+B5 A5+B5

51

A4+B4 A4+B4
SPAdd

PR*SR
A4+B4

A5

85

XXX

U1

eX!

co

S

NOTE: On CLK 7 and CLK10, put 0000000000 (Single-Precision Add) on the instruction bus.

SN74ACT8837

P2

XXX

V

REG BUS

54

In the product of sums timing table, the two initial sums are generated in independent
ALU mode. The remaining operations are shown as alternating chained operations
followed by single-precision adds. The SP adds are necessary to provide an extra cycle
during which the multiplier outputs the current intermediate product. The current sum
and the latest intermediate product are then fed back to the multiplier inputs for the
next chained operations. In this manner, a double-precision product of sums is
generated in three system clocks, as opposed to two clocks for a double-precision
sum of products.

Mixed Operations and Operands
Using mixed-precision data operands or performing sequences of mixed operations
may require adjustments in timing, operand precision, and control settings. To simplify
microcoding sequences involving mixed operations, mixed-precision operands, or both,
it is useful to understand several specific requirements for mixed-mode or
mixed-precision processing.
Calculations involving mixed-precision operands must be performed as double-precision
operations (see Table 12). The instruction settings (I8-I7) should be set to indicate
the precision of each operand from the RA and RB input registers. (Feedback operands
from internal registers are also double-precision.) Mixed-precision operations should
not be performed in chained mode.

Timing for operations with mixed-precision operands is the same as for a corresponding
double-precision operation. In a mixed-precision operation, the single-precision operand
must be loaded into the upper half of its input register.

Most format conversions also involve double-precision timing. Conversions between
single- and double-precision floating point format are treated as mixed-precision
operations. During integer to floating point conversions, the integer input should be
loaded into the upper half of the RA register.

In applications where mixed-precision operations are not required, it is possible to tie
the I8-I7 instruction inputs together so that both controls always select the same
precision.
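The rule that a single-precision operand occupies the upper half of a 64-bit input register can be illustrated with a short Python sketch. The helper name `load_single_into_upper_half` is hypothetical, and treating the unused lower half as zero is an assumption made only for this illustration.

```python
import struct


def load_single_into_upper_half(value):
    """Return a 64-bit register image with the IEEE single in bits 63-32."""
    # Encode the value as a 32-bit IEEE single-precision bit pattern.
    bits32 = struct.unpack(">I", struct.pack(">f", value))[0]
    # Place it in the most significant half; the lower half is assumed zero here.
    return bits32 << 32


register_image = load_single_into_upper_half(3.0)
upper_half = register_image >> 32          # 0x40400000 for 3.0
lower_half = register_image & 0xFFFFFFFF   # 0 under the stated assumption
```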

Sequences of mixed operations may require changes in multiple control settings to
deal with changes in timing of input, execution, and output of results. Figure 22 shows
a simplified timing waveform for a series of mixed operations:

CLOCK CYCLE: 1 through 13

FUNCTION AND DATA (clock cycles 1 through 12):
A,B  A,B  C,D  C,D  (open)  E,F  G,H  G,H  I,J  I,J  K,L  M,N

RESULTS AND STATUS (in output order, ending in clock cycle 13):
XXXX  A,B  XXXX  C,D  E,F  E,F  G,H  G,H  I,J  K,L  M,N

A,B,C,D - double precision multiply; E,F - single precision operation; G,H,I,J - double
precision add; K,L - single precision operation. A double precision number is not required to
be held on the outputs for two cycles unless it is followed by a like double precision function.
If a double precision multiply is followed by single precision operation, there must be one open
clock cycle.

Figure 22. Mixed Operations and Operands
(PIPES2-PIPES0 = 110, CLKMODE = 0)
In this sequence, the fifth cycle is left open because a single-precision multiply follows
a double-precision multiply. If the SP multiply were input during the period following
the fourth rising clock edge, the result of the preceding operation would be overwritten,
since an SP multiply executes in one clock cycle. To avoid such a condition, the FPU
will not load during the required open cycle.
Because the sequence of mixed operations places constraints on output timing, only
one cycle is available to output the double-precision (C * D) result. By contrast, the
SP multiply (E * F) is available for two cycles because the operation which follows
it does not output a result in the period following the seventh rising clock edge. In
general, the precision and timing of each operation affects the timing of adjacent
operations.

Control settings for CLKMODE and registers must also be considered in relation to
precision and speed of execution. In Figure 23, a similar sequence of mixed operations
is set up for execution in fully pipelined mode:

CLOCK CYCLE: 1 through 13

FUNCTION AND DATA (in input order):
A,B  C,D  (open)  E,F  G,H  I,J  K,L  M,N  O,P  Q,R

RESULTS AND STATUS (in output order):
A,B  A,B  C,D  E,F  G,H  I,J  K,L  M,N  M,N

A,B,C,D - double precision multiply; E,F - single precision operation; G,H - double precision
add; I,J,K,L,M,N - single precision operation; O,P,Q,R - double precision multiply. In clock
mode 1, a double precision result is two cycles long only when a double precision multiply is
followed by a double precision multiply.

Figure 23. Mixed Operations and Operands
(PIPES2-PIPES0 = 000, CLKMODE = 1)
Although the data operands can be loaded in one clock cycle with CLKMODE set high,
enabling two additional internal registers delays the (A * B) result one cycle beyond
the previous example. Again, an open cycle is required after the (C * D) operation
because the next operation is single precision. The result of the (C * D) multiply is
available for one cycle instead of two, also because the following operation is single
precision. With this setting of CLKMODE and PIPES2-PIPES0, a double-precision result
is only available for two clock cycles when one DP multiply follows another DP multiply.

Matrix Operations

The 'ACT8837 floating point unit can also be used to perform matrix manipulations
involved in graphics processing or digital signal processing. The FPU multiplies and
adds data elements, executing sequences of microprogrammed calculations to form
new matrices.

Representation of Variables
In state representations of control systems, an n-th order linear differential equation
with constant coefficients can be represented as a sequence of n first-order linear
differential equations expressed in terms of state variables:

$$\frac{dx_1}{dt} = x_2,\ \ldots,\ \frac{dx_{n-1}}{dt} = x_n$$

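For example, in vector-matrix form the equations of an nth-order system can be
represented as follows:

$$\frac{d}{dt}\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} =
\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} +
\begin{bmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{n1} & \cdots & b_{nn} \end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}$$

or, $X = ax + bu$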

Expanding the matrix equation for one state variable, dx1/dt, results in the following
expression:

$$X_1 = (a_{11} * x_1 + \cdots + a_{1n} * x_n) + (b_{11} * u_1 + \cdots + b_{1n} * u_n)$$

where $X_1 = dx_1/dt$.
Sequences of multiplications and additions are required when such state space
transformations are performed, and the 'ACT8837 has been designed to support such
sum-of-products operations. An n x n matrix A multiplied by an n x n matrix X yields
an n x n matrix C whose elements cij are given by this equation:

$$c_{ij} = \sum_{k=1}^{n} a_{ik} * x_{kj} \qquad \text{for } i = 1, \ldots, n;\ j = 1, \ldots, n \qquad (1)$$

For the cij elements to be calculated by the 'ACT8837, the corresponding elements
aik and xkj must be stored outside the 'ACT8837 and fed to the 'ACT8837 in the
proper order required to effect a matrix multiplication such as the state space system
representation just discussed.
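Equation (1) can be written out directly as nested loops. The Python sketch below is only a software illustration of the order in which the aik and xkj elements are consumed; the function name `matrix_product` is hypothetical and says nothing about how the operands are actually stored or sequenced in a given system.

```python
def matrix_product(a, x):
    """Compute C = A * X for n x n matrices using equation (1)."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0.0
            # Sum of products over k: one multiply and one add per term.
            for k in range(n):
                acc += a[i][k] * x[k][j]
            c[i][j] = acc
    return c


example = matrix_product([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Each cij costs n multiplications and n additions, which is the workload the sum-of-products sequences in this section are meant to pipeline.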

Sample Matrix Transformation
The matrix manipulations commonly performed in graphics systems can be regarded
as geometrical transformations of graphic objects. A matrix operation on another matrix
representing a graphic object may result in scaling, rotating, transforming, distorting,
or generating a perspective view of the image. By performing a matrix operation on
the position vectors which define the vertices of an image surface, the shape and
position of the surface can be manipulated.


The generalized 4 x 4 matrix for transforming a three-dimensional object with
homogeneous coordinates is shown below:

$$T = \left[\begin{array}{ccc|c} a & b & c & d \\ e & f & g & h \\ i & j & k & l \\ \hline m & n & o & p \end{array}\right]$$

The matrix T can be partitioned into four component matrices, each of which produces
a specific effect on the resultant image:

$$T = \left[\begin{array}{c|c} 3 \times 3 & 3 \times 1 \\ \hline 1 \times 3 & 1 \times 1 \end{array}\right]$$

The 3 x 3 matrix produces linear transformation in the form of scaling, shearing and
rotation. The 1 x 3 row matrix produces translation, while the 3 x 1 column matrix
produces perspective transformation with multiple vanishing points. The final single
element 1 x 1 produces overall scaling. Overall operation of the transformation matrix
T on the position vectors of a graphic object produces a combination of shearing,
rotation, reflection, translation, perspective, and overall scaling.

The rotation of an object about an arbitrary axis in a three-dimensional space can be
carried out by first translating the object such that the desired axis of rotation passes
through the origin of the coordinate system, then rotating the object about the axis
through the origin, and finally translating the rotated object such that the axis of rotation
resumes its initial position. If the axis of rotation passes through the point P = [a b c 1],
then the transformation matrix is representable in this form:
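$$[x\ y\ z\ h] = [x\ y\ z\ 1]
\underbrace{\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -a & -b & -c & 1 \end{bmatrix}}_{\text{translation to origin}}
\underbrace{\ R\ }_{\text{rotation about origin}}
\underbrace{\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ a & b & c & 1 \end{bmatrix}}_{\text{translation back to initial position}} \qquad (2)$$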

where R may be expressed as:

$$R = \begin{bmatrix}
n_1^2 + (1 - n_1^2)\cos\phi & n_1 n_2 (1 - \cos\phi) + n_3 \sin\phi & n_1 n_3 (1 - \cos\phi) - n_2 \sin\phi & 0 \\
n_1 n_2 (1 - \cos\phi) - n_3 \sin\phi & n_2^2 + (1 - n_2^2)\cos\phi & n_2 n_3 (1 - \cos\phi) + n_1 \sin\phi & 0 \\
n_1 n_3 (1 - \cos\phi) + n_2 \sin\phi & n_2 n_3 (1 - \cos\phi) - n_1 \sin\phi & n_3^2 + (1 - n_3^2)\cos\phi & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}$$

and

$n_1 = q_1 / (q_1^2 + q_2^2 + q_3^2)^{1/2}$ = direction cosine for x-axis of rotation
$n_2 = q_2 / (q_1^2 + q_2^2 + q_3^2)^{1/2}$ = direction cosine for y-axis of rotation
$n_3 = q_3 / (q_1^2 + q_2^2 + q_3^2)^{1/2}$ = direction cosine for z-axis of rotation
$n = (n_1\ n_2\ n_3)$ = unit vector for Q
$Q$ = vector defining axis of rotation = $[q_1\ q_2\ q_3]$
$\phi$ = the rotation angle about Q

A general rotation using equation (2) is effected by determining the [x y z] coordinates
of a point A to be rotated on the object, the direction cosines of the axis of rotation
[n1, n2, n3], and the angle φ of rotation about the axis, all of which are needed to
define matrix [R]. Suppose, for example, that a tetrahedron ABCD, represented by
the coordinate matrix below, is to be rotated about an axis of rotation RX which passes
through a point P = [5 -6 3 1] and whose direction cosines are given by unit vector
[n1 = 0.866, n2 = 0.5, n3 = 0.707]. The angle of rotation φ is 90 degrees (see
Figure 24). The rotation matrix [R] becomes

$$\begin{bmatrix} 2 & -3 & 3 \\ 1 & -2 & 2 \\ 2 & -1 & 2 \\ 2 & -2 & 2 \end{bmatrix}
\begin{matrix} A \\ B \\ C \\ D \end{matrix}
\qquad
R = \begin{bmatrix} 0.750 & 1.140 & 0.112 & 0 \\ -0.274 & 0.250 & 1.220 & 0 \\ 1.112 & -0.513 & 0.500 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
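As a cross-check of the numerical rotation matrix, the general form of R can be evaluated for the example's direction cosines and rotation angle. The Python sketch below is an illustration only: the function name `rotation_matrix` is hypothetical, and the direction cosines are used exactly as given in the example.

```python
import math


def rotation_matrix(n1, n2, n3, phi):
    """Build the 4 x 4 rotation matrix R for direction cosines (n1, n2, n3) and angle phi in radians."""
    c = math.cos(phi)
    s = math.sin(phi)
    v = 1.0 - c
    return [
        [n1 * n1 + (1.0 - n1 * n1) * c, n1 * n2 * v + n3 * s, n1 * n3 * v - n2 * s, 0.0],
        [n1 * n2 * v - n3 * s, n2 * n2 + (1.0 - n2 * n2) * c, n2 * n3 * v + n1 * s, 0.0],
        [n1 * n3 * v + n2 * s, n2 * n3 * v - n1 * s, n3 * n3 + (1.0 - n3 * n3) * c, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


# Example values from the text: n = [0.866, 0.5, 0.707], rotation of 90 degrees
R = rotation_matrix(0.866, 0.5, 0.707, math.radians(90.0))
```

With these inputs every element agrees with the three-decimal matrix quoted in the text to within rounding of the last digit.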

[Figure: X, Y, Z axes showing the tetrahedron ABCD, its translated image AT BT CT DT, the rotated image, and the final image A' B' C' D'; axis of rotation Q through P (5, -6, 3); rotation of 90°.]

(1) THIS ARROW DEPICTS THE FIRST TRANSLATION
(2) THIS ARROW DEPICTS THE 90° ROTATION
(3) THIS ARROW DEPICTS THE BACK TRANSLATION

Figure 24. Sequence of Matrix Operations

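The point transformation equation (2) can be expanded to include all the vertices of
the tetrahedron as follows:

$$\begin{bmatrix} x_a & y_a & z_a & h_1 \\ x_b & y_b & z_b & h_2 \\ x_c & y_c & z_c & h_3 \\ x_d & y_d & z_d & h_4 \end{bmatrix} =
\begin{bmatrix} 2 & -3 & 3 & 1 \\ 1 & -2 & 2 & 1 \\ 2 & -1 & 2 & 1 \\ 2 & -2 & 2 & 1 \end{bmatrix}
\underbrace{\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -5 & 6 & -3 & 1 \end{bmatrix}}_{\text{translation to origin}}
\underbrace{\begin{bmatrix} 0.750 & 1.140 & 0.112 & 0 \\ -0.274 & 0.250 & 1.220 & 0 \\ 1.112 & -0.513 & 0.500 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}}_{\text{rotation about origin}}
\underbrace{\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 5 & -6 & 3 & 1 \end{bmatrix}}_{\text{translation back to initial position}} \qquad (3)$$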

The 'ACT8837 floating-point unit can perform matrix manipulation involving
multiplications and additions such as those represented by equation (1). The matrix
equation (3) can be solved by using the 'ACT8837 to compute, as a first step, the
product of the coordinate matrix and the first translation matrix on the right-hand
side of equation (3), in that order. The second step postmultiplies this product matrix
by the rotation matrix. The third step implements the back-translation by
postmultiplying the matrix result from the second step by the second translation
matrix of equation (3). Details of the procedure to produce a three-dimensional rotation
about an arbitrary axis are explained in the following steps:
Step 1
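Translate the tetrahedron so that the axis of rotation passes through the origin. This
process can be accomplished by multiplying the coordinate matrix by the translation
matrix as follows:

$$\begin{bmatrix} 2 & -3 & 3 & 1 \\ 1 & -2 & 2 & 1 \\ 2 & -1 & 2 & 1 \\ 2 & -2 & 2 & 1 \end{bmatrix}
\underbrace{\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -5 & 6 & -3 & 1 \end{bmatrix}}_{\text{translation to origin}} =
\begin{bmatrix} (2-5) & (-3+6) & (3-3) & 1 \\ (1-5) & (-2+6) & (2-3) & 1 \\ (2-5) & (-1+6) & (2-3) & 1 \\ (2-5) & (-2+6) & (2-3) & 1 \end{bmatrix} =
\underbrace{\begin{bmatrix} -3 & +3 & 0 & 1 \\ -4 & +4 & -1 & 1 \\ -3 & +5 & -1 & 1 \\ -3 & +4 & -1 & 1 \end{bmatrix}}_{\text{vertices of translated tetrahedron}}
\begin{matrix} A_T \\ B_T \\ C_T \\ D_T \end{matrix}$$

The 'ACT8837 could compute the translated coordinates AT, BT, CT, DT as indicated
above. However, an alternative method resulting in a more compact solution is
presented below.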

Step 2
Rotate the tetrahedron about the axis of rotation which passes through the origin after
the translation of Step 1. To implement the rotation of the tetrahedron, postmultiply
the translated coordinate matrix from Step 1 by the rotation matrix [R]. The resultant
matrix represents the rotated coordinates of the tetrahedron about the origin as follows:

$$\begin{bmatrix} -3 & 3 & 0 & 1 \\ -4 & 4 & -1 & 1 \\ -3 & 5 & -1 & 1 \\ -3 & 4 & -1 & 1 \end{bmatrix}
\underbrace{\begin{bmatrix} 0.750 & 1.140 & 0.112 & 0 \\ -0.274 & 0.250 & 1.220 & 0 \\ 1.112 & -0.513 & 0.500 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}}_{\text{rotation about origin}} =
\underbrace{\begin{bmatrix} -3.072 & -2.670 & 3.324 & 1 \\ -5.208 & -3.047 & 3.932 & 1 \\ -4.732 & -1.657 & 5.264 & 1 \\ -4.458 & -1.907 & 4.044 & 1 \end{bmatrix}}_{\text{rotated coordinates}}$$

Step 3
Translate the rotated tetrahedron back to the original coordinate space. This is done
by postmultiplying the resultant matrix of Step 2 by the translation matrix. The following
calculation produces the final coordinate matrix of the transformed object:

$$\begin{bmatrix} -3.072 & -2.670 & 3.324 & 1 \\ -5.208 & -3.047 & 3.932 & 1 \\ -4.732 & -1.657 & 5.264 & 1 \\ -4.458 & -1.907 & 4.044 & 1 \end{bmatrix}
\underbrace{\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 5 & -6 & 3 & 1 \end{bmatrix}}_{\text{translate back}} =
\underbrace{\begin{bmatrix} 1.928 & -8.670 & 6.324 & 1 \\ -0.208 & -9.047 & 6.932 & 1 \\ 0.268 & -7.657 & 8.264 & 1 \\ 0.542 & -7.907 & 7.044 & 1 \end{bmatrix}}_{\text{final rotated coordinates}}$$
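The three steps can be reproduced numerically with ordinary matrix products. The Python sketch below is an illustration rather than a model of the 'ACT8837 microcode: the helper `matmul` and the variable names are hypothetical, and the matrices are the rounded values printed in the steps above.

```python
def matmul(a, b):
    """Multiply matrix a (m x n) by matrix b (n x p) as sums of products."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


coords = [[2, -3, 3, 1], [1, -2, 2, 1], [2, -1, 2, 1], [2, -2, 2, 1]]   # vertices A, B, C, D
to_origin = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [-5, 6, -3, 1]]
rotation = [[0.750, 1.140, 0.112, 0], [-0.274, 0.250, 1.220, 0],
            [1.112, -0.513, 0.500, 0], [0, 0, 0, 1]]
back = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [5, -6, 3, 1]]

translated = matmul(coords, to_origin)   # Step 1: translate to the origin
rotated = matmul(translated, rotation)   # Step 2: rotate about the origin
final = matmul(rotated, back)            # Step 3: translate back
```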

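A more compact solution to these transformation matrices is a product matrix that
combines the two translation matrices and the rotation matrix in the order shown in
equation (3). Equation (3) will then take the following form:

$$\begin{bmatrix} x_a & y_a & z_a & h_1 \\ x_b & y_b & z_b & h_2 \\ x_c & y_c & z_c & h_3 \\ x_d & y_d & z_d & h_4 \end{bmatrix} =
\begin{bmatrix} 2 & -3 & 3 & 1 \\ 1 & -2 & 2 & 1 \\ 2 & -1 & 2 & 1 \\ 2 & -2 & 2 & 1 \end{bmatrix}
\underbrace{\begin{bmatrix} 0.750 & 1.140 & 0.112 & 0 \\ -0.274 & 0.250 & 1.220 & 0 \\ 1.112 & -0.513 & 0.500 & 0 \\ -3.730 & -8.661 & 8.260 & 1 \end{bmatrix}}_{\text{transformation matrix}}$$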
The newly transformed coordinates resulting from the postmultiplication of the
coordinate matrix of the tetrahedron by the transformation matrix can be computed
using equation (1), which was cited previously:

$$c_{ij} = \sum_{k=1}^{n} a_{ik} * x_{kj} \qquad \text{for } i = 1, \ldots, n;\ j = 1, \ldots, n \qquad (1)$$
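The combined transformation matrix can be checked by multiplying the two translation matrices and the rotation matrix in the order of equation (3). This Python sketch is illustrative only; `matmul` and the variable names are hypothetical helpers, and the rotation entries are the rounded values used in the text.

```python
def matmul(a, b):
    """Multiply matrix a (m x n) by matrix b (n x p) as sums of products."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


to_origin = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [-5, 6, -3, 1]]
rotation = [[0.750, 1.140, 0.112, 0], [-0.274, 0.250, 1.220, 0],
            [1.112, -0.513, 0.500, 0], [0, 0, 0, 1]]
back = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [5, -6, 3, 1]]

# Combine translation, rotation, and back-translation into one matrix.
transform = matmul(matmul(to_origin, rotation), back)

# One matrix product then transforms all four vertices.
coords = [[2, -3, 3, 1], [1, -2, 2, 1], [2, -1, 2, 1], [2, -2, 2, 1]]
final = matmul(coords, transform)
```

Precomputing the combined matrix replaces three matrix products per object with one, at the cost of recomputing the matrix whenever the axis or angle changes.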

MULTIPLIER/ALU OPERATIONS        PSEUDOCODE
Load a11, x11                    a11 → RA, x11 → RB
SP Multiply                      p1 = a11 * x11
                                 a12 → RA, x21 → RB
                                 p2 = a12 * x21
                                 p1 → P(p1)

The h-scalars h1, h2, h3, and h4 are equal to 1. The number of clock cycles to generate
each 4-tuple can then be decreased from 16 to 13 cycles. The total number of clock cycles
to calculate all four vertices is reduced from 66 to 54 clocks. Figure 25 summarizes
the overall matrix transformation.

[Figure: X, Y, Z axes showing the original tetrahedron and its transformed image A' B' C' D' after the 90° rotation about the axis through P (5, -6, 3).]

Figure 25. Resultant Matrix Transformation
This microprogram can also be written to calculate sums of products with all pipeline
registers enabled so that the FPU can operate in its fastest mode. Because of timing
relationships, the C register is used in some steps to hold the intermediate sum of
products. Latency due to pipelining and chained data manipulation is 11 cycles for
calculation of the first coordinate, and four cycles each for the other three coordinates.

After calculation of the first vertex, 16 cycles are required to calculate the four
coordinates of each subsequent vertex. Table 33 presents the sequence of calculations
for the first two coordinates, xa and ya.

Table 33. Fully Pipelined Sum of Products (PIPES2-PIPES0 = 000)
(Bus or Register Contents Following Each Rising Clock Edge)

CLOCK  I    DA   DB   I    RA   RB   MUL   ALU   P    S    C    Y
CYCLE  BUS  BUS  BUS  REG  REG  REG  PIPE  PIPE  REG  REG  REG  BUS
0      Mul  x11  a11
1      Mul  x21  a12  Mul  x11  a11
2      Chn  x31  a13  Mul  x21  a12  p1
3      Mul  x41  a14  Chn  x31  a13  p2          p1
4      Chn  x12  a11  Mul  x41  a14  p3    s1    p2
5      Chn  x22  a12  Chn  x12  a11  p4    †     p3   s1   p2
6      Chn  x32  a13  Chn  x22  a12  p5    s2    p4   †    p2
7      Chn  x42  a14  Chn  x32  a13  p6    s3    p5   s2   p2
8      Chn  x13  a11  Chn  x42  a14  p7    s4    p6   s3   s2
9      Chn  x23  a12  Chn  x13  a11  p8    xa    p7   s4   p6
10     Chn  x33  a13  Chn  x23  a12  p9    s5    p8   xa   p6   xa
11     Chn  x43  a14  Chn  x33  a13  p10   s6    p9   s5   p6
12     Chn  x14  a11  Chn  x43  a14  p11   s7    p10  s6   s5
13     Chn  x24  a12  Chn  x14  a11  p12   ya    p11  s7   p10
14     Chn  x34  a13  Chn  x24  a12  p13   s8    p12  ya   p10  ya
15     Chn  x44  a14  Chn  x34  a13  p14   s9    p13  s8   p10

† Contents of this register are not valid during this cycle.
Products in Table 33 are numbered according to the clock cycle in which the operands
and instruction were loaded into the RA, RB, and I register, and execution of the
instruction began. Sums indicated in Table 33 are listed below:

s1 = p1 + 0        s5 = p5 + p7        s9 = p10 + p12
s2 = p1 + p3       s6 = p6 + p8        xa = p1 + p2 + p3 + p4
s3 = p2 + p4       s7 = p9 + 0         ya = p5 + p6 + p7 + p8
s4 = p5 + 0        s8 = p9 + p11
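The interleaved partial sums listed for Table 33 can be traced in a few lines of Python. This is a sketch under the assumption that the final coordinate is formed by adding the two partial sums s2 and s3; the function name `interleaved_dot` is hypothetical, and the numeric example reuses the tetrahedron data from the matrix transformation section.

```python
def interleaved_dot(a_row, x_col):
    """Form a four-term sum of products from two interleaved partial sums."""
    p1, p2, p3, p4 = (a * x for a, x in zip(a_row, x_col))
    s1 = p1 + 0.0     # first product passes through the ALU unchanged
    s2 = s1 + p3      # odd-numbered products accumulate in one chain
    s3 = p2 + p4      # even-numbered products accumulate in the other
    return s2 + s3    # equals p1 + p2 + p3 + p4


# Vertex A = [2 -3 3 1] times the first column of the transformation matrix
xa = interleaved_dot([2, -3, 3, 1], [0.750, -0.274, 1.112, -3.730])
```

Splitting the accumulation into two chains gives each addition time to clear the ALU pipeline before its result is needed again.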

SAMPLE MICROPROGRAMS FOR BINARY DIVISION AND SQUARE ROOT
The SN74ACT8837 Floating Point Unit supports binary division and square root
calculations using the Newton-Raphson algorithm. The 'ACT8837 performs these
calculations by executing sequences of floating-point operations according to the
control settings contained in specific microprogrammed routines. This implementation
of the Newton-Raphson algorithm requires that a seed ROM provide values for the
first approximations of the reciprocals of the divisors.
This application note presents several microprograms for floating-point division and
square root using the Newton-Raphson algorithm. Each sample program is analyzed
briefly to show details of the floating-point procedures being performed.

Binary Division Using the Newton-Raphson Algorithm
Binary division can be performed as an iterative procedure using the Newton-Raphson
algorithm. For a dividend A, divisor B, and quotient Q, this procedure calculates a value
for 1/B which is then used to evaluate the expression Q = A * 1/B. The calculation
can be performed with either single- or double-precision operands, and examples of
each precision are shown.

The basic algorithm calculates the value of a quotient Q by approximating the reciprocal
of the divisor B to adequate precision and then multiplying the dividend A by the
approximation of the reciprocal:

$$Q = A/B = A * X_n$$

where $X_n$ = the value of X after the nth iteration
and $n$ = the number of iterations to achieve the desired precision

Intermediate values of X are calculated using the following expression:

$$X_{i+1} = X_i * (2 - B * X_i)$$

where $X_0$ approximates 1/B for the range $0 < X_0 < 2/B$

To illustrate a program using the Newton-Raphson algorithm, the sequence of
calculations is presented in detail. For double-precision operations, three iterations are
needed to achieve adequate precision in the value of 1/B. A value for the seed X0
(approximately equal to 1/B) is assumed to be given, and the following operations are
performed to evaluate Q from double-precision inputs:
$$X_1 = X_0 (2 - B * X_0)$$

$$X_2 = X_1 (2 - B * X_1) = X_0 (2 - B * X_0) * (2 - B * X_0 (2 - B * X_0))$$

$$X_3 = X_2 (2 - B * X_2) = \underbrace{\underbrace{X_0 (2 - B * X_0)}_{X_1} * (2 - B * \underbrace{X_0 (2 - B * X_0)}_{X_1})}_{X_2} * (2 - B * \underbrace{X_0 * (2 - B * X_0) * (2 - B * X_0 * (2 - B * X_0))}_{X_2})$$

$$Q = A/B = A * 1/B = A * X_3 = A * X_0 (2 - B * X_0) * (2 - B * X_0 (2 - B * X_0)) * (2 - B * X_0 * (2 - B * X_0) * (2 - B * X_0 * (2 - B * X_0)))$$

Table 34 presents decimal and hexadecimal values for A, B, and X0, which are used
in the sample calculation. The computed value of the quotient Q is also included,
showing the representations of the results of this sample division.

Table 34. Sample Data Values and Representations

TERM  VALUE  DECIMAL REPRESENTATION     IEEE HEXADECIMAL
             MANTISSA * 2^EXPONENT      REPRESENTATION
A     22     1.375 * 2^4                40360000 00000000
B     7      1.75 * 2^2                 401C0000 00000000
X0    1/7    1.140625 * 2^(-3)          3FC24000 00000000
Q     22/7   1.5714285714285713 * 2^1   40092492 49249249
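The hexadecimal representations of the sample values can be regenerated with Python's standard `struct` module. This is a small verification sketch; the helper name `ieee_double_hex` is hypothetical.

```python
import struct


def ieee_double_hex(value):
    """Return the 64-bit IEEE double-precision pattern of value as 16 hex digits."""
    return struct.pack(">d", value).hex().upper()


a_hex = ieee_double_hex(22.0)                  # dividend A
b_hex = ieee_double_hex(7.0)                   # divisor B
x0_hex = ieee_double_hex(1.140625 * 2 ** -3)   # seed X0
q_hex = ieee_double_hex(22.0 / 7.0)            # quotient Q
```

The first eight hex digits of each result correspond to the most significant half (MSH) and the last eight to the least significant half (LSH) of the operand.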

In Table 35, the sequence and timing of this procedure are shown exactly as performed
by the 'ACT8837. This example shows the steps in a double-precision division requiring
three iterations to achieve the desired accuracy. In this table each operation is
sequenced according to the clock cycles during which the instruction inputs for that
operation are presented at the pins of the 'ACT8837. Operations are accompanied
by a pseudocode summary of the operations performed by the 'ACT8837 and the clock
cycle when an operand is available or a result is valid.
Each line of pseudocode indicates the operands being used, the operations being
performed, the registers involved, and the clock cycles when the results appear. Each
register is represented by its usual abbreviation (RA, RB, P, S, or C) followed by the
number of the clock cycle when an operand will be valid or available at the register.
For example, "P.4" refers to the contents of the Product Register after the fourth
clock cycle.
Table 35. Binary Division Using the Newton-Raphson Algorithm

CLOCK
CYCLES   OPERATIONS              PSEUDOCODE
1, 2     B * X0                  B → RA.2, X0 → RB.2, RA.2 * RB.2 → P.4
3, 4     2 - B * X0              2 - P.4 → S.6
5, 6     X1 = X0(2 - B * X0)     RB.2 * S.6 → P.8
7, 8     B * X1                  RA.2 * P.8 → P.10
9, 10    2 - B * X1              P.8 → C.9, 2 - P.10 → S.12
11, 12   X2 = X1(2 - B * X1)     C.9 * S.12 → P.14
13, 14   B * X2                  RA.2 * P.14 → P.16
15, 16   2 - B * X2              P.14 → C.15, 2 - P.16 → S.18
17, 18   X3 = X2(2 - B * X2)     A → RA.18, C.15 * S.18 → P.20
19, 20   A * X3                  RA.18 * P.20 → P.22
21, 22   Output MSH              P.22.MSH → Y

The sequence of operations can be microcoded for execution exactly as listed in the
table above. Sample microprograms (with data and parity fields provided) are given
below. To make the programs easier to follow, comment lines have been included to
indicate clock timing, calculation performed by the instructions being loaded, and
operations being represented, in the same pseudocode as in the preceding table. The
fields in the microinstruction sequences presented below are arranged in the following
order:

19 0 0 040 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1
20 1 0 040 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1

3 00000000 00000000 0 0
3 00000000 00000000 0 0

;Lines 21-22    Operation: P.22 → Y

21 0 0 020 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1
22 1 0 020 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1

3 00000000 00000000 0 0
3 00000000 00000000 0 0

Double-Precision Newton-Raphson Binary Division
If the value of B is given as a double-precision number and X0 is looked up in a
double-precision seed ROM, no conversions are required prior to performing a
double-precision division using the Newton-Raphson algorithm. Three iterations are
used in the double-precision example (n = 3). The following formula represents the
sequence of calculations to be performed:

$$A/B = A * X_0 * (2 - B * X_0) * [2 - B * X_0 * (2 - B * X_0)] * (2 - B * X_0 * (2 - B * X_0) * [2 - B * X_0 * (2 - B * X_0)])$$

Table 37 shows a double-precision division using a double-precision seed ROM. The
example divides 22 by 7.
Table 37. Double-Precision Newton-Raphson Binary Division

;Lines 1-4    Calculation: B * X0
              Operations: B → RA.4, X0 → RB.4, RA.4 * RB.4 → P.8

01 0 0 1C0 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 3FC24000 00000000 0 0
02 1 0 1C0 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 3FC24000 00000000 0 0
03 0 0 1C0 0 0 2 FF 0 0 1 1 1 1 0 0 0 0 3 1 1 3 401C0000 00000000 0 0
04 1 0 1C0 0 0 2 FF 0 0 1 1 1 1 0 0 0 0 3 1 1 3 401C0000 00000000 0 0

"

;Lines 5-8    Calculation: 2 - (B * X0)
              Operation: 2 - P.8 → S.12


~

=

and Xl

(")

-i

ex>
ex>

A

w

"""

5-114

B

*

*

*

0.5
[3 - B

0.5

*

*

Xl

*

[3 - B

*

XO

*

[3

*

B

*

*
*

(Xl 2)]
(XO 2)]

0.5
XO
[3 - B
(XO 2)]
(0.5
XO
[3 - B
(XO 2)]) 2]

*

*

*

*

Table 38. Single-Precision Binary Square Root

;Lines 1-2

01 0 0 026 1
02 1 0 026 1

;Lines 3-4

Calculation: B s.p .... d.p.
Operations: B'" RA.1, (s.p. to d.p.l(RA.11 ... S.2
3 FF 0 0 1 0 1 1 0 0 0 0 3 1
3 FF 0 0 1 0 1 1 0 0 0 0 3 1

Calculation: Load XO
Operation:
XO'" RA.4

03 0 0 126 1 0 2 FF 0 0 1 0 1 1 0 0 0 0 3 1
04 1 0 126 1 0 2 FF 0 0 1 0 1 1 0 0 0 0 3 1

;Lines 5-6

0 0 0 0 3 1
0 0 0 0 3 1

*

*

"

* C.7'" P.10

3 40000000 00000000 0 0
3 40000000 00000000 0 0

«~

"

*

0 0 0 0 3 1 1 3 40400000 00000000 0 0
0 0 0 0 3 1 1 3 40400000 00000000 0 0

*

Calculation: 3 - (B
XO 2)
S.12 - P. 12 ... S. 14
Operation:

11 0 0 003 0 0 2 FA 0 0 0 0 1 1 0 0 0 0 3 1
12 1 0 003 0 0 2 FA 0 0 0 0 1 1 0 0 0 0 3 1


Calculation: B
XO 2
C.7'" P.12, 3'" RA.10'" S.12
Operations: P.10

09 0 0 260 0 0 2 6F 0 0 1 0 1
10 1 0 260 0 0 2 6F 0 0 1 0 1

;Lines 11-12

3 3FE6AOOO 00000000 0 0
3 3FE6AOOO 00000000 0 0

Calculation: Load B, B
XO
Operations: S.6'" C.7, B'" RB.8, RB.8

07 0 1 040 1 0 2 7F 0 0 0 1 0 1 0 0 0 0 3 1
08 1 0 040 1 0 2 7F 0 0 0 1 0 1 0 0 0 0 3 1

;Lines 9-10

3 3FE6AOOO 00000000 0 0
3 3FE6AOOO 00000000 0 0

Calculation: XO d.p .... s.p.
Operations: (d.p. to s.p.l(RA.41 ... S.6

05 0 0 126 1 0 2 FF 0 0 1 0 1
06 1 0 126 1 0 2 FF 0 0 1 0 1

;Lines 7-8

3 40000000 00000000 0 0
3 40000000 00000000 0 0

3 00000000 00000000 0 0
3 00000000 00000000 0 0


Table 38. Single-Precision Binary Square Root (Continued)

;Lines 13-14

Calculation: XO
Operations: C.7

* (3 - (B * XO
* 8.14 ...... P.16,

2))
1/2 ...... RA.14 ...... 8.16

13 0 0 260 0 0 2 9F 0 0 1 0 1 1 0 0 0 0 3 1
14 1 0 260 0 0 2 9F 0 0 1 0 1 1 0 0 0 0 3 1

;Lines 15-16

*
*

*

3 3FOOOOOO 00000000 0 0
3 3FOOOOOO 00000000 0 0

*

Calculation: 1/2
XO
(3-(B
XO 2)) ...... X 1
P.16 ...... P.18, 0 ...... RA.16,
Operations: 8.16
RA.16 + RB.8 8.18

1 5 0 0 240 0 0 2 AF 0 0 1 0 1 1 0 0 0 0 3 1
16 1 0 240 0 0 2 AF 0 0 1 0 1 1 0 0 0 0 3 1

;Lines 17-18

*

3 00000000 00000000 0 0
3 00000000 00000000 0 0

Calculation: B
X1
Operations: 8.18
P. 18 ...... P.20

*

1 7 0 0 040 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
1 8 1 0 040 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0


;Lines 19-20


*

Calculation: B
X1 2
Operations: P.18 ...... C.19, P.20
C.19 ...... P.22,
3 ...... RA.20 ...... 8.22

*

19 0 1 260 0 0 2 6F 0 0 1 0 1 1 0 0 0 0 3 1
20 1 0 260 0 0 2 6F 0 0 1 0 1 1 0 0 0 0 3 1

;Lines 21-22

*

Calculation: 3 - (B
X1 2)
Operations: 8.22 - P.22 ..... 8.24

21 0 0 003 0 0 2 FA 0 0 0 0 1 1 0 0 0 0 3 1
22 1 0 003 0 0 2 FA 0 0 0 0 1 1 0 0 0 0 3 1

;Lines 23-24

*

3 00000000 00000000 0 0
3 00000000 00000000 0 0

*

Calculation: X1
(3 - (B
X1 2))
8.24 ...... P.26, 1/2 ..... RA.24 ..... 8.26
Operations: C.19

*

23 0 0 260 0 0 2 9F 0 0 1 0 1 1 0 0 0 0 3 1
24 1 0 260 0 0 2 9F 0 0 1 0 1 1 0 0 0 0 3 1

3 40400000 00000000 0 0
3 40400000 00000000 0 0

3 3FOOOOOO 00000000 0 0
3 3FOOOOOO 00000000 0 0

Table 38. Single-Precision Binary Square Root (Concluded)

;Lines 25-26

*
*

*

25 0 0 240 0 0 2 AF 0 0 1 0 1 1 0 0 0 0 3 1
26 1 0 240 0 0 2 AF 0 0 1 0 1 1 0 0 0 0 3 1

;Lines 27-28

*

3 00000000 00000000 0 0
3 00000000 00000000 0 0

Calculation: B
X2 ... A
Operations: 5.28
P.28 ... P.30

*

27 0 0 040 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1
28 1 0 040 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1

;Lines 29-30

*

Calculation: 1/2
Xl
(3 - (B
Xl 2))'" X2
Operations: 5.26
P.26 ... P.28, 0'" RA.26,
RA.26 + RB.8 5.28

3 00000000 00000000 0 0
3 00000000 00000000 0 0

Calculation: NOP
Operation:
Y ... Output

29 0 1 OOA 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1
30 1 0 OOA 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1

3 00000000 00000000 0 0
3 00000000 00000000 0 0

Double-Precision Square Root

The value of B is given as a double-precision number, so X0 can be looked up from
a double-precision seed ROM without conversion from one precision to the other. Three
iterations (n = 3) are required in the double-precision calculation, and the following
formula for sqrt(B) is to be evaluated:

A = B * 0.5 * 0.5 * 0.5 * X0 * [3 - B * (X0^2)]
      * [3 - B * (0.5 * X0 * [3 - B * (X0^2)])^2]
      * [3 - B * (0.5 * 0.5 * X0 * [3 - B * (X0^2)]
                  * [3 - B * (0.5 * X0 * [3 - B * (X0^2)])^2])^2]
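The square-root formula is the expansion of a reciprocal-square-root iteration followed by one multiplication by B. The sketch below shows that iteration in Python; the function name is hypothetical, and the example operands (B = 2 with a seed of 0.70703125) are assumed from the hex patterns 40000000 and 3FE6A000 that appear in the square-root tables.

```python
import math

def newton_raphson_sqrt(b, x0, iterations=3):
    """Approximate sqrt(b) from a seed x0 that is roughly 1/sqrt(b)."""
    x = x0
    for _ in range(iterations):
        # Each pass applies X(i+1) = 0.5 * X(i) * (3 - B * X(i)^2).
        x = 0.5 * x * (3.0 - b * x * x)
    # Multiplying the refined reciprocal root by B yields the root itself.
    return b * x

root = newton_raphson_sqrt(2.0, 0.70703125)
print(root, math.sqrt(2.0))
```

Two iterations correspond to the single-precision formula and three to the double-precision formula.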

Table 39. Double-Precision Binary Square Root

*

;Lines 1-4

01
02
03
04

0
1
0
1

0 3EO
03EO
0 3EO
0 3EO

XO
Calculations: Load B, Load XO, B
Operations: B -+ RB.4~ XO -+ RA.4, RA.4
RA.4 -+ 5.8 -+ C.1 0
0
0
0
0

0
0
0
0

0
1
0
1

0
0
0
0

3EO
3EO
3EO
3EO

FF
FF
FF
FF

0 0 0 0 1
00001
0 0 1 1 1
0 0 1 1 1

1
1
1
1

*

;Lines 5-8

05
06
07
08

2
2
2
2

0 0 0 0 3
00003
0 0 0 0 3
0 0 0 0 3

Calculations: B
XO 2
Operations: P.8
5.8
0
0
0
0

0
0
0
0

2
2
2
2

AF
AF
AF
AF

0
0
0
0

0
0
0
0

0
0
1
1

0
0
0
0

1
1
1
1

*

1
1
1
1

0
0
0
0

0
0
0
0

-+

0
0
0
0

1
1
1
1

P.12, 3
0
0
0
0

3
3
3
3

1
1
1
1

* RB.4

3
3
3
3

40000000
40000000
3FE6AOOO
3FE6AOOO

-+

RA.8

3
3
3
3

00000000
00000000
40080000
40080000

3
3
3
3

00000000
00000000
00000000
00000000

-+

-+

P.8

00000000
00000000
00000000
00000000

0
0
0
0

0
0
0
0

00000000
00000000
00000000
00000000

0
0
0
0

0
0
0
0

00000000
00000000
00000000
00000000

0
0
0
0

0
0
0
0

0
0
0
0

0
0
0
0

5.12

*

;Lines 9-12

XO 2)
Calculations: 3 - (B
Operations: 5.12 - p.12 -+ 5.16


09
10
11
12

0
1
0
1

0
1
0
0

183
183
183
183

0
0
0
0

0
0
0
0

2
2
2
2

FA
FA
FA
FA

0
0
0
0

0
0
0
0

0
0
0
0

0
0
0
0

0
0
1
1

1
1
1
1

0
0
0
0

0
0
0
0

0
0
0
0

0
0
0
0

3
3
3
3

1
1
1
1

1
1
1
1


;Lines 13-16

13
14
15
16

0
1
0
1


0
0
0
0

3EO
3EO
3EO
3EO

*

*

Calculations: XO
(3 - (B
XO 2))
Operations: C.10
5.16 -+ P.20, 1/2
0 0 2 9F 0
0 0 2 9F 0
0 0 2 9F 0
0 0 2 9F 0

0
0
0
0

0
0
1
1

0
0
0
0

1
1
1
1

1
1
1
1

*

0
0
0
0

0
0
0
0

0
0
0
0

0
0
0
0

3
3
3
3

1
1
1
1

1
1
1
1

3
3
3
3

-+

RA.16

00000000
00000000
3FEOOOOO
3FEOOOOO

-+

5.20

00000000
00000000
00000000
00000000

Table 39. Double-Precision Binary Square Root (Continued)

* *
*

;Lines 17-20

17
18
19
20

0
1
0
1

0
0
0
0

3CO
3CO
3CO
3CO

0
0
0
0

0
0
0
0

0
1
0
1

0
0
0
0

1 CO
1 CO
1CO
1CO

2
2
2
2

AF
AF
AF
AF

0
0
0
0

0
0
0
0

0
0
1
1

0
0
0
0

1
1
1
1

1
1
.1
1

*

;Lines 21-24

21
22
23
24

*

Calculations: 1/2
XO
(3-(B
XO 2)) -+ X 1
P.20 -+ P.24 -+ C.25, 0 -+ RA.20,
Operations: 5.20
RA.20 + RB.4 -+ 5.24
0
0
0
0

0
0
0
0

0
0
0
0

0
0
0
0

Calculations: B
Xl
Operations: 5.24
P.24
0
0
0
0

0
0
0
0

2
2
2
2

AF
AF
AF
AF

0
0
0
0

0
0
0
0

0
0
0
0

0
0
0
0

1
1
1
1

*

1
1
1
1

*

;Lines 25-28

0
0
0
0

0
0
0
0

0
0
0
0

-+

0
0
0
0

Calculations: B
Xl 2
Operations: P.28
C.25

*

3
3
3
3

3
3
3
3

00000000
00000000
00000000
00000000

00000000
00000000
00000000
00000000

0
0
0
0

0
0
0
0

3
3
3
3

00000000
00000000
00000000
00000000

00000000
00000000
00000000
00000000

0
0
0
0

0
0
0
0

0
0
0
0

0
0
0
0

P.28

3
3
3
3

-+

1
1
1
1

1
1
1
1

P.32, 3

-+

RA.28

-+

5.32

"

(\')

25
26
27
28

0
1
0
1

1
0
0
0

3EO
3EO
3EO
3EO

0
0
0
0

0
0
0
0

2
2
2
2

6F
6F
6F
6F

0
0
0
0

0
0
0
0

0
0
1
1

0
0
0
0

1
1
1
1

1 0 0 0 0 3 1
1 000 0 3 1
1 0 0 0 0 3 1
1000031

1
1
1
1

3
3
3
3

00000000
00000000
40080000
40080000

00000000
00000000
00000000
00000000

m" = 2/m - delta

Where delta = 2^(-8)
So m" = 2/m, and e" = (-e) - 1.

Since IEEE exponents are represented in excess-1023 notation, a formula for X" must
be determined, given that X is the IEEE exponent. As an IEEE exponent,
X = e + 1023, so e = X - 1023, and X" = e" + 1023. So, for X" in terms of X,

X" = e" + 1023
   = (-e) - 1 + 1023
   = (-(X - 1023)) + 1022
   = 1023 - X + 1022
   = 2045 - X

So given the 11 bits of X as the address of the seed exponent, the value stored at address
X is

X" = 2045 - X                                                    (2)

Given that the mantissa seed ROM uses 10 bits of the mantissa to determine the seed,
each seed Xm will be used for some range of mantissas, Bm to (Bm + 2 * delta).
The formula for Xm is from formula (1).

2/Bm                 -> Xm
2/(Bm + 2 * delta)   -> Xm

Where delta = 2^(-11)

This value is used since the actual Xm should be generated by the mantissa in the
center of the given range:

Xm = 2/(Bm + delta)

This would result in a more accurate seed on the average. Therefore, the formula used
to generate the mantissa part of the seed is

Xm = 2/(Bm + (2^(-11)))                                          (3)
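Formulas (2) and (3) can be checked numerically with a short Python model of the divide seed lookup. This is a sketch under stated assumptions: the function names are hypothetical, the mantissa address is assumed to be the 10 most significant fraction bits of B, and the stored values are kept at full floating-point precision rather than rounded to the ROM width.

```python
import math

def divide_seed_exponent(x):
    """Formula (2): biased seed exponent for the biased exponent x of B."""
    return 2045 - x

def divide_seed_mantissa(address):
    """Formula (3): seed mantissa for a 10-bit mantissa address (assumed layout)."""
    bm = 1.0 + address / 1024.0          # low end of the mantissa range
    return 2.0 / (bm + 2.0 ** -11)       # evaluated at the center of the range

def reciprocal_seed(b):
    """Combine both lookups into a seed for 1/b (b positive and normalized)."""
    m, e = math.frexp(b)                 # m in [0.5, 1)
    m, e = m * 2.0, e - 1                # rescale so that m is in [1, 2)
    x = e + 1023                         # excess-1023 exponent of B
    address = int((m - 1.0) * 1024)      # top 10 fraction bits
    return divide_seed_mantissa(address) * 2.0 ** (divide_seed_exponent(x) - 1023)

seed = reciprocal_seed(7.0)
print(seed, 1.0 / 7.0)
```

With this layout every mantissa in a range shares one seed, and the relative error of the seed stays below 2^(-10).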

Square Root PROMs
The seed for the square root, X0, is actually the reciprocal of the square root of the
data, B:

X0 = 1/(B^(1/2))

Given B = m * (2^e) and X0 = m' * (2^e'), the expression for X0 can be evaluated
by substitution and reduction:

X0 = 1/((m * (2^e))^(1/2))
   = 1/(m^(1/2) * (2^(e/2)))
   = m^(-1/2) * (2^(-e/2))

Then m' and e' may be written as m' = m^(-1/2) and e' = -e/2.

Next, it is necessary to verify that the above m' and e' form a valid normalized IEEE
number. When e is an odd number, e' is not an integer and, therefore, it is not a valid
IEEE exponent. If the above expression is separated into two cases, e' can be
represented in terms of a valid IEEE exponent, e":

e' = -e/2           for e even
e' = e" + 1/2       for e odd

Rewriting e" in terms of e produces this expression:

e" = e' - 1/2 = (-e/2) - 1/2       for e odd

Then a valid IEEE exponent, e", can be written for all e as

e" = -e/2                 for e even
e" = (-e/2) - 1/2         for e odd

This is equivalent to e" = int(-e/2) for all e. However, the 1/2 affects the mantissa:

X0 = m' * (2^e')
X0 = m' * (2^(e" + 1/2))           for odd e
X0 = m' * (2^(1/2)) * (2^e")       for odd e

Since X0 = m" * (2^e"), m" can be rewritten as

m" = m'                   for even e
m" = m' * (2^(1/2))       for odd e

In terms of m,

m" = m^(-1/2)                   for even e
m" = (m^(-1/2)) * (2^(1/2))     for odd e

Simplifying m" for odd e,

m" = (1/m^(1/2)) * (2^(1/2))    for odd e
m" = (2/m)^(1/2)                for odd e

Just as the divide exponent needed to be converted to excess-1023 notation, so the
same must be done for the square root:

X" = e" + 1023
X  = e + 1023
X" = int(-e/2) + 1023
X" = int((1023 - X)/2) + 1023

The IEEE bits for the exponent seed, X", can be expressed in terms of the IEEE bits
for the exponent of B, X:

X" = int((1023 - X)/2) + 1023

Because the formula for m" depends on the least significant bit of e, that bit must
be used as an address line to the mantissa.

Since X = e + 1023, an odd value of e will result in an even value of X, and an even
value of e will result in an odd value of X. Therefore,

m" = m^(-1/2)         for odd X
m" = (2/m)^(1/2)      for even X
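The exponent and mantissa rules for the square-root seed can be exercised with a small Python sketch. The function names are hypothetical, int() in the text is taken to round toward minus infinity (which is what the odd-e case requires), and the mantissa is computed exactly rather than quantized to a ROM address.

```python
import math

def sqrt_seed_exponent(x):
    """X" = int((1023 - X)/2) + 1023, with int() read as floor."""
    return (1023 - x) // 2 + 1023

def sqrt_seed_mantissa(m, x):
    """m" = m^(-1/2) for odd X, (2/m)^(1/2) for even X."""
    return m ** -0.5 if x % 2 == 1 else (2.0 / m) ** 0.5

def rsqrt_seed(b):
    """Ideal (unquantized) seed for 1/sqrt(b), b positive and normalized."""
    m, e = math.frexp(b)          # m in [0.5, 1)
    m, e = m * 2.0, e - 1         # rescale so that m is in [1, 2)
    x = e + 1023                  # excess-1023 exponent of B
    return sqrt_seed_mantissa(m, x) * 2.0 ** (sqrt_seed_exponent(x) - 1023)

for value in (0.5, 2.0, 4.0, 10.0):
    print(value, rsqrt_seed(value), 1.0 / math.sqrt(value))
```

For B = 2 the biased exponent is 1024 (even), so the seed exponent is 1022 and the mantissa is the square root of 2, giving the expected seed of about 0.7071.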

SN74ACT8841

Digital Crossbar Switch

The SN74ACT8841 is a single-chip digital crossbar switch that cost-effectively
eliminates bottlenecks to speed data through complex bus architectures.
The 'ACT8841 has 16 four-bit bidirectional ports which can be connected in
any conceivable combination. Total flowthrough time for a data transfer is 14 ns.
The 'ACT8841 is ideal for multiprocessor applications, where memory bottlenecks
tend to occur. For example, four 32-bit buses can be easily connected by two
'ACT8841 devices. System architectures based on the 16-port 'ACT8841 can
include up to 16 switching nodes (i.e., processors, memories, or bus interfaces).
Larger processor arrays can be built with multistage interconnect schemes.


SN74ACT8841
DIGITAL CROSSBAR SWITCH
JUNE 1988

•  High-Speed Programmable Switch for Parallel Processing Applications
•  Dynamically Reconfigurable for Fault-Tolerant Routing
•  64 Bidirectional Data I/Os in 16 Nibble (Four-Bit) Groups
•  Data I/O Selection Programmable by Nibble
•  Eight Banks of Control Flip-Flops for Storing Configuration Programs
•  Two Selectable Hard-Wired Switching Configurations
•  Selectable Stored-Data or Real-Time Inputs
•  156-Pin Grid-Array Package
•  CMOS 1-µm EPIC™ Process
•  Single 5-V Power Supply

GB PACKAGE (TOP VIEW)
(pin-grid diagram: rows A through R, columns 1 through 15)

description
The SN74ACT8841 is a flexible, high-speed digital crossbar switch. It is easily microprogrammable to
support user-definable interconnection patterns. This crossbar switch is especially suited to multiprocessor
interconnects that are dynamically reconfigurable or even reprogrammable after each system clock. The
'ACT8841 is built in Texas Instruments advanced 1-µm EPIC™ CMOS process to enhance performance
and reduce power consumption. The switch requires only a 5-V power supply.
Because the 'ACT8841 is a 16-port device, system architectures based on the 'ACT8841 can include
up to 16 switching nodes, which may be processors, data memories, or bus interfaces. Larger processor
arrays can be built with multistage interconnection schemes. Most applications will use the crossbar switch
as a broadband bus interface controller, for example, between closely coupled processors which must
exchange data with very low propagation delays.
The 'ACT8841 has ten selectable control sources, including eight banks of programmable control flip-flops
and two hard-wired control circuits. The device can switch from 1 to 16 nibbles (4 to 64 bits) of data
in a single cycle.
The 64 I/O pins of the 'ACT8841 are arranged in 16 switchable nibbles (see Figure 1). A single input nibble
can be broadcast to any combination of 15 output nibbles, or even to 16 nibbles (including itself) if operating
off registered data. Multiple input nibbles can be switched to multiple outputs, depending on the programmed
configurations available in the control flip-flops.
The digital crossbar switch is intended primarily for multiprocessor interconnection and parallel processing
applications. The device can be used to select and transfer data from multiple sources to multiple
destinations. Since it can be dynamically reprogrammed, it is suitable for use in reconfigurable networks
for fault-tolerant routing.

EPIC is a trademark of Texas Instruments Incorporated

PRODUCT PREVIEW documents contain information on products in the formative or design phase of
development. Characteristic data and other specifications are design goals. Texas Instruments
reserves the right to change or discontinue these products without notice.

Copyright © 1988, Texas Instruments Incorporated

TEXAS " ,
INSTRUMENTS
POST OFFICE BOX 655012 • DALLAS, TEXAS 75265

description (continued)
The 'ACT8841 and the bipolar SN74AS8840 share the same architecture. Microcode for the 'AS8840
can be run on the 'ACT8841 if the additional control inputs to the 'ACT8841 are properly terminated.
However, because the 'ACT8841 is a CMOS device with six additional control inputs, the 'AS8840 and
the 'ACT8841 are not socket-compatible and cannot be used interchangeably. A summary of the differences
between the SN74AS8840 and the SN74ACT8841 is provided in the 'AS8840 and 'ACT8841
FUNCTIONAL COMPARISON at the end of the data sheet.

The SN74ACT8841 is characterized for operation from 0°C to 70°C.

Table 1. 'ACT8841 Pin Grid Allocation

PIN   PIN         PIN   PIN         PIN   PIN         PIN   PIN
NO.   NAME        NO.   NAME        NO.   NAME        NO.   NAME
A1    GND         C10   D31         H12   VCC         N7    CNTR13
A2    GND         C11   OED6        H13   LSCLK       N8    CREAD0
A3    D37         C12   VCC         H14   SELDLS      N9    VCC
A4    D35         C13   GND         H15   CNTR3       N10   D0
A5    D33         C14   D23         J1    OEC         N11   D3
A6    WE          C15   D21         J2    CRWRITE0    N12   D6
A7    CRADR1      D1    D43         J3    CRWRITE1    N13   GND
A8    CNTR7       D2    D42         J4    GND         N14   D8
A9    CNTR4       D3    VCC         J12   GND         N15   D9
A10   OED7        D7    GND         J13   CNTR2       P1    GND
A11   D29         D8    VCC         J14   CNTR1       P2    GND
A12   D27         D9    GND         J15   CNTR0       P3    D56
A13   D25         D13   D22         K1    CRWRITE2    P4    D58
A14   GND         D14   D20         K2    OED12       P5    D60
A15   GND         D15   D19         K3    D48         P6    D62
B1    GND         E1    D45         K13   D15         P7    CNTR12
B2    GND         E2    D44         K14   D14         P8    CNTR15
B3    D39         E3    OED10       K15   OED3        P9    TP0
B4    D36         E13   OED5        L1    D49         P10   OED0
B5    D34         E14   D18         L2    D50         P11   D2
B6    OED8        E15   D17         L3    OED13       P12   D4
B7    CRADR0      F1    OED11       L13   OED2        P13   D7
B8    CRSRCE      F2    D46         L14   D12         P14   GND
B9    CNTR5       F3    D47         L15   D13         P15   GND
B10   D30         F13   D16         M1    D51         R1    GND
B11   D28         F14   OED4        M2    D52         R2    GND
B12   D26         F15   CRSEL3      M3    D54         R3    D57
B13   D24         G1    CNTR8       M7    GND         R4    D59
B14   GND         G2    CNTR9       M8    VCC         R5    D61
B15   GND         G3    CNTR10      M10   GND         R6    OED15
C1    D41         G4    GND         M13   VCC         R7    CNTR14
C2    D40         G12   GND         M14   D10         R8    CREAD1
C3    GND         G13   CRSEL2      M15   D11         R9    CREAD2
C4    D38         G14   CRSEL1      N1    D53         R10   TP1
C5    OED9        G15   CRSEL0      N2    D55         R11   D1
C6    D32         H1    CNTR11      N3    GND         R12   OED1
C7    VCC         H2    SELDMS      N4    VCC         R13   D5
C8    CRCLK       H3    MSCLK       N5    OED14       R14   GND
C9    CNTR6       H4    VCC         N6    D63         R15   GND

Table 2. 'ACT8841 Pin Functional Description

PIN NAME   NO.   I/O   DESCRIPTION
CNTR0      J15   I/O   Control I/O. Inputs four control words to the control flip-flops on
CNTR1      J14         each CRCLK cycle. As outputs, the same addresses can be used to read
CNTR2      J13         the flip-flop settings.
CNTR3      H15
CNTR4      A9
CNTR5      B9
CNTR6      C9
CNTR7      A8
CNTR8      G1
CNTR9      G2
CNTR10     G3
CNTR11     H1
CNTR12     P7
CNTR13     N7
CNTR14     R7
CNTR15     P8
CRADR0     B7    I     Control register address. Selects 16 bits of control flip-flops as a
CRADR1     A7          source/destination for outputs/inputs on CNTR0-CNTR15 (see Table 7).
CRCLK      C8    I     Control register clock. Clocks CNTR0-CNTR15 into the control flip-flops
                       on a low-to-high transition.
CREAD0     N8    I     Selects one of eight banks of control flip-flops to read out on
CREAD1     R8          CNTR0-CNTR15 in 16-bit words addressed by CRADR1-CRADR0.
CREAD2     R9
CRSEL0     G15   I     Selects one of ten control configurations.
CRSEL1     G14
CRSEL2     G13
CRSEL3     F15
CRSRCE     B8    I     Load source select. When low, selects CNTR inputs; when high, selects
                       DATA inputs.
CRWRITE0   J2    I     Destination select. Selects one of eight control banks (see Table 6).
CRWRITE1   J3
CRWRITE2   K1
D0         N10   I/O   I/O data bits 0 through 31 (data bits 0 through 31 are the least
D1         R11         significant half).
D2         P11
D3         N11
D4         P12
D5         R13
D6         N12
D7         P13
D8         N14
D9         N15
D10        M14
D11        M15
D12        L14
D13        L15
D14        K14
D15        K13
D16        F13
D17        E15
D18        E14
D19        D15
D20        D14
D21        C15
D22        D13
D23        C14
D24        B13
D25        A13
D26        B12
D27        A12
D28        B11
D29        A11
D30        B10
D31        C10
D32        C6    I/O   I/O data bits 32 through 63 (data bits 32 through 63 are the most
D33        A5          significant half).
D34        B5
D35        A4
D36        B4
D37        A3
D38        C4
D39        B3
D40        C2
D41        C1
D42        D2
D43        D1
D44        E2
D45        E1
D46        F2
D47        F3
D48        K3
D49        L1
D50        L2
D51        M1
D52        M2
D53        N1
D54        M3
D55        N2
D56        P3
D57        R3
D58        P4
D59        R4
D60        P5
D61        R5
D62        P6
D63        N6
GND        A1          Ground (all pins must be used).
GND        A2
GND        A14
GND        A15
GND        B1
GND        B2
GND        B14
GND        B15
GND        C3
GND        C13
GND        D7
GND        D9
GND        G4
GND        G12
GND        J4
GND        J12
GND        M7
GND        M10
GND        N3
GND        N13
GND        P1
GND        P2
GND        P14
GND        P15
GND        R1
GND        R2
GND        R14
GND        R15
LSCLK      H13   I     Clocks the least significant half of data inputs into the input
                       registers on a low-to-high transition.
MSCLK      H3    I     Clocks the most significant half of data inputs into the input
                       registers on a low-to-high transition.
OEC        J1    I     Output enable for control flip-flops, active low.
OED0       P10   I     Output enables for data nibbles, active low.
OED1       R12
OED2       L13
OED3       K15
OED4       F14
OED5       E13
OED6       C11
OED7       A10
OED8       B6
OED9       C5
OED10      E3
OED11      F1
OED12      K2
OED13      L3
OED14      N5
OED15      R6
SELDLS     H14   I     When low, selects the stored, least significant data input to the main
                       internal bus. When high, real-time data is selected.
SELDMS     H2    I     When low, selects the stored, most significant data input to the main
                       internal bus. When high, real-time data is selected.
TP0        P9    I     Test pins. High during normal operation (see Table 8).
TP1        R10
VCC        C7          5-V supply
VCC        C12
VCC        D3
VCC        D8
VCC        H4
VCC        H12
VCC        M8
VCC        M13
VCC        N4
VCC        N9
WE         A6    I     Write enable for control flip-flops, active low.


overview
The 64 I/O pins of the 'ACT8841 are arranged in 16 nibble (four-bit) groups where each set of four pins
serves as bidirectional inputs to and outputs from a nibble multiplexer. During a switching operation, each
nibble passes four bits of either stored or real-time data to the main internal 64-bit data bus. Each output
multiplexer will independently select one of the 16 nibbles from this 64-bit data bus.
Data nibbles are organized into two groups: the least significant half (D31-D0) and the most significant
half (D63-D32). Stored versus real-time data inputs can be selected separately for the LSH and the MSH.
Two clock inputs, LSCLK and MSCLK, are available to latch LSH and MSH data inputs, respectively, into
the data register.
The pattern of output nibbles resulting from the switching operation is determined by a selectable control
source, either one of eight banks of programmable control flip-flops or one of two hard-wired switching
configurations. Inputs to the control flip-flops can be loaded either from the data bus or from control I/Os.
A separate clock (CRCLK) is provided for loading the banks of control flip-flops.

logic symbol

(Logic symbol of the 'ACT8841 digital crossbar switch: control register inputs WE, CRCLK, CRSRCE,
CREAD2-CREAD0, CRWRITE2-CRWRITE0, CRSEL3-CRSEL0, CRADR1-CRADR0, OEC, and TP1-TP0; control I/Os
CNTR15-CNTR0; LSH and MSH inputs LSCLK, SELDLS, MSCLK, and SELDMS; data nibbles D3-D0 through
D63-D60 with output enables OED0-OED15.)

FIGURE 1

architecture
The 'ACT8841 digital crossbar switch has its 64 data I/Os arranged in 16 multiplexer logic blocks, as shown
in Figure 2. Each nibble multiplexer logic block handles four bits of real-time input and four bits of
stored-data input, and either input can be passed to the common data bus.
Two input multiplexer controls are provided to select between stored and real-time inputs. SELDLS controls
input data selection for the LSH (D31-D0) of the 64-bit data input, and SELDMS for the MSH (D63-D32).
The input register clocks, LSCLK and MSCLK, are grouped in the same way and are used to clock data
into the registers in the multiplexer logic blocks. The 16 data input nibbles make up the 64 data bits on
the internal main bus.
This common bus supplies 16 data nibbles to a 16-to-1 output multiplexer in each multiplexer logic block
(see Figure 3). As determined by one of ten selectable control sources, the 16-to-1 output multiplexer
selects a data nibble to send to the outputs via the three-state output driver.
Control of the input and output multiplexers determines the input-to-output pattern for the entire crossbar
switch. Many different switching combinations can be set up by programming the control flip-flop
configurations to determine the outputs from the 16-to-1 multiplexers.
For example, the switch can be programmed to broadcast one data input nibble through the other 15 nibbles
(60 outputs). Conversely, a 15-to-1 nibble multiplexer can be configured by programming the switch to
select and output a single data nibble from the 64-bit bus. Several examples are described in more detail
in a later section.
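The switching behavior described above can be modeled in a few lines of Python. This is a behavioral sketch only, with hypothetical function names; it ignores clocks, input registers, and output enables, and simply treats the device as 16 independent 16-to-1 nibble multiplexers fed from one 64-bit bus.

```python
def split_nibbles(word64):
    """Split a 64-bit value into 16 nibbles; nibble n holds bits 4n+3 to 4n."""
    return [(word64 >> (4 * n)) & 0xF for n in range(16)]

def join_nibbles(nibbles):
    """Pack 16 nibbles back into a 64-bit value."""
    word = 0
    for n, value in enumerate(nibbles):
        word |= (value & 0xF) << (4 * n)
    return word

def crossbar(word64, selectors):
    """selectors[n] names the bus nibble (0-15) driven onto output nibble n."""
    bus = split_nibbles(word64)
    return join_nibbles([bus[s] for s in selectors])

data = 0xFEDCBA9876543210                     # nibble n carries the value n
broadcast = crossbar(data, [15] * 16)         # every output takes nibble 15
identity = crossbar(data, list(range(16)))    # every output echoes its own input
print(hex(broadcast), hex(identity))
```

A broadcast is the case where every selector names the same source nibble; a many-to-one multiplexer is the case where only one output nibble is enabled and its selector is changed from cycle to cycle.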

functional block diagram

(Functional block diagram of the 'ACT8841; graphic not reproduced.)

FIGURE 2

(Data nibble multiplexer logic for one nibble DXX-DXX: input register clocked by LSCLK or MSCLK, input
select by SELDLS or SELDMS, data bus, 16-to-1 output multiplexer with output enable OEDX, and control
flip-flop groups with CRCLK, CRSRCE, CRSEL3-CRSEL0, and CREAD2-CREAD0; graphic not reproduced.)

FIGURE 3. DATA NIBBLE MULTIPLEXER LOGIC


multiplexer logic group
There are 16 multiplexer logic blocks, one for each nibble. External data flows from four data I/O pins
into a logic block. A block diagram of the multiplexer logic is shown in Figure 3. The data inputs are either
clocked into the data register or passed directly to the main internal bus. The 64 bits of data from the
main bus are presented to a 16-to-1 multiplexer, which selects the data nibble output.
Each of the 16 nibble multiplexer logic blocks contains eight control flip-flop (CF) groups, one for each
of the control banks. A control bank stores one complete switching configuration. Each CF group consists
of four D-type edge-triggered flip-flops. In Figure 3, the CF groups are shown as CFXX0 to CFXX7, where
XX indicates the number of the nibble multiplexer logic group (0 <= XX <= 15). CFXX0 represents the
16 CF groups (one from each logic block) which make up flip-flop control bank 0, CFXX1 the 16 CF groups
in bank 1, etc.
In addition to the eight banks of programmable flip-flops, two hard-wired switching configurations can
be selected. The MSH/LSH exchange directs the input nibbles from each half of the switch to the data
outputs directly opposite. This switching pattern is shown in Table 3 below. For example, data input on
D11-D8 is output on D43-D40, and data input on D43-D40 is output on D11-D8.
Table 3. MSH/LSH Exchange

LSH          MSH
D3-D0        D35-D32
D7-D4        D39-D36
D11-D8       D43-D40
D15-D12      D47-D44
D19-D16      D51-D48
D23-D20      D55-D52
D27-D24      D59-D56
D31-D28      D63-D60


The second hard-wired configuration, a read-back function, causes all 64 bits to be output on the same
I/Os on which they were input. Neither of the hard-wired control configurations affects the contents of
the control banks.
The control source select, CRSEL3-CRSEL0, determines which switching pattern is selected, as shown
in Table 4.
Table 4. 16-to-1 Output Multiplexer Control Source Selects

CRSEL3  CRSEL2  CRSEL1  CRSEL0   CONTROL SOURCE SELECTED
  L       L       L       L      Control bank 0 (programmable)
  L       L       L       H      Control bank 1 (programmable)
  L       L       H       L      Control bank 2 (programmable)
  L       L       H       H      Control bank 3 (programmable)
  L       H       L       L      Control bank 4 (programmable)
  L       H       L       H      Control bank 5 (programmable)
  L       H       H       L      Control bank 6 (programmable)
  L       H       H       H      Control bank 7 (programmable)
  H       X       X       L      MSH/LSH exchange *
  H       X       X       H      Read-back (output echoes input) *

* Hard-wired switching configuration
X = don't care
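The control source selection and the two hard-wired patterns can be expressed as a short Python sketch. The function names are hypothetical; nibble n is taken to mean data pins D(4n+3) to D(4n), so the MSH/LSH exchange pairs nibble n with nibble n + 8 (modulo 16).

```python
def control_source(crsel):
    """Decode a 4-bit CRSEL3-CRSEL0 value into (kind, bank number or None)."""
    if (crsel & 0b1000) == 0:
        return ("bank", crsel & 0b0111)       # programmable control bank 0-7
    if crsel & 0b0001:
        return ("read-back", None)            # output echoes input
    return ("exchange", None)                 # MSH/LSH exchange

def hardwired_selectors(kind):
    """Source nibble for each of the 16 output nibbles in a hard-wired mode."""
    if kind == "exchange":
        return [n ^ 8 for n in range(16)]     # nibble n swaps with nibble n + 8
    if kind == "read-back":
        return list(range(16))
    raise ValueError("not a hard-wired configuration: " + kind)

print(control_source(0b0101), control_source(0b1010), control_source(0b1111))
print(hardwired_selectors("exchange"))
```

In the exchange pattern, output nibble 10 (D43-D40) takes its data from nibble 2 (D11-D8) and vice versa, matching the example given with Table 3.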

control words
A CF group can store a four-bit control word (CFN3-CFN0) to select the output of the 16-to-1 multiplexer
for that nibble port. One control word is loaded in each CF group. A total of 16 words, one per multiplexer
logic block, are loaded in a bank to configure one complete switching pattern. Table 5 lists the control
words and the input data each selects.
Each control word can be stored in a CF group and sent as an internal control signal to select the output
of a 16-to-1 multiplexer in a nibble logic block. For example, any CF group loaded with the word "LHHH"
will select the data input on D31-D28 as the outputs of the associated nibble. If all 16 CF groups in a
bank were loaded with "LHHH," the same output (D31-D28) would be selected by the entire switch.
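The control-word encoding is a plain binary nibble index: a word with value v selects data inputs D(4v+3) to D(4v). A minimal Python sketch, with hypothetical helper names, makes the mapping explicit.

```python
def control_word_value(levels):
    """Convert a CFN3-CFN0 string such as "LHHH" into its numeric value."""
    value = 0
    for level in levels:
        value = (value << 1) | (1 if level == "H" else 0)
    return value

def selected_input(value):
    """Name the data inputs that a control word of this value selects."""
    return "D{}-D{}".format(4 * value + 3, 4 * value)

for word in ("LLLL", "LHHH", "HHHH"):
    print(word, selected_input(control_word_value(word)))
```

The word "LHHH" has the value 7 and therefore selects D31-D28, as stated in the text.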

Table 5. 16-to-1 Output Multiplexer Control Words

     INTERNAL SIGNALS          INPUT DATA SELECTED AS
CFN3   CFN2   CFN1   CFN0      MULTIPLEXER OUTPUT
 L      L      L      L        D3-D0
 L      L      L      H        D7-D4
 L      L      H      L        D11-D8
 L      L      H      H        D15-D12
 L      H      L      L        D19-D16
 L      H      L      H        D23-D20
 L      H      H      L        D27-D24
 L      H      H      H        D31-D28
 H      L      L      L        D35-D32
 H      L      L      H        D39-D36
 H      L      H      L        D43-D40
 H      L      H      H        D47-D44
 H      H      L      L        D51-D48
 H      H      L      H        D55-D52
 H      H      H      L        D59-D56
 H      H      H      H        D63-D60

loading control configurations
CRWRITE2-CRWRITE0 select which control bank is being loaded, as shown in Table 6.

Table 6. Control Flip-Flops Load Destination Select

CRWRITE2   CRWRITE1   CRWRITE0   DESTINATION
   L          L          L       Control bank 0
   L          L          H       Control bank 1
   L          H          L       Control bank 2
   L          H          H       Control bank 3
   H          L          L       Control bank 4
   H          L          H       Control bank 5
   H          H          L       Control bank 6
   H          H          H       Control bank 7



The control words for a bank can be loaded either 16 bits at a time on the control I/O pins (CNTR15-CNTR0)
or all 64 bits at once on the data inputs (D63-D0). If the control load source select, CRSRCE, is high, the
words are loaded from the data inputs. When CRSRCE = L, the CNTR inputs are used.

When a control bank is loaded from the data inputs, WE, CRSRCE, CRWRITE2-CRWRITE0, and the control
register clock CRCLK are used in combination to load all 16 control words (64 bits) in a single cycle. An
MSH/LSH exchange like that shown in Table 3 is used to load the flip-flops on a rising CRCLK clock edge.
For example, data inputs D3-D0 go to the data bus and then to the CF group that selects the data outputs
for D35-D32. CRWRITE2-CRWRITE0 select the control bank that is loaded (see Table 6).

The CNTR15-CNTR0 inputs can also be used to load the control banks. The bank is selected by
CRWRITE2-CRWRITE0 (see Table 6). Four control words per CRCLK cycle can be input to the CF groups
(CFXX) that make up the bank. The CF groups loaded are selected by CRADR1-CRADR0, as shown in
Table 7. Four CRCLK cycles are needed to load an entire control bank.

Table 7. Loading Control Flip-Flops from CNTR I/Os

                                  CF GROUPS LOADED BY
                                  CONTROL (CNTR) I/O NUMBERS
CRADR1   CRADR0   WE   CRCLK      15-12    11-8     7-4      3-0
  L        L      L      ↑        CF12     CF8      CF4      CF0
  L        H      L      ↑        CF13     CF9      CF5      CF1
  H        L      L      ↑        CF14     CF10     CF6      CF2
  H        H      L      ↑        CF15     CF11     CF7      CF3
  X        X      H      X        Inhibit write to flip-flops
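The two load paths can be sketched in Python as follows. The function names are hypothetical and the model covers only the bit routing: each CNTR word carries four control words, with field k (bits 4k+3 to 4k) going to CF group 4k + address, and a load from the data inputs routes data nibble n to the CF group of the opposite half (n + 8, modulo 16).

```python
def cntr_words_for_bank(control_words):
    """Pack 16 control words into the four CNTR words for CRADR = 0..3."""
    words = []
    for address in range(4):
        word = 0
        for field in range(4):                       # field 0 is CNTR3-CNTR0
            word |= (control_words[4 * field + address] & 0xF) << (4 * field)
        words.append(word)
    return words

def load_from_cntr(bank, address, cntr):
    """Model one CRCLK rising edge with WE low and CRSRCE low."""
    for field in range(4):
        bank[4 * field + address] = (cntr >> (4 * field)) & 0xF

def load_from_data(word64):
    """Model a single-cycle load from D63-D0 (CRSRCE high)."""
    bank = [0] * 16
    for n in range(16):
        bank[n ^ 8] = (word64 >> (4 * n)) & 0xF      # MSH/LSH exchange on load
    return bank

target = list(range(16))                  # CF group n should hold the value n
words = cntr_words_for_bank(target)
bank = [0] * 16
for address, word in enumerate(words):
    load_from_cntr(bank, address, word)
print([hex(w) for w in words], bank == target)
```

Loading through the CNTR I/Os takes four clock cycles per bank; loading through the data inputs takes one cycle but occupies the data bus.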

To read out the control settings, the same address signals can be used, except that no CRCLK signal is
needed and OEC is pulled low. CREAD2-CREAD0 select the bank to be read; the format is the same as
for CRWRITE2-CRWRITE0, shown in Table 6.
Using the control I/Os to read the control bank settings can be valuable during debugging or diagnostics.
Control settings are volatile and will be lost if the 'ACT8841 is powered off. An external program controlling
switch operation may need to read the control bank settings so that it can save and restore the current
switching configurations.

test pins
TP1-TP0 test pins are provided for system testing. As Table 8 shows, these pins should be maintained
high during normal operation. To force all outputs and I/Os low, low signals are placed on TP1-TP0 and
all output enables (OED15-OED0 and OEC). To force all outputs and I/Os high, TP1 and all output enables
are pulled low, and TP0 is driven high. When TP0 is left low and a high signal is placed on TP1, all outputs
on the 'ACT8841 are placed in a high-impedance state, isolating the chip from the rest of the system.

Table 8. Test Pin Inputs

TP1   TP0   OED15-OED0   OEC   RESULT
 L     L        L         L    All outputs and I/Os forced low
 L     H        L         L    All outputs and I/Os forced high
 H     L        X         X    All outputs placed in a high-impedance state
 H     H        X         X    Normal operation (default state)

TEXAS INSTRUMENTS
POST OFFICE BOX 655012 • DALLAS, TEXAS 75265


SN74ACT8841
DIGITAL CROSSBAR SWITCH
Examples
Most 'ACT8841 switch configurations are straightforward to program, involving few control signals and
procedures to set up the control words in the banks of flip-flops. Control signals and procedures for loading
and using control words are shown in the following examples.

broadcasting a nibble


Any of the 16 data input nibbles can be broadcast to the other 15 data nibbles for output. For ease of
presentation, input nibble D63-D60 is used in this example. Example 1 presents the microcode sequence
for loading flip-flop bank 0 and executing the nibble broadcast.
The low signal on CRSRCE selects CNTR15-CNTR0 as the input source, and the low signals on
CRWRITE2-CRWRITE0 select flip-flop bank 0 as the destination. Table 5 shows that to select data on
D63-D60 as the output nibble, the four bits in the control word CFN3-CFN0 must be high; therefore the
CNTR15-CNTR0 inputs are coded high. The four microcode instructions shown in Example 1 load the same
control word from CNTR15-CNTR0 into all 16 CF groups of bank 0.
Once the control flip-flops have been loaded, the switch can be used to broadcast nibble D63-D60 as
programmed. The microcode instruction to execute the broadcast is shown as the last instruction in
Example 1. WE is held high and the data to be broadcast is input on D63-D60. The high signal on SELDMS
selects a real-time data input for the broadcast. MSCLK and LSCLK (not shown) can be used to load the
input registers if the input nibble is to be retained. No register clock signals are needed if the input data
is not being stored.
The banks of control flip-flops not selected as a control source can be loaded with new control words
or read out on CNTR15-CNTR0 while the switch is operating. For example, the MSH data inputs can be
used to load flip-flop bank 1 of the LSH while bank 0 of the LSH is controlling data I/O.


Example 1. Programming a Nibble Broadcast

INST.          CRWRITE  CRADR  CRSEL          CNTR I/O NUMBERS
NO.    CRSRCE  2 1 0    1 0    3 2 1 0   WE   15-12 11-8  7-4   3-0   SELDMS  SELDLS  OED15-OED0           OEC  CRCLK
1      0       0 0 0    0 0    X X X X   0    1111  1111  1111  1111  X       X       XXXX XXXX XXXX XXXX  1    ↑
2      0       0 0 0    0 1    X X X X   0    1111  1111  1111  1111  X       X       XXXX XXXX XXXX XXXX  1    ↑
3      0       0 0 0    1 0    X X X X   0    1111  1111  1111  1111  X       X       XXXX XXXX XXXX XXXX  1    ↑
4      0       0 0 0    1 1    X X X X   0    1111  1111  1111  1111  X       X       XXXX XXXX XXXX XXXX  1    ↑
5      X       X X X    X X    0 0 0 0   1    XXXX  XXXX  XXXX  XXXX  1       1       0000 0000 0000 0000  X    None

INST.
NO.    COMMENT
1      Loads CF12, CF8, CF4, CF0 of bank 0
2      Loads CF13, CF9, CF5, CF1 of bank 0
3      Loads CF14, CF10, CF6, CF2 of bank 0
4      Loads CF15, CF11, CF7, CF3 of bank 0
5      Selects bank 0 for switching control; selects real-time data inputs

Example 2. Programming an MSH/LSH Exchange on CNTR Inputs

INST.          CRWRITE  CRADR  CRSEL          CNTR I/O NUMBERS
NO.    CRSRCE  2 1 0    1 0    3 2 1 0   WE   15-12 11-8  7-4   3-0   SELDMS  SELDLS  OED15-OED0           OEC  CRCLK
1      0       1 1 1    0 0    X X X X   0    0100  0000  1100  1000  X       X       XXXX XXXX XXXX XXXX  1    ↑
2      0       1 1 1    0 1    X X X X   0    0101  0001  1101  1001  X       X       XXXX XXXX XXXX XXXX  1    ↑
3      0       1 1 1    1 0    X X X X   0    0110  0010  1110  1010  X       X       XXXX XXXX XXXX XXXX  1    ↑
4      0       1 1 1    1 1    X X X X   0    0111  0011  1111  1011  X       X       XXXX XXXX XXXX XXXX  1    ↑
5      X       X X X    X X    0 1 1 1   1    XXXX  XXXX  XXXX  XXXX  0       0       0000 0000 0000 0000  X    None

INST.
NO.    COMMENT
1      Loads CF12, CF8, CF4, CF0 of bank 7
2      Loads CF13, CF9, CF5, CF1 of bank 7
3      Loads CF14, CF10, CF6, CF2 of bank 7
4      Loads CF15, CF11, CF7, CF3 of bank 7
5      Selects bank 7 for switching control; selects registered data inputs

programming an MSH/LSH exchange
A second, more complicated example involves programming the switch to swap corresponding nibbles
between the MSH and the LSH (first nibble in the LSH for first nibble in the MSH, and so on). This swap
can be implemented using the hard-wired logic circuit selected when CRSEL3 is high and CRSEL0 is low.
Programming this swap without using the MSH/LSH exchange logic requires loading a different control
word into each mux logic block. This is described below for purposes of illustration.
Each nibble in one half, either LSH or MSH, selects as output the registered data from the corresponding
nibble in the other half. The registered data from D35-D32 is to be output on D3-D0, the registered data
from D3-D0 is output on D35-D32, and so on for the remaining nibbles. As shown in Table 4, the flip-flops
for D3-D0 have to be set to 1000 and the D35-D32 inputs must be low. The CF groups and control words
involved in this switching pattern are listed in Table 9.

Table 9. Control Words for an MSH/LSH Exchange

CF       CONTROL WORD   CNTR INPUTS TO      RESULTS
GROUP    LOADED         LOAD FLIP-FLOPS     (INPUT → OUTPUT)
CF15     0111           CNTR15-CNTR12       D31-D28 → D63-D60
CF14     0110                               D27-D24 → D59-D56
CF13     0101                               D23-D20 → D55-D52
CF12     0100                               D19-D16 → D51-D48
CF11     0011           CNTR11-CNTR8        D15-D12 → D47-D44
CF10     0010                               D11-D8  → D43-D40
CF9      0001                               D7-D4   → D39-D36
CF8      0000                               D3-D0   → D35-D32
CF7      1111           CNTR7-CNTR4         D63-D60 → D31-D28
CF6      1110                               D59-D56 → D27-D24
CF5      1101                               D55-D52 → D23-D20
CF4      1100                               D51-D48 → D19-D16
CF3      1011           CNTR3-CNTR0         D47-D44 → D15-D12
CF2      1010                               D43-D40 → D11-D8
CF1      1001                               D39-D36 → D7-D4
CF0      1000                               D35-D32 → D3-D0

With this list of control words and the signals in Table 7, the 16-bit control inputs on CNTR15-CNTR0
can be arranged to load the control flip-flops in four cycles. Example 2 shows the microcode instructions
for loading the control words and executing the exchange.
In Example 2, bank 7 of flip-flops is being programmed. Bank 7 is selected by taking CRWRITE2-CRWRITE0
high and leaving CRSRCE low (see Table 4) when the control words are loaded on CNTR15-CNTR0. With
WE held low, the CRCLK is used to load the four sets of control words. Once the flip-flops are loaded,
data can be input on D63-D0 and the programmed pattern of output selection can be executed. A
microinstruction to select registered data inputs and bank 7 as the control source is shown as the last
instruction in Example 2. The data must be clocked into the input registers, using LSCLK and MSCLK,
before the last instruction is executed.
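
As a cross-check on the exchange pattern, the control words can be generated and applied to a sample 64-bit word in software. The sketch below is a behavioral illustration only. It assumes that a control word's value is the index of the selected input nibble (nibble 0 = D3-D0, nibble 15 = D63-D60), which agrees with the 1111 broadcast word and with Table 9; all function names are hypothetical.

```python
def exchange_control_words():
    """Control word for CF group n (CF0 first): select the nibble 8 places away."""
    return [(n + 8) % 16 for n in range(16)]


def pack_for_cntr(cf_words):
    """Arrange sixteen control words as the four CNTR15-CNTR0 load words of Table 7."""
    return [
        sum(cf_words[4 * field + addr] << (4 * field) for field in range(4))
        for addr in range(4)
    ]


def apply_switch(data, cf_words):
    """Behavioral model: output nibble n takes the input nibble named by CF group n."""
    nibbles = [(data >> (4 * n)) & 0xF for n in range(16)]
    out = 0
    for n, select in enumerate(cf_words):
        out |= nibbles[select] << (4 * n)
    return out


words = exchange_control_words()
load_words = pack_for_cntr(words)           # 0x40C8, 0x51D9, 0x62EA, 0x73FB
swapped = apply_switch(0x0123456789ABCDEF, words)
```

The four packed words match the CNTR patterns listed for the MSH/LSH exchange (0100 0000 1100 1000 and so on), and the model returns the sample word with its two 32-bit halves interchanged.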


The control flip-flops could also have been loaded from the data input nibbles in one CRCLK cycle. Input
nibbles from one half are mapped onto the control flip-flops of the other half. All control words to set up
a switching pattern should be loaded before the bank of flip-flops is selected as control source. The
microcode instructions to load bank 1 with the 16 control words in one cycle are presented in Example 3.

Example 3. Loading the MSH/LSH Exchange from Data Inputs

CRWRITE2  CRWRITE1  CRWRITE0  SELDMS  SELDLS  WE   OED15-OED0
0         0         1         1       1       0    1111 1111 1111 1111

These control nibbles may be loaded from the input as a 64-bit real-time input word or as two 32-bit words
stored previously. To use stored control words, MSCLK and LSCLK are used to load the LSH and MSH
input registers with the correct sequence of control nibbles. Whenever the flip-flops are loaded from the
data inputs, all 64 bits of control data must be present when the CRCLK is used so that all control nibbles
in a program are loaded simultaneously. Example 4 presents the three microcode instructions to load the
MSH and LSH input registers and then to pass the registered data to flip-flop bank 2.
Example 4. Loading Control Flip-Flops from Input Registers

INST.
NO.    CRSRCE  CRWRITE2  CRWRITE1  CRWRITE0  WE  SELDMS  SELDLS  OED15-OED0  CRCLK  MSCLK  LSCLK  COMMENTS
1      X       X         X         X         1   X       X       1           None   ↑      None   Load inputs D63-D32
2      X       X         X         X         1   X       X       1           None   None   ↑      Load inputs D31-D0
3      1       0         1         0         0   0       0       1           ↑      None   None   Load control bank 2

The control words in a program can also be read back from the flip-flops using the CNTR outputs. Four
instructions are necessary to read the 64 bits in a bank of flip-flops out on CNTR15-CNTR0. WE is held
high and OEC is taken low. No CRCLK signal is required. CREAD2-CREAD0 select bank 2 of flip-flops,
and CRADR1-CRADR0 select in sequence the four addresses of the 16-bit words to be read out on the
CNTR outputs. Example 5 shows the four microcode instructions.
Example 5. Reading Control Settings on CNTR Outputs

INST.                                                       CNTR I/O NUMBERS
NO.    CREAD2  CREAD1  CREAD0  OEC  CRADR1  CRADR0  WE   15-12 11-8  7-4   3-0    COMMENT
1      0       1       0       0    0       0       1    0100  0000  1100  1000   Read CF12, CF8, CF4, CF0
2      0       1       0       0    0       1       1    0101  0001  1101  1001   Read CF13, CF9, CF5, CF1
3      0       1       0       0    1       0       1    0110  0010  1110  1010   Read CF14, CF10, CF6, CF2
4      0       1       0       0    1       1       1    0111  0011  1111  1011   Read CF15, CF11, CF7, CF3
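
A controlling program that saves and restores switch settings would need to turn the four words read back on CNTR15-CNTR0 into the sixteen per-group control words. A minimal sketch, assuming the same nibble-to-CF-group layout as Table 7 and a hypothetical helper name:

```python
def cf_words_from_readback(cntr_words):
    """Unpack four 16-bit read-back words (CRADR address 0 first) into CF0..CF15."""
    if len(cntr_words) != 4:
        raise ValueError("expected four CNTR words, one per CRADR1-CRADR0 address")
    cf = [0] * 16
    for addr, word in enumerate(cntr_words):
        for field in range(4):
            # Bits 4*field+3 .. 4*field of the word at address a belong to CF(4*field + a).
            cf[4 * field + addr] = (word >> (4 * field)) & 0xF
    return cf


# The four words of the MSH/LSH exchange pattern, address 0 first.
readback = [0b0100000011001000, 0b0101000111011001,
            0b0110001011101010, 0b0111001111111011]
restored = cf_words_from_readback(readback)
```

For this pattern the result is the list 8 through 15 followed by 0 through 7, that is, CF0 = 1000 up to CF15 = 0111.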

absolute maximum ratings over operating free-air temperature range (unless otherwise noted)†
Supply voltage, VCC .............................................. -0.5 V to 6 V
Input clamp current, IIK (VI < 0 or VI > VCC) ............................ ±20 mA
Output clamp current, IOK (VO < 0 or VO > VCC) ........................... ±50 mA
Continuous output current, IO (VO = 0 to VCC) ............................ ±50 mA
Continuous current through VCC or GND pins .............................. ±100 mA
Operating free-air temperature range ................................ 0°C to 70°C
Storage temperature range ....................................... -65°C to 150°C

†Stresses beyond those listed under "absolute maximum ratings" may cause permanent damage to the device. These are stress ratings
only and functional operation of the device at these or any other conditions beyond those indicated under "recommended operating
conditions" is not implied. Exposure to absolute-maximum-rated conditions for extended periods may affect device reliability.

recommended operating conditions

       PARAMETER                            MIN    NOM    MAX    UNIT
VCC    Supply voltage                       4.5    5.0    5.5    V
VIH    High-level input voltage             2             VCC    V
VIL    Low-level input voltage              0             0.8    V
IOH    High-level output current                          -8     mA
IOL    Low-level output current                           8      mA
VI     Input voltage                        0             VCC    V
VO     Output voltage                       0             VCC    V
dt/dv  Input transition rise or fall rate   0             15     ns/V
TA     Operating free-air temperature       0             70     °C

electrical characteristics over recommended operating free-air temperature range (unless otherwise
noted)
PARAMETER

TEST CONDITIONS

10H

~

- 20 ,A

~

-8 mA

VOH
10H

~

IOL

20 ,A

VOL
10L ~ 8 mA
10Z

Vo ~ Vee or 0

II

VI = Vee or 0
VI - Vee or 0, 10

lee
el
tThis is the increase

VI
In

~

Vee or 0

vee

25·C

TA MIN

TYP

MAX

4.5 V

MIN

TYP

MAX

5.5 V

5.4

4.5 V

3.8

3.7

5.5 V

4.8

4.7

V

4.5 V

0.1

5.5 V

0.1

4.5 V

0.32

0.4

5.5 V

0.32

0.4

5V
5.5 V

UNIT

4.4

±O.5
0.1

5.5 V

V

±O.S

,A

±1

,A

100

"A
pF

5V

supply current for each input that is at one of the specified TTL voltage levels rather than 0 V or

Vee.


switching characteristics over recommended ranges of supply voltage and operating free-air temperature
(unless otherwise noted)

PARAMETER  FROM               TO             MIN   TYP†   MAX   UNIT
tpd        Data in            Data out             7      14    ns
           MSCLK, LSCLK       Data out             10     18
           SELDMS, SELDLS     Data out             9      15
           CRCLK              Data out             12     19
           CRSEL3-CRSEL0      Data out             12     19
           CRCLK              CNTRn                10     18
           CREAD2-CREAD0      CNTRn                10     18
           CRADR1, CRADR0     CNTRn                8      16
           TP1, TP0           All outputs          10     19
ten        TP1, TP0           All outputs          10     15    ns
           OED                Data out             7      12
           OEC                CNTRn                8      14
tdis       TP1, TP0           All outputs          10     15    ns
           OED                Data out             5      8
           OEC                CNTRn                6      10

†All typical values are at VCC = 5 V, TA = 25°C.

timing requirements over recommended ranges of supply voltage and operating free-air temperature
(unless otherwise noted)
PARAMETER
'w

Pulse duration

MIN

LSCLK. MSCLK. CRCLK h;gh or low

7

Data

7
7

CNTRn
'su

Setup time before CRClK

SELDMS. SELDLS

9

CRADR1.CRADRO

8

CRSRCE. CRWRITE2-CRWRITEO

8

LSCLK. MSCLK

'su

'h

Hold time after CRCLK


'h

B

Data

0

7
CNTRn

0

SELDMS. SELDLS

0

CRADR1. CRADRO

0

CRSRCE. CRWRITE

0

WE

0
0

Hold time, data after LSCLK or MSCLK


UNIT
ns

ns

10

WE
Setup time, data before LSCLK or MSCLK

MAX

ns

ns

ns


'AS8840 AND 'ACT8841 FUNCTIONAL COMPARISON
differences between the SN74AS8840 and the SN74ACT8841
The SN74AS8840 and the SN74ACT8841 digital crossbar switches essentially perform the same function.
The SN74AS8840 and the SN74ACT8841 are based on the same 16-port architecture, differing in the
number of control registers, power consumption, and pin-out.
One difference is in the number of programmable control flip-flop banks available to configure the switch.
The 'AS8840 has two programmable control banks, while the 'ACT8841 has eight. Both have two
selectable hard-wired switching configurations.
The increased number of control banks in the 'ACT8841 requires six additional pins not found on the
'AS8840. These are: CRWRITE2, CRWRITE1, CREAD2, CREAD1, CRSEL3, and CRSEL2. CREAD and
CRWRITE on the '8840 become CREAD0 and CRWRITE0 on the '8841. On the '8840, CRSEL1 selects
the hard-wired control functions when high. This function is performed by the CRSEL3 signal on the '8841.
Therefore, CRSEL2 and CRSEL1 are actually the added signals.
The 'ACT8841 is a low-power CMOS device requiring only 5-V power. Because of its STL internal logic
and TTL I/Os, the 'AS8840 requires both 2-V and 5-V power.
Both the 'AS8840 and the 'ACT8841 are in 156-pin grid-array packages; however, the two devices are
not pin-for-pin compatible. Control signals were added to the 'ACT8841 and the 2-V VCC pins ('AS8840
only) were assigned other functions in the 'ACT8841.

changing 'AS8840 microcode to 'ACT8841 microcode
Since only six signals have been added to the 'ACT8841, changing existing 'AS8840 microcode to
'ACT8841 microcode is straightforward. CRSEL3 on the 'ACT8841 is functionally equivalent to CRSEL1
on the 'AS8840. CREAD2, CREAD1, CRWRITE2, CRWRITE1, CRSEL2, and CRSEL1 bits must be added.
These can always be 0 if no additional control banks are needed. Additional control configurations can
be stored by programming these bits.
All other signals in the 'AS8840 microcode remain the same when converting to 'ACT8841 microcode.
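
The conversion rules above can be expressed as a small field-renaming routine. This is a sketch under stated assumptions: a microinstruction is represented as a Python dict keyed by signal name, which is a convenience for illustration and not a format defined in this manual, and the function name is hypothetical.

```python
ADDED_BITS = ("CREAD2", "CREAD1", "CRWRITE2", "CRWRITE1", "CRSEL2", "CRSEL1")


def as8840_to_act8841(fields):
    """Convert one 'AS8840 microinstruction (dict of signal -> bit) to 'ACT8841 form."""
    out = dict(fields)
    # CREAD and CRWRITE on the '8840 become CREAD0 and CRWRITE0 on the '8841.
    out["CREAD0"] = out.pop("CREAD")
    out["CRWRITE0"] = out.pop("CRWRITE")
    # The hard-wired select moves from CRSEL1 ('8840) to CRSEL3 ('8841).
    out["CRSEL3"] = out.pop("CRSEL1")
    # The six added bits stay 0 when no additional control banks are needed.
    for name in ADDED_BITS:
        out[name] = 0
    return out


old_word = {"CRSRCE": 0, "CRWRITE": 1, "CREAD": 0, "CRSEL1": 1, "CRSEL0": 0, "WE": 0}
new_word = as8840_to_act8841(old_word)
```

All signals not named in the conversion rules (CRSRCE, CRSEL0, WE, and so on) pass through unchanged.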


SN74ACT8847

64-Bit Floating Point/Integer Processor


SN74ACT8847
64-Bit Floating Point Unit
• Meets IEEE Standard for Single- and Double-Precision Formats
• Performs Floating Point and Integer Add, Subtract, Multiply, Divide, Square Root, and Compare
• 64-Bit IEEE Divide in 11 Cycles, 64-Bit Square Root in 14 Cycles
• Performs Logical Operations and Logical Shifts
• Superset of TI's SN74ACT8837
• 30-ns, 40-ns and 50-ns Pipelined Performance
• Low-Power EPIC™ CMOS

The SN74ACT8847 is a high-speed, double-precision floating point and integer
processor. It performs high-accuracy, scientific computations as part of a
customized host processor or as a powerful stand-alone device. Its advanced
math processing capabilities allow the chip to accelerate the performance of both
CISC- and RISC-based systems.
High-end computer systems, such as graphics workstations, mini-computers and
32-bit personal computers, can utilize the single-chip 'ACT8847 for both floating
point and integer functions.

EPIC is a trademark of Texas Instruments Incorporated.



List of Illustrations (Continued)
Figure
27   Single-Precision Independent ALU Operation, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = X)
28   Double-Precision Independent ALU Operation, All Registers Disabled (PIPES2-PIPES0 = 111, CLKMODE = 0)
29   Double-Precision Independent ALU Operation, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = 0)
30   Double-Precision Independent ALU Operation, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = 1)
31   Double-Precision Independent ALU Operation, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = 0)
32   Single-Precision Independent Multiplier Operation, All Registers Disabled (PIPES2-PIPES0 = 111, CLKMODE = X)
33   Single-Precision Independent Multiplier Operation, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = X)
34   Single-Precision Independent Multiplier Operation, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = X)
35   Single-Precision Independent Multiplier Operation, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = X)
36   Double-Precision Independent Multiplier Operation, All Registers Disabled (PIPES2-PIPES0 = 111, CLKMODE = 0)
37   Double-Precision Independent Multiplier Operation, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = 1)
38   Double-Precision Independent Multiplier Operation, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = 0)
39   Double-Precision Independent Multiplier Operation, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = 0)
40   Single-Precision Floating Point Division, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = X)
41   Single-Precision Floating Point Division, Input and Pipeline Registers Enabled (PIPES2-PIPES0 = 100, CLKMODE = X)
42   Single-Precision Floating Point Division, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = X)
43   Single-Precision Floating Point Division, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = X)
44   Double-Precision Floating Point Division, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = 0)
45   Double-Precision Floating Point Division, Input and Pipeline Registers Enabled (PIPES2-PIPES0 = 100, CLKMODE = 0)
46   Double-Precision Floating Point Division, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = 1)
47   Double-Precision Floating Point Division, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = 1)
48   Integer Division, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = X)
49   Integer Division, Input and Pipeline Registers Enabled (PIPES2-PIPES0 = 100, CLKMODE = X)
50   Integer Division, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = X)
51   Integer Division, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = X)
52   Single-Precision Floating Point Square Root, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = X)
53   Single-Precision Floating Point Square Root, Input and Pipeline Registers Enabled (PIPES2-PIPES0 = 100, CLKMODE = X)
54   Single-Precision Floating Point Square Root, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = X)
55   Single-Precision Floating Point Square Root, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = X)
56   Double-Precision Floating Point Square Root, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = 1)
57   Double-Precision Floating Point Square Root, Input and Pipeline Registers Enabled (PIPES2-PIPES0 = 100, CLKMODE = 0)
58   Double-Precision Floating Point Square Root, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = 1)
59   Double-Precision Floating Point Square Root, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = 0)
60   Integer Square Root, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = X)
61   Integer Square Root, Input and Pipeline Registers Enabled (PIPES2-PIPES0 = 100, CLKMODE = X)
62   Integer Square Root, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = X)
63   Integer Square Root, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = X)
64   Single-Precision Chained Mode Operation, All Registers Disabled (PIPES2-PIPES0 = 111, CLKMODE = X)
65   Single-Precision Chained Mode Operation, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = 1)
66   Single-Precision Chained Mode Operation, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = X)
67   Single-Precision Chained Mode Operation, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = X)
68   Double-Precision Chained Mode Operation, All Registers Disabled (PIPES2-PIPES0 = 111, CLKMODE = 0)
69   Double-Precision Chained Mode Operation, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = 1)
70   Double-Precision Chained Mode Operation, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = 0)
71   Double-Precision Chained Mode Operation, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = 0)
72   Sequence of Matrix Operations
73   Resultant Matrix Transformation
74   SN74ACT8837 Floating Point Unit
75   SN74ACT8847 Floating Point Unit
76   Creating a 3-D Image
77   View Volume
78a  Model of Procedure for Creating a 3-D Graphic
78b  Model of Creating and Transforming a 3-D Graphic
79   Viewing Pyramid Showing Six Clipping Planes

 B Comparison Function Table .................. .
Data Flow for Accept/Reject Testing ................ .
Data Flow for the X Processor. . . . . . . . . . . . . . . . . . . . . .
Program Listing for the X Processor ................. .
Summary of Graphics Systems Performance .......... .
Available Options for Graphic System Designs ......... .


Overview
Using a top-down approach, this user guide contains the following major sections:
Introduction (to Microprogrammed Architectures and the 'ACT8847)
SN74ACT8847 Architecture
Microprogramming the 'ACT8847
Easy-to-Access Reference Guide
Application Notes
The SN74ACT8847 combines a multiplier and an arithmetic-logic unit in a single
microprogrammable VLSI device. The 'ACT8847 is implemented in Texas Instruments
one-micron CMOS technology to offer high speed and low power consumption with
exceptional flexibility and functional integration. The FPUs can be microprogrammed
to operate in multiple modes to support a variety of floating point applications.
The 'ACT8847 is fully compatible with the IEEE standard for binary floating point
arithmetic, STD 754-1985. This FPU performs both single- and double-precision
operations, integer operations, logical operations, and division and square root
operations (as single microinstructions).

Understanding the 'ACT8847 Floating Point Unit
To support floating point processing in IEEE format, the 'ACT8847 may be configured
for either single- or double-precision operation. Instruction inputs can be used to select
three modes of operation, including independent ALU operations, independent multiplier
operations, or simultaneous ALU and multiplier operations.
Three levels of internal data registers are available. The device can be used in
flowthrough mode (all registers disabled), pipelined mode (all registers enabled), or
in other available register configurations. An instruction register, a 64-bit constant
register, and a status register are also provided.
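
The register configurations can be summarized with a small decoder for the PIPES2-PIPES0 controls. The bit assignment used here is inferred from the figure titles and switching tables elsewhere in this section (110 = input registers enabled, 100 = input and pipeline registers enabled, 010 = input and output registers enabled), so it should be read as an assumption to be checked against the pin descriptions; the function name is hypothetical.

```python
def register_config(pipes):
    """Decode a 3-bit PIPES2-PIPES0 value into the enabled register levels.

    Assumed encoding (inferred, not quoted from a pin table): a low bit enables
    a level. PIPES0 -> input registers, PIPES1 -> pipeline registers,
    PIPES2 -> output registers.
    """
    if not 0 <= pipes <= 0b111:
        raise ValueError("PIPES2-PIPES0 is a 3-bit field")
    return {
        "input": not (pipes & 0b001),
        "pipeline": not (pipes & 0b010),
        "output": not (pipes & 0b100),
    }


flowthrough = register_config(0b111)   # all registers disabled
pipelined = register_config(0b000)     # all registers enabled
```

Under this reading, 111 corresponds to flowthrough mode and 000 to fully pipelined mode, with the remaining codes giving the intermediate configurations.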
Each FPU can handle three types of data input formats. The ALU accepts data operands
in integer format or IEEE floating point format. A third type of operand, denormalized
numbers, can also be processed after the ALU has converted them to "wrapped"
numbers, which are explained in detail in a later section. The 'ACT8847 multiplier
operates on normalized floating point numbers, wrapped numbers, and integer
operands.

Microprogramming the 'ACT8847
The 'ACT8847 is a fully microprogrammable device. Each FPU operation is specified
by a microinstruction or sequence of microinstructions which set up the control inputs
of the FPU so that the desired operation is performed.

Support Tools
Texas Instruments has developed functional evaluation models of the 'ACT8847 in
software which permit designers to simulate operation of the FPU. To evaluate the
functions of an FPU, a designer can create a microprogram with sample data inputs,
and the simulator will emulate FPU operation to produce sample data output files, as
well as several diagnostic displays to show specific aspects of device operation. Sample
microprogram sequences are included in this section.

Design Support
Texas Instruments Regional Technology Centers, staffed with systems-oriented
engineers, offer a training course to assist users of TI LSI products and their application
to digital processor systems. Specific attention is given to the understanding and
generation of design techniques which implement efficient algorithms designed to
match high-performance hardware capabilities with desired performance levels.
Information on VLSI devices and product support can be obtained from the following
Regional Technology Centers:
Atlanta
Texas Instruments Incorporated
3300 N.E. Expressway, Building 8
Atlanta, GA 30341
404/662-7945

Chicago
Texas Instruments Incorporated
515 Algonquin
Arlington Heights, IL 60005
312/640-2909

Boston
Texas Instruments Incorporated
950 Winter Street, Suite 2800
Waltham, MA 02154
617/895-9100

Dallas
Texas Instruments Incorporated
10001 E. Campbell Road
Richardson, TX 75081
214/680-5066

Northern California
Texas Instruments Incorporated
5353 Betsy Ross Drive
Santa Clara, CA 95054
408/748-2220

Southern California
Texas Instruments Incorporated
17891 Cartwright Drive
Irvine, CA 92714
714/660-8140


Design Expertise


Texas Instruments can provide in-depth technical design assistance through
consultations with contract design services. Contact your local Field Sales Engineer
for current information or contact VLSI Systems Engineering at 214/997-3970.


'ACT8847 Logic Symbol
[Logic symbol: 'ACT8847 64-Bit Floating Point Unit. Inputs shown include CLK, CLKC, CLKMODE, BYTEP, CONFIG1-0, RND1-0, TP1-0, FAST, RESET, HALT, PIPES2-PIPES0, SRCC, ENRC, FLOWC, SELOP7-0, SELST1-0, SELMS/LS, ENRA, ENRB, OES, OEC, OEY, the instruction inputs, DA31-DA0 and DB31-DB0 with parity PA3-0 and PB3-0. Outputs shown include Y31-Y0 with parity PY3-0, PERRA, PERRB, MSERR, the comparison status (UNORD, AGTB, AEQB), and the exception and other status signals (ED, DIVBY0, IVAL, INEX, OVER, UNDER, DENORM, DENIN, RNDCO, SRCEX, CHEX, STEX1-0, NEG, INF).]

'ACT8847 Specifications
absolute maximum ratings over operating free-air temperature range
(unless otherwise noted)†
Supply voltage, VCC ................................... -0.5 V to 6 V
Input clamp current, IIK (VI < 0 or VI > VCC) ................. ±20 mA
Output clamp current, IOK (VO < 0 or VO > VCC) ................ ±50 mA
Continuous output current, IO (VO = 0 to VCC) ................. ±50 mA
Continuous current through VCC or GND pins ................... ±100 mA
Operating free-air temperature range ..................... 0°C to 70°C
Storage temperature range ............................ -65°C to 150°C
†Stresses beyond those listed under "absolute maximum ratings" may cause permanent damage
to the device. These are stress ratings only and functional operation of the device at these or
any other conditions beyond those indicated under "recommended operating conditions" is
not implied. Exposure to absolute-maximum-rated conditions for extended periods may affect
device reliability.

recommended operating conditions

                                                 SN74ACT8847
       PARAMETER                            MIN    NOM    MAX    UNIT
VCC    Supply voltage                       4.75   5.0    5.25   V
VIH    High-level input voltage             2             VCC    V
VIL    Low-level input voltage              0             0.8    V
IOH    High-level output current                          -8     mA
IOL    Low-level output current                           8      mA
VI     Input voltage                        0             VCC    V
VO     Output voltage                       0             VCC    V
dt/dv  Input transition rise or fall rate   0             15     ns/V
TA     Operating free-air temperature       0             70     °C

electrical characteristics over recommended operating free-air
temperature range (unless otherwise noted)
PARAMETER

TEST CONDITIONS
10H = -20 pA

VOH
10H = -8 rnA
10L = 20 p.A
VOL
10L = 8 rnA

VCC
4.75 V
5.25 V

4.74

4.55

5.24

5.05

5.25 V

4.7
0.01

5.25 V
4.75 V

0.01

10Z

VI = VCC or 0, 10

5.25 V

ICCQ
Ci

VI = VCC or 0, 10

5.25 V
5V

UNIT

V

,

,,;;;::s:"

?~"

0.10
0.10

d(t:Y

,

0.45

....')"

5.25 V
5.25 V

TYP MAX

3.7

4.75 V

VI = VCC or 0


MIN

4.75 V

II

Vi = VCC or 0

SN74ACT8847

TA - 25°C
MIN TYP MAX

V

0.45

{~J

,;;;+('
10

±5

p.A

±10

p.A

200

p.A

pF

switching characteristics

                                                           PIPELINE
                 FROM          TO                          CONTROLS         SN74ACT8847-30
NO.  PARAMETER   (INPUT)       (OUTPUT)                    PIPES2-PIPES0    MIN     MAX      UNIT
1    tpd1        DA/DB/Inst    Y OUTPUT                    111                      †        ns
2    tpd2        INPUT REG     Y OUTPUT                    110                      70       ns
                 INPUT REG     STATUS                      110                      70
3    tpd3        PIPELN REG    Y OUTPUT                    10X                      48       ns
                 PIPELN REG    STATUS                      10X                      48
4    tpd4        OUTPUT REG    Y OUTPUT                    0XX                      20       ns
                 OUTPUT REG    STATUS                      0XX                      20
5    tpd5        SELMS/LS      Y OUTPUT                    XXX                      18       ns
6    tpd6        CLK↑          Y OUTPUT INVALID            all but 111      3.0              ns
7    tpd7        CLK↑          STATUS INVALID              all but 111      3.0              ns
8    tpd8        SELMS/LS      Y OUTPUT INVALID            XXX              1.5              ns
9    td1‡        CLK↑          CLK↑                        010              56               ns
10   td2‡        CLK↑          CLK↑                        000              30               ns
11   td3         Delay time, CLKC after CLK to insure data captured in C register is data
                 clocked into sum or product register by that clock
                 (PIPES2-PIPES0 = 0XX)                                      12      td-0§    ns
12   ten1        OEY           Y OUTPUT                    XXX                      12       ns
13   ten2        OEC, OES      STATUS                      XXX                      12       ns
14   tdis1       OEY           Y OUTPUT                    XXX                      12       ns
15   tdis2       OEC, OES      STATUS                      XXX                      12       ns

†This parameter no longer tested and will be deleted on next Data Manual revision.
‡Minimum clock cycle period not guaranteed when operands are fed back using FLOWC to bypass
the C register and operands are used on the same clock cycle.
§td is the clock cycle period.

setup and hold times

                                              PIPELINE CONTROLS   SN74ACT8847-30
NO.  PARAMETER                                PIPES2-PIPES0       MIN    MAX    UNIT
16   tsu1   Inst/control before CLK↑          XX0                 12            ns
17   tsu2   DA/DB before CLK↑                 XX0                 11            ns
18   tsu3   DA/DB before 2nd CLK↑ (DP)        XX1                 40            ns
19   tsu4   CONFIG1-0 before CLK↑             XX0                 12            ns
20   tsu5   SRCC before CLKC↑                 XXX                 10            ns
21   tsu6   RESET before CLK↑                 XX0                 12            ns
22   th1    Inst/control after CLK↑           XXX                 1             ns
23   th2    DA/DB after CLK↑                  XXX                 1             ns
24   th3    SRCC after CLKC↑                  XXX                 1             ns
25   th4    RESET after CLK↑                  XX0                 6             ns

CLK/RESET requirements

                                    SN74ACT8847-30
PARAMETER                           MIN    MAX    UNIT
tw   Pulse duration   CLK high      10            ns
                      CLK low       10
                      RESET         10

switching characteristics

                                                      PIPELINE CONTROLS   SN74ACT8847-40
NO.  PARAMETER  FROM (INPUT)  TO (OUTPUT)             PIPES2-PIPES0       MIN    MAX       UNIT
 1   tpd1       DA/DB/Inst    Y OUTPUT                111                        †         ns
 2   tpd2       INPUT REG     Y OUTPUT                110                        90        ns
                INPUT REG     STATUS                  110                        90
 3   tpd3       PIPELN REG    Y OUTPUT                10X                        60        ns
                PIPELN REG    STATUS                  10X                        60
 4   tpd4       OUTPUT REG    Y OUTPUT                0XX                        24        ns
                OUTPUT REG    STATUS                  0XX                        24
 5   tpd5       SELMS/LS      Y OUTPUT                XXX                        20        ns
 6   tpd6       CLK↑          Y OUTPUT INVALID        all but 111         3.0              ns
 7   tpd7       CLK↑          STATUS INVALID          all but 111         3.0              ns
 8   tpd8       SELMS/LS      Y OUTPUT INVALID        XXX                 1.5              ns
 9   td1‡       CLK↑          CLK↑                    010                 72               ns
10   td2‡       CLK↑          CLK↑                    000                 40               ns
11   td3        Delay time, CLKC after CLK, to ensure
                that data captured in the C register
                is the data clocked into the sum or
                product register by that clock        0XX                 16     td - 0§   ns
12   ten1       OEY           Y OUTPUT                XXX                        16        ns
13   ten2       OEC, OES      STATUS                  XXX                        16        ns
14   tdis1      OEY           Y OUTPUT                XXX                        16        ns
15   tdis2      OEC, OES      STATUS                  XXX                        16        ns

† This parameter is no longer tested and will be deleted in the next Data Manual revision.
‡ Minimum clock cycle period is not guaranteed when operands are fed back using FLOWC to bypass
the C register and operands are used on the same cycle.
§ td is the clock cycle period.

setup and hold times

                                              PIPELINE CONTROLS   SN74ACT8847-40
NO.  PARAMETER                                PIPES2-PIPES0       MIN    MAX    UNIT
16   tsu1   Inst/control before CLK↑          XX0                 14            ns
17   tsu2   DA/DB before CLK↑                 XX0                 13            ns
18   tsu3   DA/DB before 2nd CLK↑ (DP)        XX1                 52            ns
19   tsu4   CONFIG1-0 before CLK↑             XX0                 14            ns
20   tsu5   SRCC before CLKC↑                 XXX                 14            ns
21   tsu6   RESET before CLK↑                 XX0                 14            ns
22   th1    Inst/control after CLK↑           XXX                 3             ns
23   th2    DA/DB after CLK↑                  XXX                 3             ns
24   th3    SRCC after CLKC↑                  XXX                 3             ns
25   th4    RESET after CLK↑                  XX0                 6             ns

CLK/RESET requirements

                                    SN74ACT8847-40
PARAMETER                           MIN    MAX    UNIT
tw   Pulse duration   CLK high      15            ns
                      CLK low       15
                      RESET         12

switching characteristics

                                                      PIPELINE CONTROLS   SN74ACT8847-50
NO.  PARAMETER  FROM (INPUT)  TO (OUTPUT)             PIPES2-PIPES0       MIN    MAX       UNIT
 1   tpd1       DA/DB/Inst    Y OUTPUT                111                        †         ns
 2   tpd2       INPUT REG     Y OUTPUT                110                        120       ns
                INPUT REG     STATUS                  110                        120
 3   tpd3       PIPELN REG    Y OUTPUT                10X                        75        ns
                PIPELN REG    STATUS                  10X                        75
 4   tpd4       OUTPUT REG    Y OUTPUT                0XX                        36        ns
                OUTPUT REG    STATUS                  0XX                        36
 5   tpd5       SELMS/LS      Y OUTPUT                XXX                        24        ns
 6   tpd6       CLK↑          Y OUTPUT INVALID        all but 111         3.0              ns
 7   tpd7       CLK↑          STATUS INVALID          all but 111         3.0              ns
 8   tpd8       SELMS/LS      Y OUTPUT INVALID        XXX                 1.5              ns
 9   td1‡       CLK↑          CLK↑                    010                 100              ns
10   td2‡       CLK↑          CLK↑                    000                 50               ns
11   td3        Delay time, CLKC after CLK, to ensure
                that data captured in the C register
                is the data clocked into the sum or
                product register by that clock        0XX                 16     td - 0§   ns
12   ten1       OEY           Y OUTPUT                XXX                        20        ns
13   ten2       OEC, OES      STATUS                  XXX                        20        ns
14   tdis1      OEY           Y OUTPUT                XXX                        20        ns
15   tdis2      OEC, OES      STATUS                  XXX                        20        ns

† This parameter is no longer tested and will be deleted in the next Data Manual revision.
‡ Minimum clock cycle period is not guaranteed when operands are fed back using FLOWC to bypass
the C register and operands are used on the same cycle.
§ td is the clock cycle period.

setup and hold times

                                              PIPELINE CONTROLS   SN74ACT8847-50
NO.  PARAMETER                                PIPES2-PIPES0       MIN    MAX    UNIT
16   tsu1   Inst/control before CLK↑          XX0                 16            ns
17   tsu2   DA/DB before CLK↑                 XX0                 16            ns
18   tsu3   DA/DB before 2nd CLK↑ (DP)        XX1                 75            ns
19   tsu4   CONFIG1-0 before CLK↑             XX0                 18            ns
20   tsu5   SRCC before CLKC↑                 XXX                 16            ns
21   tsu6   RESET before CLK↑                 XX0                 16            ns
22   th1    Inst/control after CLK↑           XXX                 3             ns
23   th2    DA/DB after CLK↑                  XXX                 3             ns
24   th3    SRCC after CLKC↑                  XXX                 3             ns
25   th4    RESET after CLK↑                  XX0                 6             ns

CLK/RESET requirements

                                    SN74ACT8847-50
PARAMETER                           MIN    MAX    UNIT
tw   Pulse duration   CLK high      15            ns
                      CLK low       15
                      RESET         15
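
The clock limits in the three speed-grade tables can be checked against a proposed clock with a few lines of code. The following is a minimal sketch in Python, assuming the minimum cycle times (td1, td2) and minimum CLK pulse durations (tw) as read from the tables above. The names MIN_CYCLE_NS, MIN_PULSE_NS, and clock_violations are hypothetical, and the feedback condition in footnote ‡ is not modeled.

```python
# Minimum clock cycle (ns) by speed grade and PIPES2-PIPES0 setting,
# transcribed from parameters td1 (PIPES = 010) and td2 (PIPES = 000).
MIN_CYCLE_NS = {
    "-30": {0b010: 56, 0b000: 30},
    "-40": {0b010: 72, 0b000: 40},
    "-50": {0b010: 100, 0b000: 50},
}

# Minimum (CLK high, CLK low) pulse durations in ns, parameter tw.
MIN_PULSE_NS = {"-30": (10, 10), "-40": (15, 15), "-50": (15, 15)}


def clock_violations(grade, pipes, period_ns, high_ns):
    """Return a list of messages for each limit the proposed clock violates."""
    problems = []
    # Only the two PIPES settings listed in the tables have a cycle-time limit;
    # other settings are skipped rather than guessed.
    min_cycle = MIN_CYCLE_NS[grade].get(pipes)
    if min_cycle is not None and period_ns < min_cycle:
        problems.append(f"period {period_ns} ns is below the {min_cycle} ns minimum")
    min_high, min_low = MIN_PULSE_NS[grade]
    low_ns = period_ns - high_ns
    if high_ns < min_high:
        problems.append(f"CLK high {high_ns} ns is below the {min_high} ns minimum")
    if low_ns < min_low:
        problems.append(f"CLK low {low_ns} ns is below the {min_low} ns minimum")
    return problems


if __name__ == "__main__":
    # A 30-ns clock with a 50% duty cycle, fully pipelined, on two speed grades.
    print(clock_violations("-30", 0b000, 30, 15))
    print(clock_violations("-40", 0b000, 30, 15))
```

A check of this kind is only as good as the transcribed limits, so the dictionaries should be verified against the current data sheet before use.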

'ACT8847 Load Circuit
The load circuit for the 'ACT8847 is shown in Figure 1.

[Figure 1 schematic. Labels: TESTER PIN ELECTRONICS, FROM OUTPUT UNDER TEST, TEST, S1, CL.]

TIMING PARAMETER       CL†      IOL      IOH       VL       S1
ten    tPZH, tPZL      50 pF    1 mA     -1 mA     1.5 V    CLOSED
tdis   tPHZ, tPLZ      50 pF    16 mA    -16 mA    1.5 V    CLOSED
tpd                    50 pF    -        -         -        OPEN

† CL includes probe and test fixture capacitance.
NOTE: All input pulses are supplied by generators having the following characteristics:
PRR ≤ 1 MHz, Zo = 50 Ω, tr ≤ 6 ns, tf ≤ 6 ns.

Figure 1. Load Circuit

[Figure 2a timing diagram. Signals shown: clocks CLK, CLKC; data input buses DA31-0, DB31-0; data input controls CONFIG1-0, PIPES2-0, CLKMODE, ENRA, ENRB; input register contents RA, RB; instruction/controls including SELOP7-0. Interval numbers on the waveforms (for example 16, 17, 22, 23) correspond to the NO. column of the timing tables.]

NOTES: Assume the following mixed precision operation.
Single precision OP0 + OP1 = RA + RB → SUM1 → CREG, where OP0 is SP and OP1 is SP.
Mixed precision OP3 * OP2 = RA * RB → PRODUCT1, where OP3 is SP and OP2 is DP.
NOP (must be inserted).
Mixed precision (OP3 * OP2) + (OP0 + OP1) = PREG + CREG → SUM2 (DP), and then convert to SP.
Assume valid control signals for FAST, HALT = 1, PIPES2-0 = 000 (fully pipelined mode), RESET = 1, RND1-0, SELST1-0 = 11, TP1-0 = 11.

Figure 2a. Timing Diagram for: SP ALU → DP MULT → DP ALU → Convert DP to SP

[Figure 2b timing diagram. Signals shown: CREG controls SRCC, ENRC, FLOWC; internal register contents PRODUCT1 (DP), SUM1 (SP), SUM2 (DP), SUM2 (SP). Interval numbers on the waveforms (for example 16, 20, 22, 24) correspond to the NO. column of the timing tables.]

NOTES: Assume the following mixed precision operation.
Single precision OP0 + OP1 = RA + RB → SUM1 → CREG, where OP0 is SP and OP1 is SP.
Mixed precision OP3 * OP2 = RA * RB → PRODUCT1, where OP3 is SP and OP2 is DP.
NOP (must be inserted).
Mixed precision (OP3 * OP2) + (OP0 + OP1) = PREG + CREG → SUM2 (DP), and then convert to SP.
Assume valid control signals for FAST, HALT = 1, PIPES2-0 = 000 (fully pipelined mode), RESET = 1, RND1-0, SELST1-0 = 11, TP1-0 = 11.

Figure 2b. Timing Diagram for: SP ALU → DP MULT → DP ALU → Convert DP to SP

[Figure 3a timing diagram. Signals shown: clocks CLK, CLKC; data input buses DA31-0, DB31-0 (operands loaded as LSH and MSH halves); data input controls CONFIG1-0, PIPES2-0, CLKMODE, ENRA, ENRB; input register contents RA, RB; instruction/controls I10-0, SELOP7-0. Interval numbers on the waveforms (for example 16, 17, 22, 23) correspond to the NO. column of the timing tables.]

NOTES: Assume the following double precision operation.
OP0 + OP1 = RA + RB → SUM1 → CREG
(OP0 + OP1) * OP2 = SREG * RB → PRODUCT1
[(OP0 + OP1) * OP2] + (OP0 + OP1) = PREG + CREG → SUM2
Assume valid control signals for FAST, HALT = 1, PIPES2-0 = 000 (fully pipelined mode), RESET = 1, RND1-0, SELST1-0 = 11, TP1-0 = 11.

Figure 3a. Timing Diagram for: DP ALU → DP MULT → DP ALU

[Figure 3b timing diagram. Signals shown: CREG controls SRCC, ENRC; internal register contents ALU pipe, CREG, PREG, SREG (SUM1, PRODUCT1, SUM2); output controls OEY, SELMS/LS, OEC, OES; output buses Y31-0, STATUS. Interval numbers on the waveforms (for example 5, 12, 13, 14, 20, 24) correspond to the NO. column of the timing tables.]

NOTES: Assume the following double precision operation.
OP0 + OP1 = RA + RB → SUM1 → CREG
(OP0 + OP1) * OP2 = SREG * RB → PRODUCT1
[(OP0 + OP1) * OP2] + (OP0 + OP1) = PREG + CREG → SUM2
Assume valid control signals for FAST, HALT = 1, PIPES2-0 = 000 (fully pipelined mode), RESET = 1, RND1-0, SELST1-0 = 11, TP1-0 = 11.

Figure 3b. Timing Diagram for: DP ALU → DP MULT → DP ALU
[Figure 4a timing diagram. Signals shown: clocks CLK, CLKC; data input bus DA31-0; data input controls CLKMODE, ENRA, ENRB; input register contents RA, RB; instruction/controls SELOP7-0 = 1101 1011 and I10-0 (RA * CREG, PREG + RB). Interval number 23 on the waveforms corresponds to the NO. column of the timing tables.]

NOTES: Assume the following single precision operations.
(K * OP0) + OP1 = PRODUCT1 + OP1 → SUM1
(K * OP2) + OP3 = PRODUCT2 + OP3 → SUM2
(K * OP4) + OP5 = PRODUCT3 + OP5 → SUM3
(K * OP6) + OP7 = PRODUCT4 + OP7 → SUM4
Assume valid control signals for FAST, HALT = 1, PIPES2-0 = 010, RESET = 1, RND1-0, SELST1-0 = 11, TP1-0 = 11.

Figure 4a. Timing Diagram for: SP [(Scalar * Vector) + Vector]

[Figure 4b timing diagram. Signals shown: CREG controls SRCC, ENRC, FLOWC; internal register contents CREG (K, constant), PREG (PRODUCT1 through PRODUCT4), SREG (SUM1 through SUM4); output controls OEY, SELMS/LS, OEC, OES; output buses Y31-0 (SUM1 through SUM4), STATUS. Interval numbers on the waveforms (for example 4, 12, 13, 14, 15, 16, 20, 22, 24) correspond to the NO. column of the timing tables.]

NOTES: Assume the following single precision operations.
(K * OP0) + OP1 = PRODUCT1 + OP1 → SUM1
(K * OP2) + OP3 = PRODUCT2 + OP3 → SUM2
(K * OP4) + OP5 = PRODUCT3 + OP5 → SUM3
(K * OP6) + OP7 = PRODUCT4 + OP7 → SUM4
Assume valid control signals for FAST, HALT = 1, PIPES2-0 = 010, RESET = 1, RND1-0, SELST1-0 = 11, TP1-0 = 11.

Figure 4b. Timing Diagram for: SP [(Scalar * Vector) + Vector]

SN74ACT8847 64-Bit Floating Point Unit
Introduction
Designing with the SN74ACT8847 floating point unit (FPU) requires a thorough
understanding of computer architectures, microprogramming, and IEEE floating point
arithmetic, as well as a detailed knowledge of the 'ACT8847 itself. This introduction
presents a brief overview of the 'ACT8847 and discusses a number of issues that arise
when designing and programming with this FPU.

Major Architectural Features
The overall architecture for a floating point system is determined by a combination
of design factors. The principal consideration is the set of performance targets that
the floating point processor has to achieve, usually expressed in terms of clock cycle
period, operating mode (vector or scalar), and operand precision (32 bit, 64 bit, or
other). Of almost equal importance are design constraints of cost, complexity, chip
count, power consumption, and requirements for interfacing to other processors.
The architecture of the 'ACT8847 is optimized to satisfy several processing and
interface requirements. The FPU has two 32-bit input buses, the DA and DB data buses,
and one 32-bit output bus, the Y bus. This three-port design provides much greater
I/O bus bandwidth than can be achieved by a single-port device (one 32-bit I/O bus).
Two single-precision inputs can be simultaneously loaded on the input buses while
a result is being output on the Y bus.
Internally, the 'ACT8847 FPU consists of two main functional blocks: the multiplier
and the ALU (see Figure 5). Either the multiplier or the ALU can operate independently,
or the two functional units can be used simultaneously in "chained" mode. When
operating independently, each block of the FPU performs a separate set of arithmetic
or logical functions. The multiplier supports multiplication, division and square roots.
The ALU supports addition, subtraction, format conversions, logical operations, and
shifts. Integer division and integer square root require both the multiplier and the ALU;
the final result comes from the ALU.

In chained mode, a multiplier operation executes in parallel with an ALU operation.
Possible examples include calculations of a sum of products (multiply and accumulate)
or a product of sums (add and then multiply). The sum of products computation requires
a total of four operands: two new inputs to be multiplied, the sum of previous products,
and the current product to be added to the sum, as shown in Table 3.


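[Figure 5 block diagram. The 32-bit DA31-DA0 and DB31-DB0 input buses feed the configuration logic, which loads two 64-bit input registers. The input registers supply 64-bit operands to the multiplier and the ALU, whose 64-bit results pass through the Y MUX to the 32-bit Y31-Y0 output bus.]

Figure 5. High Level Block Diagram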
Table 3. Sum of Products Calculation

MULTIPLIER OPERATION    ALU OPERATION
A * B                   -
C * D                   (A * B) + 0
E * F                   (C * D) + (A * B)
  .                       .
  .                       .
  .                       .

Because the 'ACT8847 has multiple internal data paths and data registers, this sum
of products can be generated by simultaneous operations on new bus data and internal
feedback, without the necessity of storing either the previous accumulation or the
current product off chip. Data flow for the sum of products calculation is shown in
Figure 6.
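[Figure 6 data flow diagram. The multiplier forms A * B from new bus data while the ALU forms PREG + SREG from internal feedback.]

Figure 6. Multiply/Accumulate Operation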
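
The register-level behavior of the sum of products calculation can be illustrated in software. The following is a minimal sketch in Python, assuming a simplified model in which the product register (PREG) and sum register (SREG) both update on the same clock edge. The function name sum_of_products and the use of Python floats are illustrative assumptions, not part of the 'ACT8847 instruction set.

```python
def sum_of_products(pairs):
    """Model chained multiply/accumulate with product and sum registers."""
    preg = 0.0  # product register: result of the most recent multiply
    sreg = 0.0  # sum register: accumulation of earlier products
    trace = []
    # One extra idle cycle drains the last product from PREG into SREG.
    for a, b in list(pairs) + [(0.0, 0.0)]:
        # Within one clock cycle both units use the OLD register contents.
        new_sum = preg + sreg   # ALU: current product plus previous sum
        new_product = a * b     # multiplier: new operands from the buses
        # The clock edge then updates both registers together.
        preg, sreg = new_product, new_sum
        trace.append((preg, sreg))
    return sreg, trace


if __name__ == "__main__":
    total, trace = sum_of_products([(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)])
    for cycle, (p, s) in enumerate(trace, start=1):
        print(f"cycle {cycle}: PREG={p} SREG={s}")
    print("sum of products:", total)
```

Each loop iteration corresponds to one row of Table 3: the ALU always works on the product formed in the previous cycle, so neither the running sum nor the current product has to leave the chip.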

Data Flow in Pipelined Architectures
Several levels of internal data registers are available to segment the internal data paths
of the 'ACT8847. The most basic choice is whether to use the device in flowthrough
mode (with no internal registers enabled) or whether to enable one or more registers.
When none of the internal registers are enabled, the paths through the multiplier and
the ALU are not segmented. In this case, the delay from data input to result output
is the longest.
Enabling one or more registers divides the data paths so that data can be clocked into
internal registers, instead of from an external source to an external destination. Enabling
the input registers permits data and instruction inputs to be registered on chip. Also,
the hardware division and square root operations which the 'ACT8847 performs require
that the input registers be enabled.
In the main data paths, three sets of internal registers are available in the 'ACT8847:
input registers, pipeline registers in the multiplier and ALU logic blocks, and output
registers to capture results from the multiplier and the ALU. When all three levels of
data registers are enabled, the register-to-register delay inside the device is minimized.
This is the fastest operating mode, and in this configuration the 'ACT8847 is said
to be "fully pipelined." While one instruction is executing, the next instruction along
with its associated operands may be input to the device so that overlapped operations
occur (see Figure 7).
The selection of operating mode, from flowthrough to fully pipelined, determines the
latency from input to output, the number of clock cycles required for inputs to be
processed and results to appear. For each register level enabled in the data path, one
clock cycle is added to the latency from input to output.
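
The effect of register levels on latency can be seen in a small simulation. The following is a minimal sketch in Python, assuming the rule stated above (one clock cycle of latency per enabled register level) and assuming that a 0 bit in PIPES2-PIPES0 enables the corresponding level. The function run_pipeline is hypothetical; it models only when each result becomes visible, not the arithmetic performed in each stage.

```python
from collections import deque


def run_pipeline(operations, pipes):
    """Return (cycle, result) pairs for additions clocked through the device.

    operations: list of (a, b) operand pairs, one presented per clock cycle.
    pipes: 3-bit PIPES2-PIPES0 value; each 0 bit enables one register level.
    """
    depth = 3 - bin(pipes & 0b111).count("1")  # number of enabled levels
    regs = deque([None] * depth, maxlen=depth)
    results = []
    # Idle cycles after the last operation let the pipeline drain.
    feed = list(operations) + [None] * depth
    for cycle, op in enumerate(feed):
        if depth == 0:
            # Flowthrough: the result is visible in the same cycle.
            visible_cycle, leaving = cycle, op
        else:
            # Clock edge at the end of this cycle: every register takes the
            # value of the stage before it; the oldest entry drops out.
            regs.appendleft(op)
            visible_cycle, leaving = cycle + 1, regs[-1]
        if leaving is not None:
            a, b = leaving
            results.append((visible_cycle, a + b))
    return results


if __name__ == "__main__":
    ops = [(1, 2), (3, 4), (5, 6)]  # A + B, C + D, E + F
    print("fully pipelined:", run_pipeline(ops, 0b000))
    print("flowthrough:    ", run_pipeline(ops, 0b111))
```

In the fully pipelined case the first result appears three cycles after its operands, but a new result then follows on every cycle, which is the overlapped operation described above.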

[Figure 7 timing diagram. Rows shown: CLK; instruction inputs (A + B, C + D, E + F); data inputs (A, B; C, D; E, F); input register contents; pipeline register contents; output sum register contents (A + B, C + D, E + F). Each operation advances one register level per clock cycle.]

Figure 7. Example of Fully Pipelined Operation
Control Architectures for High-Speed Microprogrammed Architectures
A separate control circuit is required to sequence the operation of the 'ACT8847. A
sequencer function within the control circuit controls both the sequencer and FPU as
determined by FPU status outputs. Either a standard microsequencer such as the
SN74ACT8818, or a custom controller such as a PLA or gate array can be used to
control the FPU. Figure 8 shows an example block diagram for a PLA control circuit.
If a standard microsequencer is used, execution addresses for routines stored in the
microprogram memory are generated by the microsequencer. As its name implies,
microprogram memory stores the sequences of microinstructions which control FPU
execution. The 'ACT8847 can be programmed by generating all control bits in a given
microinstruction to select an FPU operation.
One possible control circuit for the 'ACT8847 consists of a microsequencer,
microprogram memory, and one or more microinstruction registers, together with status
logic as required to support a specific floating point implementation. A control circuit
without an instruction register is typically too slow for use with the 'ACT8847. At
least one microinstruction register is used to hold the current instruction being executed
by the FPU and sequencer (see Figure 9).
Inclusion of the microinstruction register divides the critical path from the sequencer
through the program memory to the FPU control inputs, permitting much faster
execution times. However, when all the internal registers of the FPU are enabled, FPU
operation may be fast enough to require a second register in the control circuit. In
this case, a register on the output bus of the sequencer captures each microprogram
address, and the microinstruction register captures each microinstruction (see
Figure 10).

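[Figure 8 block diagram. Blocks and signals shown: external control/status, programmable logic array (PLA), and the FPU with its DA and DB inputs, Y output, and STATUS outputs.]

Figure 8. PLA Control Circuit Example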

[Figure 9 block diagram. Blocks and signals shown: microsequencer, microcode address, microprogram memory, instruction register, status logic, and the 'ACT8847 FPU with 32-bit DA and DB inputs, 32-bit Y output, and STATUS outputs.]

Figure 9. Microprogrammed Architecture
[Figure 10 block diagram. Blocks and signals shown: microsequencer, address register, microcode address, microprogram memory, instruction register, status logic, and the 'ACT8847 FPU with 32-bit DA and DB inputs, 32-bit Y output, and STATUS outputs.]

Figure 10. Microprogrammed Architecture with Address Register

Introducing registers in the FPU data paths and the control circuit complicates I/O
timing, status output timing, the status logic and the microprogram for the FPU and
the sequencer. These timing relationships affect branches, jumps to subroutine, and
other operations depending on FPU status. Some of these programming issues are
discussed below.

Microprogram Control of an 'ACT8847 FPU Subsystem
A microprogram to control the 'ACT8847 must take into account not only the FPU
operation but also the sequencer operation, especially when the system is performing
a branch on status or handling an exception.
Several options are available for dealing with such exceptions. The 'ACT8847 can
be programmed to discard operands in invalid formats and some exceptions caused
by illegal operations. In general, though, the microprogram should be designed to handle
a range of status results or exceptions. Hardware timing considerations such as pipeline
delays in both control and data paths must be studied to minimize the difficulty of
performing branches to status exception handlers.
Later sections of the 'ACT8847 user guide present detailed examples of
microinstructions and timing waveforms, along with interpretations of status outputs
and the choices involved in handling IEEE status exceptions.

'ACT8847 Data Formats
The 'ACT8847 accepts operands as normalized IEEE floating point numbers
(ANSI/IEEE Standard 754-1985), unsigned 32-bit integers, or 2's complement integers.
Floating point operands may be either single precision (32 bits) or double precision
(64 bits).
IEEE formats for floating point operands, both single and double precision, consist of
three fields: sign, exponent, and fraction, in that order. The leftmost (most significant)
bit is the sign bit. The exponent field is 8 bits long in single-precision operands and
11 bits long in double-precision operands. The fraction field is 23 bits in single precision
and 52 bits in double precision. The value of the fraction contains a hidden bit, an
implicit leading "1", as shown below:
    1.fraction
The representation of a normalized floating point number is:
    (-1)^s * 1.f * 2^(e - bias)
where s is the sign bit, f is the fraction, e is the biased exponent, and the bias is
either 127 for single-precision operands or 1023 for double-precision operands.
The formats for single-precision and double-precision numbers are shown in Figure 11
and Figure 12, respectively. Further details of IEEE formats and exceptions are provided
in the IEEE Standard for Binary Floating Point Arithmetic, ANSI/IEEE Std 754-1985.
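
The field layout and the normalized-number formula can be checked with a short decoder. The following is a minimal sketch in Python, assuming the operand is supplied as a raw 32-bit or 64-bit integer bit pattern. The function decode_normalized is hypothetical and handles only normalized numbers; zero, denormals, infinity, and NaN are rejected.

```python
import struct


def decode_normalized(bits, double=False):
    """Split an IEEE bit pattern into (sign, exponent, fraction, value)."""
    exp_bits, frac_bits, bias = (11, 52, 1023) if double else (8, 23, 127)
    sign = bits >> (exp_bits + frac_bits)
    exponent = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    fraction = bits & ((1 << frac_bits) - 1)
    # Exponent fields of all 0's or all 1's are reserved for other formats.
    if exponent == 0 or exponent == (1 << exp_bits) - 1:
        raise ValueError("not a normalized number")
    significand = 1 + fraction / (1 << frac_bits)  # hidden bit plus fraction
    value = (-1) ** sign * significand * 2.0 ** (exponent - bias)
    return sign, exponent, fraction, value


if __name__ == "__main__":
    # Obtain the single-precision bit pattern of -6.5 from the host.
    bits = struct.unpack(">I", struct.pack(">f", -6.5))[0]
    print(hex(bits), decode_normalized(bits))
```

For -6.5 the decoder reports sign 1, biased exponent 129, and a fraction of 0.625 (binary 101 followed by zeros), matching -1.625 * 2^(129 - 127).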

 31  30        23  22                      0
| s |      e     |            f             |

s: sign of fraction
e: 8-bit exponent biased by 127
f: 23-bit fraction

Figure 11. IEEE Single-Precision Format

 63  62        52  51                      0
| s |      e     |            f             |

s: sign of fraction
e: 11-bit exponent biased by 1023
f: 52-bit fraction

Figure 12. IEEE Double-Precision Format
The 'ACT8847 also handles two other operand formats which permit operations with
very small floating point numbers. The ALU accepts denormalized floating point
numbers, that is, floating point numbers so small that they could not be normalized.
If these denormal operands are input to the multiplier, they will cause status exceptions.
Denormals can be passed through the ALU to be "wrapped," and the wrapped
operands can then be input to the multiplier.
A denormalized input has the form of a floating point number with a zero exponent,
a nonzero mantissa, and a zero in the leftmost bit of the mantissa (hidden or implicit
bit). Using single precision, a denorm is equal to:
    (-1)^s * 2^(-126) * 0.fraction
For double precision, a denorm is equal to:
    (-1)^s * 2^(-1022) * 0.fraction
A denormalized number results from decrementing the biased exponent field to zero
before normalization is complete. Since a denormalized number cannot be input to
the multiplier, it must first be converted to a wrapped number by the ALU. A wrapped
number is a number created by normalizing a denormalized number's fraction field and
subtracting from the exponent the number of shift positions (minus one) required to
do so. The exponent is encoded as a two's complement negative number. When the
mantissa of the denormal is normalized by shifting it left, the exponent field decrements
from all zeros (wraps past zero) to a negative two's complement number (except in
the case of 0.1XXX..., where the exponent is not decremented).
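
The wrapping rule can be expressed as a short calculation on the fraction field. The following is a minimal sketch in Python, assuming the denormal is given as its raw fraction field and that the wrapped exponent is stored in an exponent field of the same width. The function wrap_denormal is hypothetical and models only the arithmetic described above, not the ALU instruction that performs it.

```python
def wrap_denormal(fraction, frac_bits=23, exp_bits=8):
    """Return (exponent_field, fraction_field) of the wrapped number."""
    if fraction == 0 or fraction >> frac_bits:
        raise ValueError("fraction must be a nonzero frac_bits-wide value")
    # Left shifts needed to move the leading 1 into the hidden-bit position.
    shifts = frac_bits - fraction.bit_length() + 1
    # The leading 1 becomes the hidden bit and is masked out of the field.
    wrapped_fraction = (fraction << shifts) & ((1 << frac_bits) - 1)
    # Subtract (shifts - 1) from the zero exponent; encode as two's complement.
    exponent = -(shifts - 1)
    exponent_field = exponent & ((1 << exp_bits) - 1)
    return exponent_field, wrapped_fraction


if __name__ == "__main__":
    # Smallest single-precision denormal: fraction 000...001.
    print([hex(x) for x in wrap_denormal(0x000001)])
    # A denormal of the form 0.1XXX...: one shift, exponent not decremented.
    print([hex(x) for x in wrap_denormal(0x400000)])
    # Smallest double-precision denormal.
    print([hex(x) for x in wrap_denormal(1, frac_bits=52, exp_bits=11)])
```

For the smallest single-precision denormal, 23 shifts are required, so the exponent becomes -22, which is EA in 8-bit two's complement.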
Floating point formats handled by the 'ACT8847 are presented in Table 4.

Table 4. IEEE Floating Point Representations

TYPE OF               EXPONENT (e)          FRACTION (f)  HIDDEN  VALUE OF NUMBER REPRESENTED
OPERAND               SP (HEX)  DP (HEX)    (BINARY)      BIT     SP (DECIMAL)†                    DP (DECIMAL)†
Normalized (max)      FE        7FE         All 1's       1       (-1)^s (2^127) (2 - 2^-23)       (-1)^s (2^1023) (2 - 2^-52)
Normalized (min)      01        001         All 0's       1       (-1)^s (2^-126) (1)              (-1)^s (2^-1022) (1)
Denormalized (max)    00        000         All 1's       0       (-1)^s (2^-126) (1 - 2^-23)      (-1)^s (2^-1022) (1 - 2^-52)
Denormalized (min)    00        000         000...001     0       (-1)^s (2^-126) (2^-23)          (-1)^s (2^-1022) (2^-52)
Wrapped (max)         00        000         All 1's       1       (-1)^s (2^-127) (2 - 2^-23)      (-1)^s (2^-1023) (2 - 2^-52)
Wrapped (min)         EA        7CD         All 0's       1       (-1)^s (2^-(22 + 127)) (1)       (-1)^s (2^-(51 + 1023)) (1)
Zero                  00        000         Zero          0       (-1)^s (0.0)                     (-1)^s (0.0)
Infinity              FF        7FF         Zero          1       (-1)^s (infinity)                (-1)^s (infinity)
NaN (Not a Number)    FF        7FF         Nonzero       N/A     None                             None

† s = sign bit.

Status Outputs
Status flags are provided to signal both floating point and integer results. Integer status
is provided using AEQB for zero, NEG for sign, and OVER for overflow/carryout.
Status exceptions can result from one or more error conditions such as overflow,
underflow, operands in illegal formats, invalid operations, or rounding. Exceptions may
be grouped into two classes: input exceptions resulting from invalid operations or
denormal inputs to the multiplier, and output exceptions resulting from illegal formats,
rounding errors, or both.

SN74ACT8847 Architecture
Overview
The SN74ACT8847 is a high-speed floating point unit implemented in TI's advanced
1-µm CMOS technology. The device is fully compatible with IEEE Standard 754-1985
for addition, subtraction, multiplication, division, square root, and comparison.
The 'ACT8847 FPU also performs integer arithmetic, logical operations, and logical
shifts. Absolute value conversions, floating point to integer conversions, and integer
to floating point conversions are also available. The ALU and multiplier are both included
in the same device and can be operated in parallel to perform sums of products and
products of sums (see Figure 13).

[Figure 13 block diagram. Signals recoverable from the drawing: control inputs ENRA, SRCC, FLOWC, HALT, BYTEP, CLK, PIPES0, CLKMODE, RESET, TP1-TP0, SELMS/LS, SELST1-SELST0; supply pins VCC, GND; outputs PY3-PY0, Y31-Y0; status outputs MSERR, ED, DIVBY0, IVAL, INEX, OVER, UNDER, DENORM, DENIN, RNDCO, SRCEX, CHEX, STEX1-STEX0, NEG, INF.]

Figure 13. 'ACT8847 Detailed Block Diagram

IEEE formatted denormal numbers are directly handled by the ALU. Denormal numbers
must be wrapped by the ALU before being used in multiplication, division, or square
root operations. A fast mode in which all denormals are forced to zero is provided
for applications not requiring gradual underflow.
The 'ACT8847 input buses can be configured to operate as two 32-bit data buses
or as a single 64-bit bus, providing a number of system interface options. Registers
are provided at the inputs, outputs, and inside the ALU and multiplier to support
multilevel pipelining. These registers can be bypassed for nonpipelined operation.
A clock mode control allows the temporary input register to be clocked on the rising
edge or the falling edge of the clock to support double-precision ALU operations at
the same rate as single-precision operations. A feedback register (C register) with a
separate clock is provided for temporary internal storage of a multiplier result, ALU
result or constant.
Four multiplexers select the multiplier and ALU operands from the input registers, C
register or previous multiplier or ALU result. Results are output on the 32-bit Y bus;
a Y output multiplexer selects the most significant or least significant half of the result
if a double-precision number is being output.
To ensure data integrity, parity checking is performed on input data, and parity is
generated for output data. A master/slave comparator supports fault-tolerant system
design. Two test pin control inputs allow all I/Os and outputs to be forced high, low,
or placed in a high-impedance state to facilitate system testing.

Pipeline Controls
Six data registers in the 'ACT8847 are arranged in three levels along the data paths
through the multiplier and the ALU. Each level of registers can be enabled or disabled
independently of the other two levels by setting the appropriate PIPES2-PIPES0 inputs.
When enabled, data is latched into the register on the rising edge of the system clock
(CLK). A separate instruction pipeline register stores the instruction bits corresponding
to the operation being executed at each stage.
The levels of pipelining are shown in Figure 14. The first set of registers, the RA and
RB input registers, are controlled by PIPES0. These registers may be used as inputs
to the ALU, multiplier, or both.
The pipeline registers are the second register set. When enabled by PIPES1, these
registers latch intermediate values in the multiplier or ALU.
The results of the ALU and multiplier operations may optionally be latched into two
output registers by setting PIPES2 low. The P (product) register holds the result of
the multiplier operation; the S (sum) register holds the ALU result.
Table 5 shows the settings of the registers controlled by PIPES2-PIPES0. Operating
modes range from fully pipelined (PIPES2-PIPES0 = 000) to flowthrough
(PIPES2-PIPES0 = 111). The instruction pipeline registers are also set accordingly.

[Figure 14 block diagram. PIPES0 drives the enable (EN) inputs of the two input registers and the instruction register. PIPES1 drives the enables of the multiplier pipeline register, the ALU pipeline register, and an instruction pipeline register. PIPES2 drives the enables of the multiplier product register, the ALU sum register, and an instruction pipeline register. All registers are clocked by CLK.]

Figure 14. Pipeline Controls

Table 5. Pipeline Controls (PIPES2-PIPES0)

PIPES2  PIPES1  PIPES0  REGISTER OPERATION SELECTED
  X       X       0     Enables input registers (RA, RB)
  X       X       1     Makes input registers (RA, RB) transparent
  X       0       X     Enables pipeline registers
  X       1       X     Makes pipeline registers transparent
  0       X       X     Enables output registers (PREG, SREG, Status)
  1       X       X     Makes output registers (PREG, SREG, Status) transparent
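
The mapping in Table 5 can be sketched in a few lines of Python. This is only an illustrative model, not device code; the function name decode_pipes and the dictionary keys are assumptions introduced here.

```python
def decode_pipes(pipes: int) -> dict:
    """Decode the 3-bit PIPES2-PIPES0 value into the mode of each register level."""
    if not 0 <= pipes <= 0b111:
        raise ValueError("PIPES2-PIPES0 is a 3-bit field")

    def mode(bit: int) -> str:
        # Per Table 5: a low control bit enables the level, a high bit makes it transparent.
        return "transparent" if (pipes >> bit) & 1 else "enabled"

    # Bit 0 = PIPES0 (input registers), bit 1 = PIPES1 (pipeline), bit 2 = PIPES2 (output).
    return {"input": mode(0), "pipeline": mode(1), "output": mode(2)}


for setting in (0b000, 0b010, 0b111):
    print(f"{setting:03b}", decode_pipes(setting))
```

Setting 000 corresponds to the fully pipelined mode and 111 to flowthrough, as described in the text.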

In flowthrough mode all three levels of registers are transparent, a circumstance which
may affect some double-precision operations. Since double-precision operands require
two steps to input, at least half of the data must be clocked into the temporary register
before the remaining data is placed on the DA and DB buses.
When all registers (except the C register) are enabled, timing constraints can become
critical for many double-precision operations. In clock mode 1, the ALU can perform
a double-precision operation and output a result during every clock cycle, and both
halves of the result must be read out before the end of the next cycle. Status outputs
are valid only for the period during which the Y output data is valid.
Similarly, double-precision multiplication is affected by pipelining, clock mode, and
sequence of operations. A double-precision multiply may require two cycles to execute
and two cycles to output the result, depending on the settings of PIPES2-PIPES0.
Duration of valid outputs at the Y multiplexer depends on settings of PIPES2-PIPES0
and CLKMODE, as well as whether all operations and operands are of the same type.
For example, when a double-precision multiply is followed by a single-precision
operation, one clock cycle must intervene between the dissimilar operations. The
instruction inputs are ignored during this clock cycle.

Temporary Input Register
A temporary input register is provided to enable loading of two double-precision
numbers on two 32-bit input buses in one clock cycle. The contents of the DA bus
are loaded into the upper 32 bits of the temporary register; the contents of DB are
loaded into the lower 32 bits.
A clock mode signal (CLKMODE) determines the clock edge on which the data will
be stored in the temporary register. When CLKMODE is low, data is loaded on the
rising edge of the clock. With CLKMODE set high, the temporary register loads on
a falling edge and the RA and RB registers can then be loaded on the next rising edge.
The temporary register loads during every clock cycle.

7-64

RA and RB Input Registers
Two 64-bit registers, RA and RB, are provided to hold input data for the multiplier
and ALU. Data is taken from the DA bus, DB bus and the temporary input register.
The registers are loaded on the rising edge of clock CLK if the enables ENRA and ENRB
are set high. PIPES0 must be low.
Data input combinations to the 'ACT8847 vary depending on the precision of the
operands and whether they are being input as A or B operands. Loading of external
data operands is controlled by the settings of CLKMODE and CONFIG1-CONFIG0,
which determine the clock timing for loading and the registers that are used (see Figure
15).

Configuration Controls
Three input registers are provided to handle input of data operands, either single
precision or double precision. The RA, RB, and temporary registers are each 64 bits
wide. The temporary register is (ordinarily) used only during input of double-precision
operands.
Double-precision operands are loaded by using the temporary register to store half
of the operands prior to inputting the other half of the operands on the DA and DB
buses. As shown in Table 6, four configuration modes for selecting input sources are
available for loading data operands into the RA and RB registers.
Figure 15. Input Register Control
[Block diagram. Labeled elements: the DA and DB buses; the temporary register (MSH
and LSH); multiplexers controlled by CONFIG1 and CONFIG0; the RA and RB input
registers (MSH and LSH), enabled by ENRA and ENRB.]

7-65

Table 6. Double-Precision Input Data Configuration Modes

                    LOADING SEQUENCE
                    DATA LOADED INTO TEMP REGISTER      DATA LOADED INTO RA/RB
                    ON FIRST CLOCK AND RA/RB            REGISTERS ON SECOND CLOCK
                    REGISTERS ON SECOND CLOCK†
CONFIG1  CONFIG0    DA               DB                 DA               DB
   0        0       B operand (MSH)  B operand (LSH)    A operand (MSH)  A operand (LSH)
   0        1       A operand (LSH)  B operand (LSH)    A operand (MSH)  B operand (MSH)
   1        0       A operand (MSH)  B operand (MSH)    A operand (LSH)  B operand (LSH)
   1        1       A operand (MSH)  A operand (LSH)    B operand (MSH)  B operand (LSH)

† On the first active clock edge (see Clock Mode Settings), data in this column is loaded into the temporary
register. On the next rising edge, operands in the temporary register and the DA/DB buses are loaded into
the RA and RB registers.
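
As an illustration of Table 6, the following Python sketch splits two 64-bit operands into the 32-bit words that would appear on DA and DB over the two clocks, then reassembles RA and RB from them. It models the data routing only, not timing; the names TABLE_6, half, bus_sequence, and load_registers are assumptions made for this sketch.

```python
MASK32 = 0xFFFFFFFF

# Table 6 as data: for each CONFIG1-CONFIG0 value, the (operand, half) carried by
# (DA, DB) on the first clock (into the temporary register) and on the second clock.
TABLE_6 = {
    0b00: ((("B", "MSH"), ("B", "LSH")), (("A", "MSH"), ("A", "LSH"))),
    0b01: ((("A", "LSH"), ("B", "LSH")), (("A", "MSH"), ("B", "MSH"))),
    0b10: ((("A", "MSH"), ("B", "MSH")), (("A", "LSH"), ("B", "LSH"))),
    0b11: ((("A", "MSH"), ("A", "LSH")), (("B", "MSH"), ("B", "LSH"))),
}


def half(value, which):
    """Return the most or least significant 32 bits of a 64-bit operand."""
    return (value >> 32) & MASK32 if which == "MSH" else value & MASK32


def bus_sequence(config, a, b):
    """Return [(DA, DB) on first clock, (DA, DB) on second clock]."""
    operands = {"A": a, "B": b}
    return [tuple(half(operands[name], h) for name, h in clock)
            for clock in TABLE_6[config]]


def load_registers(config, sequence):
    """Reassemble the 64-bit RA and RB contents from the two bus transfers."""
    regs = {"A": 0, "B": 0}
    for clock, buses in zip(TABLE_6[config], sequence):
        for (name, h), word in zip(clock, buses):
            regs[name] |= (word << 32) if h == "MSH" else word
    return regs["A"], regs["B"]


a, b = 0x4000000000000001, 0xC008000000000002
for config in TABLE_6:
    words = bus_sequence(config, a, b)
    print(f"{config:02b}", [(hex(da), hex(db)) for da, db in words])
```

Whatever CONFIG setting is used, the same RA and RB contents result; only the order in which the halves must be presented on the buses changes.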

When single-precision or integer operands are loaded, the ordinary setting of
CONFIG1-CONFIG0 is 01, as shown in Table 7. This setting loads each 32-bit operand
in the most significant half (MSH) of its respective register. Single-precision operands
are loaded into the MSHs and adjusted to double precision because the data paths
internal to the device are all double precision. It is also possible to load single-precision
operands with other CONFIG settings, but two clock edges are required to load both
the A and B operands on the DA bus. The operands are input as the MSHs of the A
and B operands (see Table 6). For example, to load single-precision operands using
CONFIG1-CONFIG0 = 10, the A and B operands are input one active clock edge before
the instruction.
Table 7. Single-Precision Input Data Configuration Mode

                    DATA LOADED INTO RA/RB
                    REGISTERS ON FIRST CLOCK
CONFIG1  CONFIG0    DA           DB
   0        1       A operand    B operand

NOTE: This mode is ordinarily used for single-precision operations.

Clock Mode Settings
Timing of double-precision data inputs is determined by the clock mode setting, which
allows the temporary register to be loaded on either the rising edge (CLKMODE = 0)
or the falling edge of the clock (CLKMODE = 1). Since the temporary register is not
used when single-precision operands are input, clock modes 0 and 1 are functionally
equivalent for single-precision operations using CONFIG1-CONFIG0 = 01.

7-66

The setting of CLKMODE can be used to speed up the loading of double-precision
operands. When the CLKMODE input is set high, data on the DA and DB buses are
loaded on the falling edge of the clock into the MSH and LSH, respectively, of the
temporary register. On the next rising edge, contents of the DA bus, DB bus, and
temporary register are loaded into the RA and RB registers, and execution of the current
instruction begins. The setting of CONFIG1-CONFIG0 determines the exact pattern
in which operands are loaded, whether as MSH or LSH in RA or RB.
Double-precision operation in clock mode 0 is similar except that the temporary register
loads only on a rising edge. For this reason, the RA and RB registers do not load until
the next rising edge, when all operands are available and execution can begin.
A considerable advantage in speed can be realized by performing double-precision
operations with CLKMODE set high. In this clock mode, both double-precision operands
can be loaded on successive clock edges, one falling and one rising. If the instruction
is an ALU operation, then the operation can be executed in the time from one rising
edge of the clock to the next rising edge. Both halves of a double-precision ALU result
must be read out on the Y bus within one clock cycle when the 'ACT8847 is operated
in clock mode 1.
The discussion above assumes that the system is able to furnish two sets of operands
in one cycle (one set on the falling edge of the clock and the other set on the next
rising edge). This assumption may not be valid, since the system is required to "double
pump" the input data buses.
Even for a system that is not able to double pump the input data buses, using clock
mode 1 can reduce microcode size substantially resulting in increased system
throughput. To illustrate, take the case of an operation where the operand(s) are
furnished by one or more of the feedback registers (refer to Table 8). Since the input
data buses are not being used to furnish the operands, the data on the buses at the
time of the instruction is unimportant. By setting CLKMODE high, the instruction begins
after the first cycle, resulting in a savings of one cycle.

Table 8a. Double-Precision CREG + PREG Using CLKMODE = 0, PIPES2-0 = 010

CYCLE  CLKMODE  DA BUS  DB BUS  TEMP REG  INSTR BUS  RA REG  RB REG  S REG
  1       0       X       X        X        C + P      X       X       X
  2       0       X       X        X        C + P      X       X       X
  3       X       X       X        X          X        X       X     C + P

Table 8b. Double-Precision CREG + PREG Using CLKMODE = 1, PIPES2-0 = 010

CYCLE  CLKMODE  DA BUS  DB BUS  TEMP REG  INSTR BUS  RA REG  RB REG  S REG
  1       1       X       X        X        C + P      X       X       X
  2       X       X       X        X          X        X       X     C + P

7-67

Going one step further, take the case of an operation where only one operand needs
to be furnished by the input data buses (refer to Table 9). To take advantage of clock
mode 1, set the CONFIG lines so that the external operand comes directly from the
DA and DB buses, as opposed to coming from the temporary register. Since the temporary
register is not used to provide an operand, the data latched into it is inconsequential.
It naturally follows then that the clock edge used to load the temporary register is
unimportant. So by setting CLKMODE high, a double-precision instruction will begin
after one cycle, instead of two cycles.
Table 9a. Double-Precision PREG + RB Using CLKMODE = 0, PIPES2-0 = 010

CYCLE  CLKMODE  DA BUS  DB BUS  TEMP REG  INSTR BUS  RA REG  RB REG  S REG
  1       0       X       X        X       P + RB      X       X       X
  2       0     RB(M)   RB(L)      RB      P + RB      X       RB      X
  3       X       X       X        X         X         X       X     P + RB

Table 9b. Double-Precision PREG + RB Using CLKMODE = 1, PIPES2-0 = 010

CYCLE  CLKMODE  DA BUS  DB BUS  TEMP REG  INSTR BUS  RA REG  RB REG  S REG
  1       1     RB(M)   RB(L)      RB      P + RB      X       RB      X
  2       X       X       X        X         X         X       X     P + RB

Operand Selection
Four multiplexers select the multiplier and ALU operands from the RA and RB registers,
the previous multiplier or ALU result, or the C register (see Figure 16). The multiplexers
are controlled by input signals SELOP7-SELOPO as shown in Tables 10 and 11. For
division and square root operations, operands must be sourced from the input registers
RA and RB.
Table 10. Multiplier Input Selection

      A1 (MUX1) INPUT                         B1 (MUX2) INPUT
SELOP7  SELOP6  OPERAND SOURCE†         SELOP5  SELOP4  OPERAND SOURCE†
  0       0     Reserved                  0       0     Reserved
  0       1     C register                0       1     C register
  1       0     ALU feedback              1       0     Multiplier feedback
  1       1     RA input register         1       1     RB input register

† For division or square root operations, only RA and RB registers can be selected as sources.


7-68

Figure 16. Operand Selection Multiplexer
[Block diagram. Labeled elements: four operand multiplexers controlled by SELOP7-6,
SELOP5-4, SELOP3-2, and SELOP1-0; inputs from the RA and RB registers (enabled by
ENRA and ENRB), the C register, the product register, and the sum register; 64-bit
operand paths to the multiplier and ALU.]

Table 11. ALU Input Selection

      A2 (MUX3) INPUT                         B2 (MUX4) INPUT
SELOP3  SELOP2  OPERAND SOURCE†         SELOP1  SELOP0  OPERAND SOURCE†
  0       0     Reserved                  0       0     Reserved
  0       1     C register                0       1     C register
  1       0     Multiplier feedback       1       0     ALU feedback
  1       1     RA input register         1       1     RB input register

† For division or square root operations, only RA and RB registers can be selected as sources.

As shown in Tables 10 and 11, data operands can be selected from five possible
sources, including external inputs from the RA and RB registers, feedback from the
P (Product) and S (Sum) registers, and a stored value in the C register. Contents of
the C register may be selected as either the A or the B operand in the ALU, the multiplier,
or both. When an external input is selected, the RA input always becomes the A
operand, and the RB input is the B operand.
Feedback from the ALU can be selected as the A operand to the multiplier or as the
B operand to the ALU. Similarly, multiplier feedback may be used as the A operand
to the ALU or the B operand to the multiplier. During division or square root operations,
operands may not be selected except from the RA and RB input registers
(SELOP7-SELOP0 = 11111111).
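
The operand selection rules of Tables 10 and 11 can be written as a small decoder. The Python sketch below is a reading aid only; the names MUX_FIELDS, SOURCES, and decode_selop, and the choice to raise an error on reserved codes, are assumptions made here rather than documented device behavior.

```python
# Bit position of the low bit of each 2-bit select field within SELOP7-SELOP0.
MUX_FIELDS = {"A1": 6, "B1": 4, "A2": 2, "B2": 0}

# Select codes from Tables 10 and 11 (code 00 is reserved for every multiplexer).
SOURCES = {
    "A1": {0b01: "C register", 0b10: "ALU feedback", 0b11: "RA input register"},
    "B1": {0b01: "C register", 0b10: "Multiplier feedback", 0b11: "RB input register"},
    "A2": {0b01: "C register", 0b10: "Multiplier feedback", 0b11: "RA input register"},
    "B2": {0b01: "C register", 0b10: "ALU feedback", 0b11: "RB input register"},
}


def decode_selop(selop, divide_or_sqrt=False):
    """Return the operand source chosen for each of the four multiplexers."""
    if not 0 <= selop <= 0xFF:
        raise ValueError("SELOP7-SELOP0 is an 8-bit field")
    if divide_or_sqrt and selop != 0b11111111:
        raise ValueError("division and square root take operands only from RA and RB")
    selected = {}
    for mux, shift in MUX_FIELDS.items():
        code = (selop >> shift) & 0b11
        if code == 0b00:
            raise ValueError(f"{mux}: select code 00 is reserved")
        selected[mux] = SOURCES[mux][code]
    return selected


print(decode_selop(0b11111111))
print(decode_selop(0b10011110))
```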
Selection of operands also interacts with the selected operation in the ALU or the
multiplier. ALU operations with one operand are performed only on the A operand (with
the exception of the Pass B operation). Also, depending on the instruction selected,
the B operand may optionally be forced to zero in the ALU or to one in the multiplier.
If an operation uses one or more feedback registers as operands, the unused bus(es)
can be used to preload operand(s) for a later operation. The data is loaded into the
RA or RB input register(s); when the data is needed as an operand, the SELOP pins
are set to select the RA or RB register(s), but the register input enables (ENRA, ENRB)
are not asserted. The one restriction on preloading data is that the operation being
performed during the preload MUST use the same data type (single-precision,
double-precision, or integer) as the data being loaded. Operands cannot be preloaded within
square root or divide instructions.

C Register
The 64-bit constant (C) register is available for storing the result of an ALU or multiplier
operation before feedback to the multiplier or ALU. The C register has a separate clock
input (CLKC), input source select (SRCC), and write enable (ENRC, active low).
The C register loads from the P or the S register output, depending on the setting of
SRCC. SRCC = 1 selects the multiplier as the input source; SRCC = 0 selects the ALU.
The SRCC input is not registered with the instruction inputs.
Depending on the operation selected and the settings of PIPES2-PIPES0, an offset
of one or more cycles may be necessary to load the desired result into the C register.
The register only loads on a rising edge of CLKC when ENRC is low (see Figure 17).
7-70

Figure 17. C Register Timing
[Timing diagram of CLKC relative to CLK, with the interval td marked.]
† td is the clock cycle period.

A separate control (FLOWC) is available to bypass the C register when feeding an
operand back on the C register feedback bus. When FLOWC is high, the output of
the P or S register (as selected by SRCC) bypasses the C register without affecting
the C register's contents. Direct P or S feedback is unaffected by the FLOWC setting.
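
A behavioral sketch of the C register controls may help when planning microcode. The Python model below covers only SRCC, ENRC, and FLOWC as described above; the class name CRegister and its method names are hypothetical, and timing offsets from PIPES2-PIPES0 are not modeled.

```python
class CRegister:
    """Minimal model of the C register load and feedback paths."""

    def __init__(self):
        self.value = 0

    def clock(self, p_reg, s_reg, srcc, enrc):
        """Rising edge of CLKC: load only when ENRC (active low) is low."""
        if enrc == 0:
            # SRCC = 1 selects the multiplier (P) result, SRCC = 0 the ALU (S) result.
            self.value = p_reg if srcc == 1 else s_reg

    def feedback(self, p_reg, s_reg, srcc, flowc):
        """Value seen on the C register feedback bus."""
        if flowc == 1:
            # FLOWC high bypasses the register without changing its contents.
            return p_reg if srcc == 1 else s_reg
        return self.value


c = CRegister()
c.clock(p_reg=5, s_reg=9, srcc=1, enrc=0)           # loads P
print(c.value)
print(c.feedback(p_reg=7, s_reg=8, srcc=0, flowc=1))  # bypass: S appears, C unchanged
print(c.value)
```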

Pipelined ALU
The pipelined ALU contains a circuit for floating point addition and/or subtraction of
aligned operands, a pipeline register, an exponent adjuster and a normalizer/rounder
as shown in Figure 18. An exception circuit is provided to detect denormal inputs;
these can be flushed to zero if the FAST input is set high. If the FAST input is low,
the ALU accepts a denormal as input. A denorm exception flag (DENORM) goes high
when the ALU output is a denormal.
Integer processing in the ALU includes both arithmetic and logical operations on either
two's complement numbers or unsigned integers. The ALU performs addition,
subtraction, comparison, logical shifts, logical AND, logical OR, and logical XOR.
The ALU may be operated independently or in parallel with the multiplier. Possible ALU
functions during independent operation are given in Table 12.

Figure 18. Functional Diagram for ALU
[Block diagram. Labeled blocks: exponent subtracter, prealignment, integer ALU,
normalizer, rounder.]

7-72

Table 12. Independent ALU Operations

SINGLE OPERAND                TWO OPERANDS
Pass                          Add
Move                          Subtract
Format Conversions            Compare
Wrap Denormalized Number      AND
Unwrap                        OR
Shift                         XOR

Pipelined Multiplier
The pipelined multiplier (see Figure 19) performs a basic multiply function, division
and square root. The operands can be single-precision or double-precision floating point
numbers and can be converted to absolute values before multiplication takes place.
Integer operands may also be used. Independent multiplier operations are summarized
in Table 13.
If the operands to the multiplier are double precision or mixed precision (i.e., one single
precision and one double precision), then one extra clock cycle is required to get the
product through the multiplier pipeline. This means that for PIPES1 = 1, one clock
cycle is required for the multiplier pipeline; for PIPES1 = 0, two clock cycles are required
for the multiplier pipeline.

Figure 19. Functional Diagram for Multiplier
[Block diagram. Labeled blocks: recoder, multiplier/divider, converter, normalizer.]

7-73

Table 13. Independent Multiplier Operations

SINGLE OPERAND                TWO OPERANDS
Square Root                   Multiply
                              Divide

An exception circuit is provided to detect denormalized inputs; these are indicated
by a high on the DENIN signal. Denormalized inputs must be wrapped by the ALU before
multiplication, division, or square root. If results are wrapped (signaled by a high on
the DENORM status pin), they must be unwrapped by the ALU.
The multiplier and ALU can be operated simultaneously by setting the I10 instruction
input high. Division and square root are performed as independent multiplier operations,
even though both multiplier and ALU are active during divide and SQRT operations.

Data Output Controls
Selection and duration of results from the Y output multiplexer may be affected by
several factors, including the operation selected, precision of the operands, registers
enabled, and the next operation to be performed. The data output controls are not
registered with the data and instruction inputs. When the device is microprogrammed,
the effects of pipelining and sequencing of operations should be taken into account.
Two particular conditions need to be considered. Depending on which registers are
enabled, an offset of one or more cycles must be allowed before a valid result is available
at the Y output multiplexer. Also, certain sequences of operations may require both
halves of a double-precision result to be read out within a single clock cycle. This is
done by toggling the SELMS/LS signal in the middle of the clock period.
When a single-precision result is output, the SELMS/LS signal has no effect. The
SELMS/LS signal is set low only to read out the LSH of a double-precision result (see
Figure 20). To read out a result on the Y bus, the output enable OEY must be low.
OEY is an asynchronous signal.
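
The double-precision readout described above can be illustrated in Python: a 64-bit result is read as two 32-bit halves by changing SELMS/LS, and the halves are rejoined into an IEEE double. The function names y_bus and read_double are assumptions for this sketch, and a disabled bus is represented simply as None.

```python
import struct

MASK32 = 0xFFFFFFFF


def y_bus(result64, selms_ls, oey):
    """Return the 32-bit word on the Y bus, or None when OEY is high (bus disabled)."""
    if oey:
        return None
    # SELMS/LS high selects the MSH; low selects the LSH of a double-precision result.
    return (result64 >> 32) & MASK32 if selms_ls else result64 & MASK32


def read_double(result64):
    """Read both halves, as when SELMS/LS is toggled, and rebuild the double."""
    msh = y_bus(result64, selms_ls=1, oey=0)
    lsh = y_bus(result64, selms_ls=0, oey=0)
    return struct.unpack(">d", struct.pack(">II", msh, lsh))[0]


bits = struct.unpack(">Q", struct.pack(">d", 1.5))[0]
print(hex(y_bus(bits, 1, 0)), hex(y_bus(bits, 0, 0)), read_double(bits))
```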

7-74

Figure 20. Y Output Control
[Block diagram. Labeled elements: the 64-bit product register and sum register
outputs; selection by I6 from the instruction register; SELMS/LS; the Y bus.]

Parity Checker/Generator
When BYTEP is high, internal even parity is generated for each byte of input data at
the DA and DB ports and compared to the PA and PB parity inputs, respectively. With
even parity, the parity bit for a byte is high when an odd number of bits is set high
in that data byte. A parity check can also be performed on the entire input data word
by setting BYTEP low. In this mode, PA0 is the parity input for DA data and PB0 is
the parity input for DB data.
Even parity is generated for the Y multiplexer output, either for each byte or for each
word of output, depending on the setting of BYTEP. When BYTEP is high, the parity
generator computes four parity bits, one for each byte of the Y multiplexer output.
Parity bits are output on the PY3-PY0 pins; PY0 represents parity for the least significant
byte. A single parity bit can also be generated for the entire output data word by setting
BYTEP low. In this mode, PY0 is the parity output.
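
Even parity generation and checking in the two BYTEP modes can be sketched as follows. This Python model is illustrative only; the function names are hypothetical, and the assumption that parity bit i covers byte i (with index 0 the least significant byte) extends the PY0 statement above to the other bytes.

```python
def even_parity_bit(value: int) -> int:
    """Parity bit that is high when `value` contains an odd number of ones."""
    return bin(value).count("1") & 1


def parity_bits(word32: int, bytep: int) -> list:
    """Parity bits for a 32-bit word, index 0 first (PY0, PA0, or PB0)."""
    word32 &= 0xFFFFFFFF
    if bytep:
        # Byte mode: one parity bit per byte, least significant byte first.
        return [even_parity_bit((word32 >> (8 * i)) & 0xFF) for i in range(4)]
    # Word mode: a single parity bit for the whole word.
    return [even_parity_bit(word32)]


def parity_error(word32: int, parity_inputs, bytep: int) -> bool:
    """True when the supplied parity inputs disagree with the generated parity."""
    return parity_bits(word32, bytep) != list(parity_inputs)


print(parity_bits(0x01000003, bytep=1))
print(parity_bits(0x01000003, bytep=0))
```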


Master/Slave Comparator

A master/slave comparator is provided to compare data bytes from the Y output
multiplexer and the status outputs with data bytes on the external Y and status ports
when OEY, OES and OEC are high. If the data bytes are not equal, a high signal is
generated on the master/slave error output pin (MSERR).
Figure 21 shows an example master/slave circuit. Two 'ACT8847 slave devices verify
the data/status integrity of the 'ACT8847 master.

7-75

Figure 21. Example of Master/Slave Operation
[Block diagram. Labeled elements: arbitration/control logic, Y OUT, STATUS OUT, OUTPUT.]

Status and Exception Generation
A status and exception generator produces several output signals to indicate invalid
operations as well as overflow, underflow, non-numerical and inexact results, in
conformance with IEEE Standard 754-1985. If output registers are enabled
(PIPES2 = 0), status and exception results are latched in the status register on the rising
edge of the clock. Status results are valid at the same time as associated data results
are valid.
Duration and availability of status results are affected by the same timing constraints
that apply to data results on the Y bus. Status outputs are enabled by two signals,
OEC for comparison status and OES for other status and exception outputs. Status
outputs are summarized in Tables 14 and 15.
Table 14. Comparison Status Outputs

SIGNAL    RESULT OF COMPARISON (ACTIVE HIGH)
AEQB      The A and B operands are equal. A high signal on the AEQB output indicates a
          zero result from the selected source except during a compare operation in the ALU.
          During integer operations, indicates zero status output.
AGTB      The A operand is greater than the B operand.
UNORD     The two inputs of a comparison operation are unordered, i.e., one or both of the
          inputs is a NaN.

During a compare operation in the ALU, the AEQB output goes high when the A and
B operands are equal. When any operation other than a compare is performed, either
by the ALU or the multiplier, the AEQB signal is used as a zero detect.
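
For a compare operation, the three outputs of Table 14 behave like the following Python sketch. The function name compare_status is an assumption, as is the choice to drive AEQB and AGTB low when the operands are unordered; the text above does not state their levels in that case.

```python
import math


def compare_status(a: float, b: float) -> dict:
    """Model AEQB, AGTB, and UNORD for an ALU compare of A and B."""
    unordered = math.isnan(a) or math.isnan(b)
    return {
        "AEQB": int(not unordered and a == b),
        "AGTB": int(not unordered and a > b),
        "UNORD": int(unordered),
    }


print(compare_status(2.0, 1.0))
print(compare_status(float("nan"), 1.0))
```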

7-77

Table 15. Status Outputs

SIGNAL    STATUS RESULT
CHEX      If I6 is low, indicates the multiplier is the source of an exception during a chained
          function. If I6 is high, indicates the ALU is the source of an exception during a
          chained function.
DENIN     Input to the multiplier is a denorm. When DENIN goes high, the STEX pins indicate
          which port had the denormal input.
DENORM    The multiplier output is a wrapped number or the ALU output is a denorm. In the
          FAST mode, this condition causes the result to go to zero. It also indicates an
          invalid integer operation, i.e., PASS (-A) with unsigned integer operand.
DIVBY0    An invalid operation involving a zero divisor has been detected by the multiplier.
ED        Exception detect status signal representing logical OR of all enabled exceptions
          in the exception disable register.
INEX      The result of an operation is not exact.
INF       The output is the IEEE representation of infinity.
IVAL      A NaN has been input to the multiplier or the ALU, or an invalid operation
          [(0 * ∞) or (+∞ - ∞) or (-∞ + ∞)] has been requested. This signal also goes
          high if an operation involves the square root of a negative number. When IVAL
          goes high, the STEX pins indicate which port had the NaN.
NEG       Output value has negative sign.
OVER      The result is greater than the largest allowable value for the specified format.
RNDCO     The mantissa of a number has been increased in magnitude by rounding. If the
          number generated was wrapped, then the unwrap round instruction must be used
          to properly unwrap the wrapped number (see Table 8).
SRCEX     The status was generated by the multiplier. (When SRCEX is low, the status was
          generated by the ALU.)
STEX0     A NaN or a denorm has been input on the B port.
STEX1     A NaN or a denorm has been input on the A port.
UNDER     The result is inexact and less than the minimum allowable value for the specified
          format. In the FAST mode, this condition causes the result to go to zero.

In chained mode, results to be output are selected based on the state of the I6 (source
output) pin (if I6 is low, ALU status will be selected; if I6 is high, multiplier status
will be selected). If the nonselected output source generates an exception, CHEX is
set high. Status of the nonselected output source can be forced using the SELST pins,
as shown in Table 16.


7-78

Figure 22. Status Output Control
[Block diagram. Labeled elements: the multiplier status register and ALU status
register (18 bits each); selection by SELST1-0 and I6; OES enabling the status output;
OEC enabling the comparison status output.]

Table 16. Status Output Selection (Chained Mode)

SELST1-SELST0  STATUS SELECTED
     00        Logical OR of ALU and multiplier exceptions (bit by bit)
     01        Selects multiplier status
     10        Selects ALU status
     11        Normal operation (selection based on result source specified by I6 input)
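
The selection rule of Table 16 can be expressed compactly. In the Python sketch below, the status words are treated as plain integers whose bit layout is not specified here; select_status is a hypothetical name used for illustration.

```python
def select_status(selst: int, i6: int, alu_status: int, mult_status: int) -> int:
    """Return the status word selected in chained mode per Table 16."""
    if selst == 0b00:
        return alu_status | mult_status   # bit-by-bit OR of both sources
    if selst == 0b01:
        return mult_status
    if selst == 0b10:
        return alu_status
    if selst == 0b11:
        # Normal operation: I6 low selects ALU status, I6 high multiplier status.
        return mult_status if i6 else alu_status
    raise ValueError("SELST1-SELST0 is a 2-bit field")


alu, mult = 0b0101, 0b0011
for selst in range(4):
    print(f"{selst:02b}", f"{select_status(selst, 0, alu, mult):04b}")
```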

An exception detect mask register is available to mask out selected exceptions from
the multiplier, ALU, or both. Multiply status is disabled during an independent ALU
instruction, and ALU status is disabled during multiplier instructions. During chained
operation, both status outputs are enabled.
When the exception mask register has been loaded with a mask, the mask is applied
to the contents of the status register to disable unnecessary exceptions. Status results
for enabled exceptions are then ORed together and, if true, the exception detect (ED)
status output pin is set high (see Figure 23). Individual status outputs remain active
and can be read independently from mask register operations.
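
The masking step can be sketched as a bitwise operation. The Python below assumes that a 1 in the mask disables the corresponding exception and that status and mask share the same bit positions; both points are assumptions for illustration, since the bit layout is not given in this section.

```python
def exception_detect(status: int, disable_mask: int) -> int:
    """Return the ED level: high if any exception not disabled by the mask is set."""
    enabled_exceptions = status & ~disable_mask
    return int(enabled_exceptions != 0)


print(exception_detect(0b0101, 0b0101))  # every raised exception is masked
print(exception_detect(0b0101, 0b0001))  # one raised exception remains enabled
```

The individual status bits are left untouched by this operation, which matches the statement that they can still be read independently.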

7-80

Figure 23. Exception Detect Mask Logic
[Block diagram. Labeled elements: exception detect mask; multiplier and ALU status
inputs; selection by SELST1-0 and I6; OES; ED output.]

Microprogramming the 'ACT8847
Because the 'ACT8847 is microprogrammable, it can be configured to operate on either
integer or single- or double-precision data operands, and the operations of the registers,
ALU, and multiplier can be programmed to support a variety of applications. The
following sections present not only control settings but the timings of the specific
operations required to execute the sample instructions.

Control Inputs
Control inputs to the 'ACT8847 are summarized in Table 17 below. Several of the
inputs have already been discussed; refer to the page listed in the table for detailed
information.
The remaining inputs are discussed in the following sections. All control signals and
their associated tables are also listed in the 'ACT8847 Reference Guide to provide
a complete, easy-to-access reference for the programmer already familiar with
'ACT8847 operation.

7-82

Table 17. Control Inputs

SIGNAL           HIGH                                 LOW                                  PAGE NO.
BYTEP            Selects byte parity generation       Selects single bit parity            7-75
                 and test                             generation and test
CLK              Clocks all registers (except C)      No effect                            7-62
                 on rising edge
CLKC             Clocks C register on rising edge     No effect                            7-70
CLKMODE          Enables temporary input register     Enables temporary input register     7-66
                 load on falling clock edge           load on rising clock edge
CONFIG1-CONFIG0  See Table 6 (RA and RB register      See Table 6 (RA and RB register      7-65
                 data source selects)                 data source selects)
ENRC             No effect                            Enables C register load when         7-70
                                                      CLKC goes high
ENRA             If register is not in flowthrough,   If register is not in flowthrough,   7-65
                 enables clocking of RA register      holds contents of RA register
ENRB             If register is not in flowthrough,   If register is not in flowthrough,   7-65
                 enables clocking of RB register      holds contents of RB register
FAST             Places device in FAST mode           Places device in IEEE mode           7-84
FLOWC            Causes output value to bypass C      No effect                            7-72
                 register and appear on C register
                 output bus
HALT             No effect                            Stalls device operation but does     7-85
                                                      not affect registers, internal
                                                      states, or status. C register
                                                      loading is not disabled
OEC              Disables compare pins                Enables compare pins                 7-77
OES              Disables status outputs              Enables status outputs               7-77
OEY              Disables Y bus                       Enables Y bus                        7-74
PIPES2-PIPES0    See Table 5 (Pipeline Mode Control)  See Table 5 (Pipeline Mode Control)  7-62
RESET            No effect                            Clears internal states, status,      7-86
                                                      internal pipeline registers, and
                                                      exception disable register. Does
                                                      not affect other data registers.
RND1-RND0        See Table 18 (Rounding Mode          See Table 18 (Rounding Mode          7-84
                 Control)                             Control)
SELOP7-SELOP0    See Tables 10 and 11 (Multiplier/    See Tables 10 and 11 (Multiplier/    7-68
                 ALU operand selection)               ALU operand selection)
SELMS/LS         Selects MSH of 64-bit result for     Selects LSH of 64-bit result for     7-74
                 output on the Y bus (no effect on    output on the Y bus (no effect on
                 single-precision operands)           single-precision operands)
SELST1-SELST0    See Table 16 (Status Output          See Table 16 (Status Output          7-78
                 Selection)                           Selection)
SRCC             Selects multiplier result for        Selects ALU result for input to      7-70
                 input to C register                  C register
TP1-TP0          See Table 22 (Test Pin Control       See Table 22 (Test Pin Control       7-86
                 Inputs)                              Inputs)

7-83

Rounding Modes
The 'ACT8847 supports the four IEEE standard rounding modes: round to nearest,
round towards zero (truncate), round towards infinity (round up), and round towards
minus infinity (round down). The rounding function is selected by control pins RND1
and RND0, as shown in Table 18.
Table 18. Rounding Modes

RND1  RND0  ROUNDING MODE SELECTED
 0     0    Round towards nearest
 0     1    Round towards zero (truncate)
 1     0    Round towards infinity (round up)
 1     1    Round towards negative infinity (round down)

Rounding mode should be selected to minimize procedural errors which may otherwise
accumulate and affect the accuracy of results. Rounding to nearest introduces a
procedural error not exceeding half of the least significant bit for each rounding
operation. Since rounding to nearest may involve rounding either upward or downward
in successive steps, rounding errors tend to cancel each other.
In contrast, directed rounding modes may introduce errors approaching one bit for
each rounding operation. Since successive rounding operations in a procedure may
all be similarly directed, each introducing up to a one-bit error, rounding errors may
accumulate rapidly, especially in single-precision operations.
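
The difference between round to nearest and the directed modes can be seen in a small exact-arithmetic experiment. The Python sketch below rounds rational values to whole numbers, with the integer grid standing in for one least significant bit; it illustrates the four modes of Table 18 and is not a model of the device's mantissa rounding. The names round_value and total_error are assumptions.

```python
import math
from fractions import Fraction


def round_value(x: Fraction, rnd: int) -> int:
    """Round x to an integer using the RND1-RND0 encoding of Table 18."""
    if rnd == 0b00:
        return round(x)        # to nearest (ties go to the even neighbor)
    if rnd == 0b01:
        return math.trunc(x)   # towards zero
    if rnd == 0b10:
        return math.ceil(x)    # towards infinity
    if rnd == 0b11:
        return math.floor(x)   # towards negative infinity
    raise ValueError("RND1-RND0 is a 2-bit field")


def total_error(values, rnd: int) -> Fraction:
    """Accumulated rounding error over a sequence of values."""
    return sum((round_value(v, rnd) - v for v in values), Fraction(0))


values = [Fraction(13, 10), Fraction(27, 10), Fraction(31, 10), Fraction(46, 10)]
for rnd in range(4):
    print(f"{rnd:02b}", total_error(values, rnd))
```

For these four values the errors under round to nearest partly cancel, while under a directed mode they all have the same sign and add up.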

FAST and IEEE Modes
The device can be programmed to operate in FAST mode by asserting the FAST pin.
In the FAST mode, all denormalized inputs and outputs are forced to zero.

Placing a zero on the FAST pin causes the chip to operate in IEEE mode. In this mode,
the ALU can operate on denormalized inputs and return denormals. If a denorm is input
to the multiplier, the DENIN flag will be asserted, and the result will be invalid. Denormal
numbers must be wrapped before being input to the multiplier. If the multiplier result
underflows, a wrapped number will be output.

Handling of Denormalized Numbers (FAST)
The FAST input selects the mode for handling denormalized inputs and outputs. When
the FAST input is set low, the ALU accepts denormalized inputs but the multiplier
generates an exception when a denormal is input. When FAST is set high, the DENIN
status exception is disabled and all denormalized numbers, both inputs and results,
are forced to zero.

A denormalized input has the form of a floating point number with a zero exponent,
a nonzero mantissa, and a zero in the leftmost bit of the mantissa (hidden or implicit
bit). A denormalized number results from decrementing the biased exponent field to
7-84

zero before normalization is complete. Since a denormalized number cannot be input
to the multiplier, it must first be converted to a wrapped number by the ALU. When
the mantissa of the denormal is normalized by shifting it left, the exponent field
decrements from all zeros (wraps past zero) to a negative two's complement number
(except in the case of 0.1XXX...), where the exponent is not decremented.
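
The arithmetic of wrapping can be sketched for a single-precision denormal. The Python below assumes the usual IEEE 754 convention that a denormal has an effective biased exponent of 1 and a bias of 127; it shows the normalizing left shift and the resulting exponent, not the device's exact bit-level encoding of a wrapped number. The function names are hypothetical.

```python
from fractions import Fraction


def wrap_single_denormal(fraction23: int):
    """Normalize a single-precision denormal given its 23-bit fraction field.

    Returns (exponent, mantissa): the wrapped biased exponent (zero or negative)
    and the 24-bit mantissa with the hidden bit set.
    """
    if not 0 < fraction23 < (1 << 23):
        raise ValueError("a denormal has a nonzero 23-bit fraction")
    exponent, mantissa = 1, fraction23
    while mantissa < (1 << 23):    # shift left until the hidden bit is 1
        mantissa <<= 1
        exponent -= 1              # each shift lowers the exponent by one
    return exponent, mantissa


def wrapped_value(exponent: int, mantissa: int) -> Fraction:
    """Exact value represented by a wrapped (exponent, mantissa) pair."""
    return Fraction(mantissa, 1 << 23) * Fraction(2) ** (exponent - 127)


print(wrap_single_denormal(0x400000))  # 0.1XXX case: exponent field stays at zero
print(wrap_single_denormal(0x200000))  # 0.01XX case: exponent wraps to -1
```

In the 0.1XXX case a single shift is enough and the exponent ends at zero, which is the exception noted in the text; smaller denormals wrap to negative exponents.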
Exponent underflow is possible during multiplication of small operands even when the
operands are not wrapped numbers. Setting FAST = 0 selects gradual underflow so
that denormal inputs can be wrapped and wrapped results are not automatically
discarded. When FAST is set high, denormal inputs and wrapped results are forced
to zero immediately.
When the multiplier is in IEEE mode and produces a wrapped number as its result,
the result may be passed to the ALU and unwrapped. If the wrapped number can be
unwrapped to an exact denormal, it can be output without causing the underflow status
flag (UNDER) to be set. UNDER goes high when a result is an inexact denormal, and
a zero is output from the FPU if the wrapped result is too small to represent as a
denormal (smaller than the minimum denorm). Table 19 describes the handling of
wrapped multiplier results and the status flags that are set when wrapped numbers
are output from the multiplier.
Table 19. Handling Wrapped Multiplier Outputs

TYPE OF RESULT                      STATUS FLAGS SET        NOTES
                                    DENORM  INEX  RNDCO
Wrapped, exact                        1      0      0       Unwrap with 'Wrapped exact' ALU instruction
Wrapped, inexact                      1      1      0       Unwrap with 'Wrapped inexact' ALU instruction
Wrapped, increased in magnitude       1      1      1       Unwrap with 'Wrapped rounded' ALU instruction
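The flag-to-instruction mapping in Table 19 can be expressed as a small lookup, shown here as a sketch of how external control logic might pick the unwrap instruction. The function name `select_unwrap` and the choice to return `None` when DENORM is clear are assumptions for illustration, not part of the device definition.

```python
from typing import Optional

# (DENORM, INEX, RNDCO) -> ALU unwrap instruction, per Table 19
UNWRAP_INSTRUCTION = {
    (1, 0, 0): "Wrapped exact",
    (1, 1, 0): "Wrapped inexact",
    (1, 1, 1): "Wrapped rounded",
}


def select_unwrap(denorm: int, inex: int, rndco: int) -> Optional[str]:
    """Return the unwrap instruction for a multiplier result, or None if not wrapped."""
    if not denorm:
        return None  # result is not wrapped; no unwrap step needed
    try:
        return UNWRAP_INSTRUCTION[(denorm, inex, rndco)]
    except KeyError:
        # Combination not listed in Table 19
        raise ValueError("unlisted status flag combination") from None
```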

When operating in chained mode, the multiplier may output a wrapped result to the
ALU during the same clock cycle that the multiplier status is output. In such a case
the ALU cannot unwrap the operand prior to using it, for example, when accumulating
the results of previous multiplications. To avoid this situation, the FPU can be operated
in FAST mode to simplify exception handling during chained operations. Otherwise,
wrapped outputs from the multiplier may adversely affect the accuracy of the chained
operation, because a wrapped number may appear to be a large normalized number
instead of a very small denormalized number.
Because of the latency associated with interpreting the FPU status outputs and
determining how to process the wrapped output, it is necessary that a wrapped operand
be stored external to the FPU (for example, in an external register file) and reloaded
to the A port of the ALU for unwrapping and further processing.

7-85

Stalling the Device
Operation of the 'ACT8847 can be stalled nondestructively by means of the HALT
signal. Bringing the HALT input low causes the device to inhibit the next rising clock
edge. Register contents are unaltered when the device is stalled, and normal operation
resumes at the next low clock period after the HALT signal is set high.
Stalling the device does not stall the C register. If ENRC is low, CLKC will clock in
data from the source selected by SRCC.
For some operations, such as a double-precision multiply with CLKMODE = 1, setting
the HALT input low may interrupt loading of the RA, RB, and instruction registers,
as well as stalling operation. In clock mode 1, the temporary register loads on the falling
edge of the clock, but the HALT signal going low would prevent the RA, RB, and
instruction registers from loading on the next rising clock edge. It is therefore necessary
to have the instruction and data inputs on the pins when the HALT signal is set high
again and normal operation resumes.

RESET
The RESET input is an active-low signal that asynchronously clears the internal states,
status, and exception disable mask. Internal pipeline registers are cleared, but the RA,
RB, and C registers are not. Operation resumes when RESET goes high again.

Test Pins
Two pins, TP1-TPO, support system testing. These may be used, for example, to place
all outputs in a high-impedance state, isolating the chip from the rest of the system
(see Table 20).
Table 20. Test Pin Control Inputs

TP1-TP0     OPERATION
 0   0      All outputs and I/Os are forced low
 0   1      All outputs and I/Os are forced high
 1   0      All outputs are placed in a high impedance state
 1   1      Normal operation
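As a quick reference, the test pin settings can be captured in a lookup. This sketch assumes the rows of Table 20 run in the order 00, 01, 10, 11; the helper name `describe_test_pins` is hypothetical.

```python
# (TP1, TP0) -> device behavior, as read from Table 20
TEST_PIN_MODES = {
    (0, 0): "All outputs and I/Os are forced low",
    (0, 1): "All outputs and I/Os are forced high",
    (1, 0): "All outputs are placed in a high impedance state",
    (1, 1): "Normal operation",
}


def describe_test_pins(tp1: int, tp0: int) -> str:
    """Return the operation selected by the TP1-TP0 inputs."""
    return TEST_PIN_MODES[(tp1, tp0)]
```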

Independent ALU Operations
Configuration and operation of the 'ACT8847 can be selected to perform single- or
double-precision floating point and integer calculations in operating modes ranging from
flowthrough to fully pipelined. Timing and sequences of operations are affected by
settings of clock mode, data and status registers, input data configurations, and
rounding mode, as well as the instruction inputs controlling the ALU and the multiplier.
Three modes of operation can be selected with inputs I10-I0: independent
ALU operation, independent multiplier operation, or simultaneous (chained) operation
of ALU and multiplier. Each of these operating modes is treated separately in the
following sections.
The ALU executes single- and double-precision operations which can be divided
according to the number of operands involved, one or two. Tables 21 and 22 show
independent ALU operations with one operand, along with the inputs I10-I0 which
select each operation. Conversions from one format to another are handled in this mode,
with the exception of adjustments to precision during two-operand ALU operations.
The wrapping and unwrapping of operands is also done in this mode.
Most format conversions involve double-precision timing. Conversions between single-
and double-precision floating point format are treated as mixed-precision operations
requiring two cycles to load the operands. A single-precision number is loaded in the
upper half (MSH) of its input register. During integer to floating point conversions,
the integer input should be loaded into the upper half of the RA register. If converting
from integer to double precision, then two cycles are required.
Logical shifts can be performed on integer operands using the instructions shown in
Table 22. The data operand to be shifted is input from any valid operand source and
the number of bit positions the operand is to be shifted is input only from the DB bus.
The shift number on the DB bus should be in positive 32-bit integer format, although
only the lowest eight bits are used. The shift number cannot be selected from sources
other than the RB register, and the shift number must be loaded on the same cycle
as the instruction.

"

[Timing diagram. Signals: SELMS/LS; OUT(31,0), STATUS(18,0)]
NOTE: Assume PIPES2-0 = 110, CLKMODE = 0, CONFIG1-0 = 00, ENRA = 1, ENRB = 1, OEY = 0, OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 29. Double-Precision Independent ALU Operation, Input Registers Enabled
(PIPES2-PIPES0 = 110, CLKMODE = 0)


[Timing diagram. Events: Load Half of First Operands; Load Rest of First Operands; Begin First Operation; Load Half of Second Operands; Load Rest of Second Operands; Begin Second Operation; Load Output. Signals: CLK; INSTRUCTION: FUNC(10,0), RND(1,0), FAST; DATA(31,0) A and B inputs; SELMS/LS; OUT(31,0), STATUS(18,0)]
NOTE: Assume PIPES2-0 = 010, CLKMODE = 1, CONFIG1-0 = 11, ENRA = 1, ENRB = 1, OEY = 0, OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 30. Double-Precision Independent ALU Operation, Input and Output Registers Enabled
(PIPES2-PIPES0 = 010, CLKMODE = 1)

[Timing diagram. Events: Load Half of First Operands; Load Rest of First Operands; Begin First Operation; Load Pipeline; Load Half of Second Operands; Load Rest of Second Operands; Begin Second Operation; Load Output. Signals: CLK; INSTRUCTION: FUNC(10,0), RND(1,0), FAST; DATA(31,0) A and B inputs; SELMS/LS; OUT(31,0), STATUS(18,0)]
NOTE: Assume PIPES2-0 = 000, CLKMODE = 0, CONFIG1-0 = 11, ENRA = 1, ENRB = 1, OEY = 0, OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 31. Double-Precision Independent ALU Operation, All Registers Enabled
(PIPES2-PIPES0 = 000, CLKMODE = 0)

Sample Independent Multiplier Microinstructions
The following independent multiplier timing diagram examples show five register
settings, ranging through fully pipelined. Examples for divide and square root are
included in this section. X = don't care.

[Timing diagram. Signals: INSTRUCTION: FUNC(10,0), RND(1,0), FAST (first instruction); DATA(31,0) A and B inputs (first operands, second operands); OUT(31,0), STATUS(18,0) (first result, second result)]
NOTE: Assume PIPES2-0 = 111, CONFIG1-0 = 01, ENRA = X, ENRB = X, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 32. Single-Precision Independent Multiplier Operation, All Registers
Disabled (PIPES2-PIPES0 = 111, CLKMODE = X)

7-106

[Timing diagram. Events: Load First Operands, Begin First Operation; Load Second Operands, Begin Second Operation. Signals: INSTRUCTION: FUNC(10,0), RND(1,0), FAST; DATA(31,0) A and B inputs; OUT(31,0), STATUS(18,0)]
NOTE: Assume PIPES2-0 = 110, CONFIG1-0 = 01, ENRA = 1, ENRB = 1, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 33. Single-Precision Independent Multiplier Operation, Input
Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = X)

7-107

[Timing diagram. Events: Load First Operands, Begin First Operation; Load Second Operands, Begin Second Operation. Signals: CLK; INSTRUCTION: FUNC(10,0), RND(1,0), FAST; DATA(31,0) A and B inputs; OUT(31,0), STATUS(18,0) (first result)]
NOTE: Assume PIPES2-0 = 010, CONFIG1-0 = 01, ENRA = 1, ENRB = 1, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 34. Single-Precision Independent Multiplier Operation, Input and Output
Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = X)

,-7c108

[Timing diagram. Events: Load First through Fifth Operands; Begin First through Fifth Operation; Load Pipeline; Load Output. Signals: CLK; INSTRUCTION: FUNC(10,0), RND(1,0), FAST; DATA(31,0) A and B inputs; OUT(31,0), STATUS(18,0)]
NOTE: Assume PIPES2-0 = 000, CONFIG1-0 = 01, ENRA = 1, ENRB = 1, SELMS/LS = X, OEY = 0, OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 35. Single-Precision Independent Multiplier Operation, All Registers Enabled
(PIPES2-PIPES0 = 000, CLKMODE = X)

[Timing diagram. Events: Load Half of First Operands; Load Pipeline. Signals: CLK; INSTRUCTION: FUNC(10,0), RND(1,0), FAST (first instruction); DATA(31,0) A and B inputs; SELMS/LS; OUT(31,0), STATUS(18,0)]
NOTE: Assume PIPES2-0 = 111, CLKMODE = 0, CONFIG1-0 = 11, ENRA = X, ENRB = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 36. Double-Precision Independent Multiplier Operation, All Registers
Disabled (PIPES2-PIPES0 = 111, CLKMODE = 0)

[Timing diagram. Events: Load Half of First Operands; Load Rest of First Operands; Begin First Operation; Load Pipeline; Load Half of Second Operands; Load Rest of Second Operands; Begin Second Operation. Signals: CLK; INSTRUCTION: FUNC(10,0), RND(1,0), FAST; DATA(31,0) A and B inputs; SELMS/LS; OUT(31,0), STATUS(18,0)]
NOTE: Assume PIPES2-0 = 110, CONFIG1-0 = 11, ENRA = 1, ENRB = 1, OEY = 0, OEC = OES = 0,
RESET = HALT = 1, TP1-0 = 11

Figure 37. Double-Precision Independent Multiplier Operation, Input Registers
Enabled (PIPES2-PIPES0 = 110, CLKMODE = 1)

7-111

[Timing diagram. Labels: dividend, NEXT (DP), quotient]
NOTE: Assume PIPES2-0 = 110, CONFIG1-0 = 01, ENRA = 1, ENRB = 1, OEY = 0, OEC = OES = 0,
RESET = HALT = 1, TP1-0 = 11

Figure 44. Double-Precision Floating Point Division
(PIPES2-PIPES0 = 110, CLKMODE = 0)
[Timing diagram. Clock cycles numbered through 14]
NOTE: Assume PIPES2-0 = 100, CONFIG1-0 = 01, ENRA = 1, ENRB = 1, OEY = 0, OEC = OES = 0,
RESET = HALT = 1, TP1-0 = 11

Figure 45. Double-Precision Floating Point Division
(PIPES2-PIPES0 = 100, CLKMODE = 0)

7-116

[Timing diagram. Clock cycles numbered through 14. Signals: CLK; INST (DIV, NEXT (DP))]
NOTE: Assume PIPES2-0 = 010, CONFIG1-0 = 01, ENRA = 1, ENRB = 1, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 46. Double-Precision Floating Point Division
(PIPES2-PIPES0 = 010, CLKMODE = 1)
[Timing diagram. Clock cycles numbered through 14]
NOTE: Assume PIPES2-0 = 000, CONFIG1-0 = 00, ENRA = 1, ENRB = 1, OEY = 0, OEC = OES = 0,
RESET = HALT = 1, TP1-0 = 11

Figure 47. Double-Precision Floating-Point Division, All Registers Enabled
(PIPES2-PIPES0 = 000, CLKMODE = 1)

7-117

[Timing diagram. Clock cycles numbered through 16. Signals: CLK]
NOTE: Assume PIPES2-0 = 110, CONFIG1-0 = 01, ENRA = 1, ENRB = 1, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11. The result appears in the SREG.

Figure 48. Integer Division, Input Registers Enabled
(PIPES2-PIPES0 = 110, CLKMODE = X)

[Timing diagram. Clock cycles numbered through 16. Signals: CLK]
NOTE: Assume PIPES2-0 = 100, CONFIG1-0 = 01, ENRA = 1, ENRB = 1, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11. The result appears in the SREG.

Figure 49. Integer Division, Input and Pipeline Registers Enabled
(PIPES2-PIPES0 = 100, CLKMODE = X)

[Timing diagram. Clock cycles numbered through 17]
NOTE: Assume PIPES2-0 = 010, CONFIG1-0 = 01, ENRA = 1, ENRB = 1, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11. The result appears in the SREG.

Figure 50. Integer Division, Input and Output Registers
Enabled (PIPES2-PIPES0 = 010, CLKMODE = X)

[Timing diagram. Clock cycles numbered through 17. Signals: CLK; INST; Y]
NOTE: Assume PIPES2-0 = 000, CONFIG1-0 = 01, ENRA = 1, ENRB = 1, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11. The result appears in the SREG.

Figure 51. Integer Division, All Registers Enabled
(PIPES2-PIPES0 = 000, CLKMODE = X)

7-119

[Timing diagram. Clock cycles numbered through 11]
NOTE: Assume PIPES2-0 = 110, CONFIG1-0 = 01, ENRA = 1, ENRB = 1, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 52. Single-Precision Floating Point Square Root, Input
Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = X)
[Timing diagram. Clock cycles numbered through 10]
NOTE: Assume PIPES2-0 = 100, CONFIG1-0 = 01, ENRA = 1, ENRB = 1, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 53. Single-Precision Floating Point Square Root, Input and Pipeline
Registers Enabled (PIPES2-PIPES0 = 100, CLKMODE = X)

7-120

[Timing diagram. Clock cycles numbered through 11. Signals: CLK; INST (square root); Y]
NOTE: Assume PIPES2-0 = 010, CONFIG1-0 = 01, ENRA = 1, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 54. Single-Precision Floating Point Square Root, Input and Output
Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = X)
[Timing diagram. Clock cycles numbered through 11. Signals: CLK; INST (square root); Y]
NOTE: Assume PIPES2-0 = 000, CONFIG1-0 = 00, ENRA = 1, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 55. Single-Precision Floating Point Square Root, All Registers Enabled
(PIPES2-PIPES0 = 000, CLKMODE = X)

7-121

[Timing diagram. Clock cycles numbered through 17. Signals: CLK]
NOTE: Assume PIPES2-0 = 110, CONFIG1-0 = 11, ENRA = 1, OEY = 0, OEC = OES = 0,
RESET = HALT = 1, TP1-0 = 11

Figure 56. Double-Precision Floating Point Square Root, Input
Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = 1)
[Timing diagram. Clock cycles numbered through 17. Signals: CLK; INST (NEXT (DP))]
NOTE: Assume PIPES2-0 = 100, CONFIG1-0 = 01, ENRA = 1, OEY = 0, OEC = OES = 0,
RESET = HALT = 1, TP1-0 = 11

Figure 57. Double-Precision Floating Point Square Root, Input and Pipeline
Registers Enabled (PIPES2-PIPES0 = 100, CLKMODE = 0)

7-122

[Timing diagram. Clock cycles numbered through 17]
NOTE: Assume PIPES2-0 = 010, CONFIG1-0 = 10, ENRA = 1, OEY = 0, OEC = OES = 0,
RESET = HALT = 1, TP1-0 = 11

Figure 58. Double-Precision Floating Point Square Root, Input and Output
Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = 1)
[Timing diagram. Clock cycles numbered through 17. Signals: CLK; INST (NEXT (DP))]
NOTE: Assume PIPES2-0 = 000, CONFIG1-0 = 00, ENRA = 1, OEY = 0, OEC = OES = 0,
RESET = HALT = 1, TP1-0 = 11

Figure 59. Double-Precision Floating Point Square Root, All
Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = 0)

7-123

[Timing diagram. Clock cycles numbered through 20]
NOTE: Assume PIPES2-0 = 110, CONFIG1-0 = 01, ENRA = 1, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11. The result appears in the SREG.

Figure 60. Integer Square Root, Input Registers Enabled
(PIPES2-PIPES0 = 110, CLKMODE = X)
[Timing diagram. Clock cycles numbered through 20. Signals: CLK; INST (square root); Y]
NOTE: Assume PIPES2-0 = 100, CONFIG1-0 = 00, ENRA = 1, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11. The result appears in the SREG.

Figure 61. Integer Square Root, Input and Pipeline Registers Enabled
(PIPES2-PIPES0 = 100, CLKMODE = X)

7-124

[Timing diagram. Clock cycles numbered through 20]
NOTE: Assume PIPES2-0 = 010, CONFIG1-0 = 01, ENRA = 1, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11. The result appears in the SREG.

Figure 62. Integer Square Root, Input and Output Registers Enabled
(PIPES2-PIPES0 = 010, CLKMODE = X)
[Timing diagram. Clock cycles numbered through 20]
NOTE: Assume PIPES2-0 = 000, CONFIG1-0 = 00, ENRA = 1, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11. The result appears in the SREG.

Figure 63. Integer Square Root, All Registers Enabled
(PIPES2-PIPES0 = 000, CLKMODE = X)

7-125

Sample Chained Mode Microinstructions
The following chained mode timing diagram examples show four register settings,
ranging from fully flowthrough to fully pipelined.

[Timing diagram. Signals: INSTRUCTION: FUNC(10,0), RND(1,0), FAST (first instruction); DATA(31,0) A and B inputs (first operands, second operands); OUT(31,0), STATUS(18,0)]
NOTE: Assume PIPES2-0 = 111, CONFIG1-0 = 01, ENRA = X, ENRB = X, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 64. Single-Precision Chained Mode Operation, All Registers Disabled
(PIPES2-PIPES0 = 111, CLKMODE = X)

7-126

[Timing diagram. Events: Load Half and Load Rest of First, Second, and Third Operands. Signals: CLK; INSTRUCTION: FUNC(10,0), RND(1,0), FAST (first instruction, third instruction); OUT(31,0), STATUS(18,0) (first, second)]
NOTE: Assume PIPES2-0 = 110, CONFIG1-0 = 11, ENRA = 1, ENRB = 1, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 65. Single-Precision Chained Mode Operation, Input Registers Enabled
(PIPES2-PIPES0 = 110, CLKMODE = 1)

7-127

[Timing diagram. Events: Load First Operands, Begin First Operation; Load Second Operands, Begin Second Operation. Signals: CLK; DATA(31,0) A and B inputs; OUT(31,0), STATUS(18,0) (first result)]
NOTE: Assume PIPES2-0 = 010, CONFIG1-0 = 01, ENRA = 1, SELMS/LS = X, OEY = 0,
OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 66. Single-Precision Chained Mode Operation, Input and Output Registers
Enabled (PIPES2-PIPES0 = 010, CLKMODE = X)

7-128

[Timing diagram. Events: Load First through Fifth Operands; Begin First through Fifth Operation; Load Pipeline; Load Output. Signals: CLK; INSTRUCTION: FUNC(10,0), RND(1,0), FAST; DATA(31,0) A and B inputs; OUT(31,0), STATUS(18,0)]
NOTE: Assume PIPES2-0 = 000, CONFIG1-0 = 01, ENRA = 1, ENRB = 1, SELMS/LS = X, OEY = 0, OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 67. Single-Precision Chained Mode Operation, All Registers Enabled
(PIPES2-PIPES0 = 000, CLKMODE = X)

[Timing diagram. Events: Load Half of First Operands; Load Pipeline. Signals: CLK; INSTRUCTION: FUNC(10,0), RND(1,0), FAST (first instruction); DATA(31,0) A and B inputs; SELMS/LS; OUT(31,0), STATUS(18,0)]
NOTE: Assume PIPES2-0 = 111, CONFIG1-0 = 11, ENRA = 1, ENRB = 1, OEY = 0, OEC = OES = 0,
RESET = HALT = 1, TP1-0 = 11

Figure 68. Double-Precision Chained Mode Operation, All Registers Disabled
(PIPES2-PIPES0 = 111, CLKMODE = 0)

7-130

[Timing diagram. Events: Load Half of First Operands; Load Rest of First Operands; Begin First Operation; Load Pipeline; Load Half of Second Operands; Load Rest of Second Operands; Begin Second Operation. Signals: CLK; INSTRUCTION: FUNC(10,0), RND(1,0), FAST; DATA(31,0) A and B inputs; SELMS/LS; OUT(31,0), STATUS(18,0)]
NOTE: Assume PIPES2-0 = 110, CONFIG1-0 = 11, ENRA = 1, OEY = 0, OEC = OES = 0,
RESET = HALT = 1, TP1-0 = 11

Figure 69. Double-Precision Chained Mode Operation, Input Registers Enabled
(PIPES2-PIPES0 = 110, CLKMODE = 1)

7-131

[Timing diagram. Events: Load Half of First Operands; Load Rest of First Operands; Begin First Operation; Load Half of Second Operands; Load Rest of Second Operands; Begin Second Operation; Load Pipeline. Signals: CLK; INSTRUCTION: FUNC(10,0), RND(1,0), FAST; DATA(31,0) A and B inputs; SELMS/LS; OUT(31,0), STATUS(18,0)]
NOTE: Assume PIPES2-0 = 010, CONFIG1-0 = 10, ENRA = 1, ENRB = 1, OEY = 0, OEC = OES = 0,
RESET = HALT = 1, TP1-0 = 11

Figure 70. Double-Precision Chained Mode Operation, Input and Output Registers
Enabled (PIPES2-PIPES0 = 010, CLKMODE = 0)

7-132

[Timing diagram. Events: Load Half of First Operands; Load Rest of First Operands; Begin First Operation; Load Half of Second Operands; Load Rest of Second Operands; Begin Second Operation; Load Half of Third Operands; Load Pipeline; Load Output. Signals: CLK; INSTRUCTION: FUNC(10,0), RND(1,0), FAST; DATA(31,0) A and B inputs; SELMS/LS; OUT(31,0), STATUS(18,0)]
NOTE: Assume PIPES2-0 = 000, CONFIG1-0 = 01, ENRA = 1, ENRB = 1, OEY = 0, OEC = OES = 0, RESET = HALT = 1, TP1-0 = 11

Figure 71. Double-Precision Chained Mode Operation, All Registers Enabled
(PIPES2-PIPES0 = 000, CLKMODE = 0)

Instruction Timing
The following table details the number of clock cycles required to complete an operation
in different pipelined modes. For more detail, see the sample microinstructions shown
in the previous section.
Clock duration and output delay depend on the pipeline mode selected. See the note
in the table and timing parameters listed at the beginning of this document.
Table 31. Number of Clocks Required to Complete an Operation

                                    PIPES2-0  PIPES2-0  PIPES2-0  PIPES2-0  PIPES2-0
OPERATION                            = 000     = 100     = 110     = 111     = 010
                                     (tpd4)    (tpd3)    (tpd2)    (tpd1)    (tpd4)
Single-Precision Floating Point
  ALU Operation or Multiply‡           3         2         1         0         2
  Divide                               8         7         7         X         8
  Square Root                         11        10        10         X        11
Double-Precision Floating Point
  ALU Operation†                       4         3         2         1         3
  Multiply‡                            5         4         3         2         4
  Divide                              14        13        13         X        14
  Square Root                         17        16        16         X        17
Integer
  ALU Operation or Multiply‡           3         2         1         0         2
  Divide                              16        15        15         X        16
  Square Root                         20        19        19         X        20

Y output and status valid following this tpd delay after the designated number of clocks
† Includes every conversion involving double-precision (DP ↔ SP or DP ↔ Integer)
‡ Includes all chained mode operations
X = invalid
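For scheduling microcode it can help to hold Table 31 as data. The sketch below is an illustration, not a definitive transcription: the column order (000, 100, 110, 111, 010) is a reading of the flattened table, and the names `CLOCKS`, `PIPES_ORDER`, and `clocks_required` are hypothetical.

```python
# Column order assumed for Table 31; None marks an invalid (X) entry.
PIPES_ORDER = ("000", "100", "110", "111", "010")

CLOCKS = {
    ("SP", "alu_or_multiply"): (3, 2, 1, 0, 2),
    ("SP", "divide"): (8, 7, 7, None, 8),
    ("SP", "square_root"): (11, 10, 10, None, 11),
    ("DP", "alu"): (4, 3, 2, 1, 3),
    ("DP", "multiply"): (5, 4, 3, 2, 4),
    ("DP", "divide"): (14, 13, 13, None, 14),
    ("DP", "square_root"): (17, 16, 16, None, 17),
    ("INT", "alu_or_multiply"): (3, 2, 1, 0, 2),
    ("INT", "divide"): (16, 15, 15, None, 16),
    ("INT", "square_root"): (20, 19, 19, None, 20),
}


def clocks_required(precision: str, operation: str, pipes: str) -> int:
    """Look up the clock count for an operation in a given PIPES2-0 setting."""
    value = CLOCKS[(precision, operation)][PIPES_ORDER.index(pipes)]
    if value is None:
        raise ValueError("operation is invalid in this pipeline mode")
    return value
```

For example, `clocks_required("DP", "divide", "000")` gives 14 under this reading, and asking for a divide with `pipes="111"` raises an error because the table marks that combination invalid.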

When using fast cycle times and double-precision operations, two cycles may be
required to output and capture both halves of a double-precision result. To ensure the
result remains valid for two cycles, a NOP instruction may need to be inserted between
the operations. Table 32 shows the number of NOPs necessary to insert into the
instruction stream for fully pipelined operation (PIPES2-PIPES0 = 000).

7-134

Table 32. NOPs Inserted to Guarantee That Double-Precision Results Remain
Valid for Two Clock Cycles (PIPES2-PIPES0 = 000)
1 ST OPERATION

DP -

32 BIT

32 BIT -

DP

32 BIT OP

DP ALU

DP Multiply

FOLLOWED BY
2ND OPERATION

# NOPs INSERTED
BETWEEN OPERATIONS

# CYCLES RESULT
IS VALID
2
2
1
2
2
2
2

DP
32
32
DP
DP
DP
DP

- 32 BIT
BIT - DP
BIT OP
ALU
Multiply
Sqrt
Divide

0
0
0
0
0
0
0

DP
32
32
DP
DP
DP
DP

- 32 BIT
BIT -+ DP
BIT OP
ALU
Multiply
Sqrt
Divide

0
0

DP
32
32
DP
DP
DP
DP

-+ 32 BIT
BIT -+ DP
BIT OP
ALU
Multiply
Sqrt
Divide

0
0
0
0
0
0
0

2
2

DP
32
32
DP
DP
DP
DP

-+ 32 BIT
BIT -+ DP
BIT OP
ALU
Multiply
Sqrt
Divide

0
0

2
2
2
2
2
2
2

DP
32
32
DP
DP
DP
DP

-+ 32 BIT
BIT -+ DP
BIT OP
ALU
Multiply
Sqrt
Divide

1

0
0
0
0

1

0
0
0
0
1
1
2t
1

0

2
2
2
2
2
2
2

2
2
2
2

2
2
2
2
2
2
2

NOTE: 32-bit operation refers to a single-precision floating point or integer ALU operation or multiply, except
conversion to or from double-precision. This assumes the instruction following a double-precision divide
may begin loading on the 12th clock cycle, following a double-precision square root on the 15th cycle.
† The device will not load a single-precision operation on the first clock edge following this operation, so any
single-precision instruction may be used. A NOP is recommended. The second instruction must be a NOP.

7-135

Table 32. NOPs Inserted to Guarantee That Double-Precision Results Remain
Valid for Two Clock Cycles (PIPES2-PIPES0 = 000) (Continued)

1ST OPERATION    FOLLOWED BY        # NOPs INSERTED        # CYCLES RESULT
                 2ND OPERATION      BETWEEN OPERATIONS     IS VALID
DP Sqrt          DP → 32 BIT              1                      2
                 32 BIT → DP              1                      2
                 32 BIT OP                2†                     2
                 DP ALU                   1                      2
                 DP Multiply              0                      2
                 DP Sqrt                  0                      2
                 DP Divide                0                      2
DP Divide        DP → 32 BIT              1                      2
                 32 BIT → DP              1                      2
                 32 BIT OP                2†                     2
                 DP ALU                   1                      2
                 DP Multiply              0                      2
                 DP Sqrt                  0                      2
                 DP Divide                0                      2

NOTE: 32-bit operation refers to a single-precision floating point or integer ALU operation or multiply, except
conversion to or from double-precision. This assumes the instruction following a double-precision divide
may begin loading on the 12th clock cycle, following a double-precision square root on the 15th cycle.
† The device will not load a single-precision operation on the first clock edge following this operation, so any
single-precision instruction may be used. A NOP is recommended. The second instruction must be a NOP.
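The NOP counts for operations that follow a double-precision square root or divide can be applied mechanically when building an instruction stream. The sketch below covers only those two first operations, whose rows in Table 32 are identical; the names `NOPS_AFTER_DP_SQRT_OR_DIVIDE` and `schedule` are hypothetical, and operation names are plain strings rather than real instruction encodings.

```python
# NOPs to insert after a DP Sqrt or DP Divide, keyed by the operation that follows
# (fully pipelined mode, PIPES2-PIPES0 = 000).
NOPS_AFTER_DP_SQRT_OR_DIVIDE = {
    "DP -> 32 BIT": 1,
    "32 BIT -> DP": 1,
    "32 BIT OP": 2,   # first slot may be any single-precision op; second must be a NOP
    "DP ALU": 1,
    "DP Multiply": 0,
    "DP Sqrt": 0,
    "DP Divide": 0,
}

SLOW_FIRST_OPS = ("DP Sqrt", "DP Divide")


def schedule(first: str, second: str) -> list:
    """Return [first, NOPs..., second] with the padding the table calls for."""
    if first not in SLOW_FIRST_OPS:
        raise ValueError("only DP Sqrt and DP Divide are covered by this sketch")
    padding = ["NOP"] * NOPS_AFTER_DP_SQRT_OR_DIVIDE[second]
    return [first] + padding + [second]
```

For instance, `schedule("DP Divide", "32 BIT OP")` yields the divide, two NOP slots, and then the 32-bit operation.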

Exception and Status Handling
Exception and status flags for the 'ACT8847 were listed previously in Tables 14 and 15.

Output exception signals are provided to indicate both the source and type of the
exception. DENORM, INEX, OVER, UNDER, and RNDCO indicate the exception type,
and CHEX and SRCEX indicate the source of an exception. SRCEX indicates the source
of a result as selected by instruction bit I6, and SRCEX is active whenever a result
is output, not only when an exception is being signalled. The chained-mode exception
signal CHEX indicates that an exception has been generated by the source not selected
for output by I6. The exception type signalled by CHEX cannot be read unless status
select controls SELST1-SELST0 are used to force status output from the deselected
source.
Output exceptions may be due either to a result in an illegal format or to a procedural
error. Results too large or too small to be represented in the selected precision are
signalled by OVER and UNDER. When INF is high, the output is the IEEE representation
of infinity. Any ALU output which has been increased in magnitude by rounding causes
INEX to be set high. DENORM is set when the multiplier output is wrapped or the ALU
output is denormalized. DENORM is also set high when an illegal operation on an integer
is performed. Wrapped outputs from the multiplier may be inexact or increased in
magnitude by rounding, which may cause the INEX and RNDCO status signals to be
set high. A denormal output from the ALU (DENORM = 1) may also cause INEX to
be set, in which case UNDER is also signalled.

Ordinarily, SELST1-SELST0 are set high so that status selection defaults to the output
source selected by instruction input I6. The ALU is selected as the output source when
I6 is low, and the multiplier when I6 is high.
When the device operates in chained mode, it may be necessary to read the status
results not associated with the output source. As shown in Table 16, SELST1-SELST0
can be used to read the status of either the ALU or the multiplier regardless of the
I6 setting.
Status results are registered only when the output (P and S) registers are enabled
(PIPES2 = 0). Otherwise, the status register is transparent. In either case, to read
the status outputs, the output enables (OES, OEC, or both) must be low.
Status flags are provided to signal both floating point and integer results. Integer status
is provided using AEQB for zero, NEG for sign, and OVER for overflow/carryout.
Several status exceptions are generated by illegal data or instruction inputs to the FPU.
Input exceptions may cause the following signals to be set high: IVAL, DIVBY0, DENIN,
and STEX1-STEX0. If the IVAL flag is set, either an invalid operation, such as the square
root of -|X|, has been requested or a NaN (Not a Number) has been input. When
DENIN is set, a denormalized number has been input to the multiplier. DIVBY0 is set
when the divisor is zero. STEX1-STEX0 indicate which port (RA, RB, or both) is the
source of the exception when either a denormal is input to the multiplier (DENIN = 1)
or a NaN (IVAL = 1) is input to the multiplier or the ALU.
NaN inputs are all treated as IEEE signalling NaNs, causing the IVAL flag to be set.
When output from the FPU, the fraction field from a NaN is set high (all 1s) and the
sign bit is 0, regardless of the original fraction and sign fields of the input NaN.
When the 'ACT8847 outputs a NaN, it is always in the form of a signalling NaN along
with the IVAL (Invalid) and appropriate STEX flag set high (except for the MOVE A
instruction, which passes any operand as is without setting exception flags).
Certain operations involving floating point zeros and infinities are invalid, causing the
'ACT8847 to set the IVAL flag and output a NaN. Operations involving zero and infinity
are detailed below.
A floating point zero is represented by an all zero exponent and fraction field. The sign
bit may be 0 or 1, to represent +0 or -0 respectively.
Zero divided by zero is an invalid operation. The result is a NaN with the IVAL and
DIVBY0 flags set. Any other number divided by zero results in the appropriately signed
infinity with the DIVBY0 flag set.

For operations with floating point zeros: ±0 multiplied by any number is the
appropriately signed 0. Sums and differences of zeros give the following results:

+0 + (-0) = +0
+0 + (+0) = +0
-0 + (-0) = -0
-0 + (+0) = +0
+0 - (-0) = +0
+0 - (+0) = +0
-0 - (-0) = +0
-0 - (+0) = -0
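
The signed-zero results listed above match IEEE round-to-nearest arithmetic, so they can be cross-checked on a host machine. The sketch below is only a host-side check in Python, not a model of the 'ACT8847; the names `CASES`, `sign_str`, and `zero_results` are hypothetical.

```python
import math

def sign_str(x: float) -> str:
    """Return '+0' or '-0' for a floating point zero."""
    return "-0" if math.copysign(1.0, x) < 0 else "+0"

# (a, operator, b) in the same order as the list of zero sums and differences.
CASES = [
    (+0.0, "+", -0.0), (+0.0, "+", +0.0), (-0.0, "+", -0.0), (-0.0, "+", +0.0),
    (+0.0, "-", -0.0), (+0.0, "-", +0.0), (-0.0, "-", -0.0), (-0.0, "-", +0.0),
]

def zero_results():
    """Evaluate every case with host IEEE arithmetic and report the sign of each zero."""
    results = []
    for a, op, b in CASES:
        r = a + b if op == "+" else a - b
        results.append(sign_str(r))
    return results

if __name__ == "__main__":
    for (a, op, b), r in zip(CASES, zero_results()):
        print(f"{sign_str(a)} {op} ({sign_str(b)}) = {r}")
```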

Floating point infinity is represented by an all 1 exponent field with an all 0 fraction
field. The sign bit determines positive or negative infinity (0 or 1 respectively).
Infinity divided by infinity is an invalid operation, setting the IVAL flag and resulting
in a NaN output. Division of infinity by any other number results in the appropriately
signed infinity. Division of any number (except infinity or zero) by infinity results in
an appropriately signed zero. Infinity divided by zero results in the appropriately signed
infinity with the DIVBYO flag set.
For invalid operations with infinity listed below, the output is a signalling NaN with
the IVAL flag set.

± infinity multiplied by ± 0
± infinity divided by ± 0

+ infinity + (- infinity)
- infinity + (+ infinity)
+ infinity - (+ infinity)
- infinity - (- infinity)
Any other number added to or multiplied by infinity results in the appropriately signed
infinity as output.


'ACT8847 Reference Guide
Instruction Inputs
Operations are summarized in Tables 33 through 41.

Table 33. Independent ALU Operations, Single Floating Point Operand

ALU OPERATION ON A OPERAND                              INSTRUCTION INPUTS I10-I0
Pass A operand                                          00x x01x 0000
Pass -A operand                                         00x x01x 0001
Convert from 2's complement integer to
  floating point†                                       00x x01x 0010
Convert from floating point to 2's complement
  integer†                                              00x x01x 0011
Move A operand (pass without NaN detect or
  status flags active)                                  00x x01x 0100
Pass B operand                                          00x x01x 0101
Convert from floating point to floating point
  (adjusts precision of input: SP -> DP, DP -> SP)‡     00x x01x 0110
Floating point to unsigned integer conversion†          00x x01x 0111
Wrap denormal operand                                   00x x01x 1000
Unsigned integer to floating point conversion†          00x x01x 1010
Unwrap exact number                                     00x x01x 1100
Unwrap inexact number                                   00x x01x 1101
Unwrap rounded input                                    00x x01x 1110

NOTES
x = Don't care
I8 selects precision of A operand: 0 = A (SP), 1 = A (DP)
I7 selects precision of B operand and must equal I8.
I4 selects absolute value of A operand: 0 = A, 1 = |A|
During integer to floating point conversion, |A| is not allowed as a result.

†During this operation, I8 selects the precision of the result. If the conversion involves double-precision, the
operation requires 2 cycles to load.
‡Requires 2 cycles to load the operation, even if input is SP.

Table 34. Independent ALU Operations, Two Floating Point Operands

ALU OPERATIONS AND OPERANDS      INSTRUCTION INPUTS I10-I0
Add A + B                        00x x000 0x00
Add |A| + B                      00x x001 0x00
Add A + |B|                      00x x000 1x00
Add |A| + |B|                    00x x001 1x00
Subtract A - B                   00x x000 0x01
Subtract |A| - B                 00x x001 0x01
Subtract A - |B|                 00x x000 1x01
Subtract |A| - |B|               00x x001 1x01
Compare A, B                     00x x000 0x10
Compare |A|, B                   00x x001 0x10
Compare A, |B|                   00x x000 1x10
Compare |A|, |B|                 00x x001 1x10
Subtract B - A                   00x x000 0x11
Subtract B - |A|                 00x x001 0x11
Subtract |B| - A                 00x x000 1x11
Subtract |B| - |A|               00x x001 1x11

NOTES
x = Don't Care
I8 selects precision of A operand: 0 = A (SP), 1 = A (DP)
I7 selects precision of B operand: 0 = B (SP), 1 = B (DP)
I2 selects either Y or its absolute value: 0 = Y, 1 = |Y|
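
The bit assignments in Table 34 are regular enough to generate programmatically. The following is a minimal sketch, assuming the field positions read off the table (I8 and I7 for operand precision, I4 and I3 for |A| and |B|, I2 for |Y|, I1-I0 for the operation); the function and parameter names are hypothetical, and the don't-care bit in the upper nibble is simply left at 0.

```python
# Operation codes on I1-I0, as read from Table 34.
ALU_OPS = {"add": 0b00, "sub_a_b": 0b01, "compare": 0b10, "sub_b_a": 0b11}

def encode_alu_fp2(op, a_dp=False, b_dp=False, abs_a=False, abs_b=False, abs_y=False):
    """Build the 11-bit I10-I0 word for a two-operand floating point ALU operation."""
    word = 0                      # I10-I9 = 00 and I6-I5 = 00, as in every Table 34 entry
    word |= int(a_dp) << 8        # I8: precision of A operand (1 = DP)
    word |= int(b_dp) << 7        # I7: precision of B operand (1 = DP)
    word |= int(abs_a) << 4       # I4: use |A|
    word |= int(abs_b) << 3       # I3: use |B|
    word |= int(abs_y) << 2       # I2: output |Y|
    word |= ALU_OPS[op]           # I1-I0: add, A - B, compare, B - A
    return word

def fmt(word):
    """Format an instruction word in the 3-4-4 grouping used by the tables."""
    bits = format(word, "011b")
    return f"{bits[:3]} {bits[3:7]} {bits[7:]}"

if __name__ == "__main__":
    print(fmt(encode_alu_fp2("sub_a_b", abs_a=True)))          # Subtract |A| - B
    print(fmt(encode_alu_fp2("add", a_dp=True, b_dp=True)))    # DP Add A + B
```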

Table 35. Independent ALU Operations, One Integer Operand

ALU OPERATION ON A OPERAND              INSTRUCTION INPUTS I10-I0
Pass A operand                          010 xx10 0000
Pass -A operand (2's complement)‡       010 xx10 0001
Negate A operand (1's complement)       010 xx10 0010
Pass B operand                          010 xx10 0101
Shift left logical†                     010 xx10 1000
Shift right logical†                    010 xx10 1001
Shift right arithmetic†                 010 xx10 1101

NOTES
x = Don't Care
I7 selects format of A or B integer operand:
0 = Single-precision 2's complement
1 = Single-precision unsigned integer
I8 must equal I7

†B operand is number of bit positions A is to be shifted and must be input on the same cycle as the instruction.
‡Pass (-A) of unsigned integer takes 1's complement.

Table 36. Independent ALU Operations, Two Integer Operands

ALU OPERATIONS AND OPERANDS      INSTRUCTION INPUTS I10-I0
Add A + B                        010 x000 0000
Subtract A - B                   010 x000 0001
Compare A, B                     010 x000 0010
Subtract B - A                   010 x000 0011
Logical AND A, B                 010 x000 1000
Logical AND A, NOT B             010 x000 1001
Logical AND NOT A, B             010 x000 1010
Logical OR A, B                  010 x000 1100
Logical XOR A, B                 010 x000 1101

NOTES
x = Don't Care
I7 selects format of A and B operands:
0 = Single-precision 2's complement
1 = Single-precision unsigned integer

Table 37. Independent Floating Point Multiply Operations

MULTIPLIER OPERATION AND OPERANDS      INSTRUCTION INPUTS I10-I0
Multiply A * B                         00x x100 00xx
Multiply -(A * B)                      00x x100 01xx
Multiply A * |B|                       00x x100 10xx
Multiply -(A * |B|)                    00x x100 11xx
Multiply |A| * B                       00x x101 00xx
Multiply -(|A| * B)                    00x x101 01xx
Multiply |A| * |B|                     00x x101 10xx
Multiply -(|A| * |B|)                  00x x101 11xx

NOTES
x = Don't Care
I8 selects A operand precision (0 = SP, 1 = DP)
I7 selects B operand precision (0 = SP, 1 = DP)
I1 selects A operand format (0 = Normal, 1 = Wrapped)
I0 selects B operand format (0 = Normal, 1 = Wrapped)

Table 38. Independent Floating Point Divide/Square Root Operations

MULTIPLIER OPERATION AND OPERANDS†     INSTRUCTION INPUTS I10-I0
Divide A / B                           00x x110 0xxx
SQRT A                                 00x x110 1xxx
Divide |A| / B                         00x x111 0xxx
SQRT |A|                               00x x111 1xxx

NOTES
x = Don't Care
I8 selects A operand precision and I7 selects B operand precision (0 = SP, 1 = DP)
I2 negates multiplier result (0 = Normal, 1 = Negated)
I1 selects A operand format and I0 selects B operand format (0 = Normal, 1 = Wrapped)

†I7 should be equal to I8 for square root operations.

Table 39. Independent Integer Multiply/Divide/Square Root Operations

MULTIPLIER OPERATION AND OPERANDS†     INSTRUCTION INPUTS I10-I0
Multiply A * B                         010 x100 0000
Divide A / B                           010 x110 0000
SQRT A                                 010 x110 1000

NOTES
x = Don't care
I7 selects operand format:
0 = SP 2's complement
1 = SP unsigned integer

†Operations involving absolute values, wrapped operands, or negated results are valid only when floating point
format is selected (I9 = 0).

Table 40. Chained Multiplier/ALU Floating Point Operations†

CHAINED OPERATIONS        OUTPUT        INSTRUCTION
MULTIPLIER    ALU         SOURCE        INPUTS I10-I0
A * B         A + B       ALU           10x x000 xx00
A * B         A + B       Multiplier    10x x100 xx00
A * B         A - B       ALU           10x x000 xx01
A * B         A - B       Multiplier    10x x100 xx01
A * B         2 - A       ALU           10x x000 xx10
A * B         2 - A       Multiplier    10x x100 xx10
A * B         B - A       ALU           10x x000 xx11
A * B         B - A       Multiplier    10x x100 xx11
A * B         A + 0       ALU           10x x010 xx00
A * B         A + 0       Multiplier    10x x110 xx00
A * B         0 - A       ALU           10x x010 xx11
A * B         0 - A       Multiplier    10x x110 xx11
A * 1         A + B       ALU           10x x001 xx00
A * 1         A + B       Multiplier    10x x101 xx00
A * 1         A - B       ALU           10x x001 xx01
A * 1         A - B       Multiplier    10x x101 xx01
A * 1         2 - A       ALU           10x x001 xx10
A * 1         2 - A       Multiplier    10x x101 xx10
A * 1         B - A       ALU           10x x001 xx11
A * 1         B - A       Multiplier    10x x101 xx11
A * 1         A + 0       ALU           10x x011 xx00
A * 1         A + 0       Multiplier    10x x111 xx00
A * 1         0 - A       ALU           10x x011 xx11
A * 1         0 - A       Multiplier    10x x111 xx11

NOTES
x = Don't Care
I8 selects precision of RA inputs: 0 = RA (SP), 1 = RA (DP)
I7 selects precision of RB inputs: 0 = RB (SP), 1 = RB (DP)
I3 negates ALU result: 0 = Normal, 1 = Negated
I2 negates multiplier result: 0 = Normal, 1 = Negated

†The I10-I0 setting 1xx xx1x xx10 is invalid, since it attempts to force the B operand of the ALU to both
0 and 2 simultaneously.
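
Table 40 follows a regular pattern: I6 picks the output source, I5 forces the ALU B operand to 0, I4 forces the multiplier B operand to 1, and I1-I0 pick the ALU operation. The sketch below encodes that reading of the table and rejects the combination the footnote declares invalid. It is an illustration only; the function name, parameter names, and the decision to reject "A - 0" (which the table does not list) are assumptions.

```python
# ALU operation codes on I1-I0 for chained mode, as read from Table 40.
ALU_CODES = {"A+B": 0b00, "A-B": 0b01, "2-A": 0b10, "B-A": 0b11}

def encode_chained_fp(alu_op, output="ALU", alu_b_zero=False, mul_b_one=False,
                      ra_dp=False, rb_dp=False, neg_alu=False, neg_mul=False):
    """Build I10-I0 for a chained floating point operation.

    With alu_b_zero=True, "A+B" becomes A + 0 and "B-A" becomes 0 - A.
    With mul_b_one=True, the multiplier computes A * 1 instead of A * B.
    """
    if alu_b_zero and alu_op == "2-A":
        raise ValueError("1xx xx1x xx10 is invalid: ALU B operand cannot be both 0 and 2")
    if alu_b_zero and alu_op == "A-B":
        raise ValueError("A - 0 is not listed in Table 40")
    word = 1 << 10                          # I10 = 1: chained mode; I9 = 0: floating point
    word |= int(ra_dp) << 8                 # I8: RA precision
    word |= int(rb_dp) << 7                 # I7: RB precision
    word |= int(output != "ALU") << 6       # I6: 0 = ALU result, 1 = multiplier result
    word |= int(alu_b_zero) << 5            # I5: ALU B operand forced to 0
    word |= int(mul_b_one) << 4             # I4: multiplier B operand forced to 1
    word |= int(neg_alu) << 3               # I3: negate ALU result
    word |= int(neg_mul) << 2               # I2: negate multiplier result
    word |= ALU_CODES[alu_op]
    return word

def fmt(word):
    """Format an instruction word in the 3-4-4 grouping used by the tables."""
    bits = format(word, "011b")
    return f"{bits[:3]} {bits[3:7]} {bits[7:]}"

if __name__ == "__main__":
    # A * B in the multiplier, A + B in the ALU, ALU result on the output.
    print(fmt(encode_chained_fp("A+B")))
```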


Table 41. Chained Multiplier/ALU Integer Operations

CHAINED OPERATIONS        OUTPUT        INSTRUCTION
MULTIPLIER    ALU         SOURCE        INPUTS I10-I0
A * B         A + B       ALU           110 x000 0000
A * B         A + B       Multiplier    110 x100 0000
A * B         A - B       ALU           110 x000 0001
A * B         A - B       Multiplier    110 x100 0001
A * B         2 - A       ALU           110 x000 0010
A * B         2 - A       Multiplier    110 x100 0010
A * B         B - A       ALU           110 x000 0011
A * B         B - A       Multiplier    110 x100 0011
A * B         A + 0       ALU           110 x010 0000
A * B         A + 0       Multiplier    110 x110 0000
A * B         0 - A       ALU           110 x010 0011
A * B         0 - A       Multiplier    110 x110 0011
A * 1         A + B       ALU           110 x001 0000
A * 1         A + B       Multiplier    110 x101 0000
A * 1         A - B       ALU           110 x001 0001
A * 1         A - B       Multiplier    110 x101 0001
A * 1         2 - A       ALU           110 x001 0010
A * 1         2 - A       Multiplier    110 x101 0010
A * 1         B - A       ALU           110 x001 0011
A * 1         B - A       Multiplier    110 x101 0011
A * 1         A + 0       ALU           110 x011 0000
A * 1         A + 0       Multiplier    110 x111 0000
A * 1         0 - A       ALU           110 x011 0011
A * 1         0 - A       Multiplier    110 x111 xx11

NOTES
x = Don't Care
I7 selects format of A and B operands:
0 = SP 2's complement
1 = SP unsigned integer

Input Configuration
CONFIG1-CONFIG0 control the order in which double-precision operands are loaded,
as shown in Table 42.

Table 42. Double-Precision Input Data Configuration Modes

                       LOADING SEQUENCE
                       DATA LOADED INTO TEMP REGISTER     DATA LOADED INTO RA/RB
                       ON FIRST CLOCK AND RA/RB           REGISTERS ON SECOND CLOCK
                       REGISTERS ON SECOND CLOCK†
CONFIG1   CONFIG0      DA                DB                DA                DB
0         0            B operand (MSH)   B operand (LSH)   A operand (MSH)   A operand (LSH)
0         1‡           A operand (LSH)   B operand (LSH)   A operand (MSH)   B operand (MSH)
1         0            A operand (MSH)   B operand (MSH)   A operand (LSH)   B operand (LSH)
1         1            A operand (MSH)   A operand (LSH)   B operand (MSH)   B operand (LSH)

†On the first active clock edge (see CLKMODE), data in this column is loaded into the temporary register.
On the next rising edge, operands in the temporary register and the DA/DB buses are loaded into the RA
and RB registers.
‡Use CONFIG1-0 = 01 as normal single-precision input configuration.
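
As an illustration of Table 42, the sketch below splits two 64-bit operand patterns into 32-bit halves and lists what a host would drive on the DA and DB buses on each of the two clocks for a given CONFIG setting. The names `LOAD_SEQUENCE` and `bus_schedule` are hypothetical; only the ordering comes from the table.

```python
# For each (CONFIG1, CONFIG0): ((DA, DB) on first clock, (DA, DB) on second clock).
LOAD_SEQUENCE = {
    (0, 0): (("B_MSH", "B_LSH"), ("A_MSH", "A_LSH")),
    (0, 1): (("A_LSH", "B_LSH"), ("A_MSH", "B_MSH")),
    (1, 0): (("A_MSH", "B_MSH"), ("A_LSH", "B_LSH")),
    (1, 1): (("A_MSH", "A_LSH"), ("B_MSH", "B_LSH")),
}

def bus_schedule(config1, config0, a_bits, b_bits):
    """Return [(DA, DB) for clock 1, (DA, DB) for clock 2] as 32-bit integers.

    a_bits and b_bits are the 64-bit patterns of the A and B operands.
    """
    halves = {
        "A_MSH": (a_bits >> 32) & 0xFFFFFFFF, "A_LSH": a_bits & 0xFFFFFFFF,
        "B_MSH": (b_bits >> 32) & 0xFFFFFFFF, "B_LSH": b_bits & 0xFFFFFFFF,
    }
    return [(halves[da], halves[db]) for da, db in LOAD_SEQUENCE[(config1, config0)]]

if __name__ == "__main__":
    a = 0x1111111122222222
    b = 0x3333333344444444
    for clock, (da, db) in enumerate(bus_schedule(1, 0, a, b), start=1):
        print(f"clock {clock}: DA={da:08X} DB={db:08X}")
```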

Operand Source Select
Multiplier and ALU operands are selected by SELOP7-SELOP0 as shown in Tables 43
and 44.

Table 43. Multiplier Input Selection

A1 (MUX1) INPUT                                B1 (MUX2) INPUT
SELOP7   SELOP6   OPERAND SOURCE†              SELOP5   SELOP4   OPERAND SOURCE†
0        0        Reserved                     0        0        Reserved
0        1        C register                   0        1        C register
1        0        ALU feedback                 1        0        Multiplier feedback
1        1        RA input register            1        1        RB input register

†For division or square root operations, only RA and RB registers can be selected as sources.

Table 44. ALU Input Selection

A2 (MUX3) INPUT                                B2 (MUX4) INPUT
SELOP3   SELOP2   OPERAND SOURCE†              SELOP1   SELOP0   OPERAND SOURCE†
0        0        Reserved                     0        0        Reserved
0        1        C register                   0        1        C register
1        0        Multiplier feedback          1        0        ALU feedback
1        1        RA input register            1        1        RB input register

†For division or square root operations, only RA and RB registers can be selected as sources.
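
The four two-bit fields of SELOP7-SELOP0 in Tables 43 and 44 can be decoded with a small lookup. This is a sketch for checking microcode fields by hand; `decode_selop` and the dictionary keys are hypothetical names.

```python
# Source names indexed by the two-bit field value (00, 01, 10, 11).
A1_SOURCES = ["Reserved", "C register", "ALU feedback", "RA input register"]
B1_SOURCES = ["Reserved", "C register", "Multiplier feedback", "RB input register"]
A2_SOURCES = ["Reserved", "C register", "Multiplier feedback", "RA input register"]
B2_SOURCES = ["Reserved", "C register", "ALU feedback", "RB input register"]

def decode_selop(selop):
    """Map the 8-bit SELOP7-SELOP0 value to the four operand sources."""
    return {
        "multiplier_A1": A1_SOURCES[(selop >> 6) & 0b11],  # SELOP7-SELOP6
        "multiplier_B1": B1_SOURCES[(selop >> 4) & 0b11],  # SELOP5-SELOP4
        "alu_A2": A2_SOURCES[(selop >> 2) & 0b11],         # SELOP3-SELOP2
        "alu_B2": B2_SOURCES[selop & 0b11],                # SELOP1-SELOP0
    }

if __name__ == "__main__":
    # Multiplier takes RA and RB; ALU takes multiplier feedback and the C register.
    for name, source in decode_selop(0b11111001).items():
        print(f"{name}: {source}")
```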

Pipeline Control
Pipelining levels are turned on by PIPES2-PIPES0 as shown below.

Table 45. Pipeline Controls (PIPES2-PIPES0)

PIPES2   PIPES1   PIPES0   REGISTER OPERATION SELECTED
X        X        0        Enables input registers (RA, RB)
X        X        1        Makes input registers (RA, RB) transparent
X        0        X        Enables pipeline registers
X        1        X        Makes pipeline registers transparent
0        X        X        Enables output registers (PREG, SREG, Status)
1        X        X        Makes output registers (PREG, SREG, Status) transparent
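
Each PIPES bit controls one register level, and a 0 enables that level. A minimal sketch of that decoding follows, assuming PIPES0 controls the input registers, PIPES1 the pipeline registers, and PIPES2 the output registers; `decode_pipes` and its keys are hypothetical names.

```python
def decode_pipes(pipes):
    """Return which register levels are enabled for a 3-bit PIPES2-PIPES0 value.

    A bit value of 0 enables the level; 1 makes it transparent.
    """
    return {
        "input_registers_enabled": not (pipes & 0b001),     # PIPES0
        "pipeline_registers_enabled": not (pipes & 0b010),  # PIPES1
        "output_registers_enabled": not (pipes & 0b100),    # PIPES2
    }

if __name__ == "__main__":
    print(decode_pipes(0b000))  # fully pipelined
    print(decode_pipes(0b010))  # pipeline registers transparent
```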

Round Control
RND1-RND0 select the rounding mode as shown in Table 46.

Table 46. Rounding Modes

RND1   RND0   ROUNDING MODE SELECTED
0      0      Round towards nearest
0      1      Round towards zero (truncate)
1      0      Round towards infinity (round up)
1      1      Round towards negative infinity (round down)
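
The effect of the four rounding modes is easiest to see on a value that lies exactly halfway between two representable results. The sketch below uses Python's `decimal` module to round to an integer under each RND1-RND0 setting. It is an illustration on the host, not device behavior, and it assumes that "round towards nearest" resolves ties to even, as the IEEE default does; `MODES` and `round_to_int` are hypothetical names.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_DOWN, ROUND_CEILING, ROUND_FLOOR

# (RND1, RND0) -> decimal rounding mode with the same direction.
MODES = {
    (0, 0): ROUND_HALF_EVEN,  # towards nearest (ties to even assumed)
    (0, 1): ROUND_DOWN,       # towards zero (truncate)
    (1, 0): ROUND_CEILING,    # towards infinity (round up)
    (1, 1): ROUND_FLOOR,      # towards negative infinity (round down)
}

def round_to_int(value, rnd1, rnd0):
    """Round a decimal string to an integer using the selected mode."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=MODES[(rnd1, rnd0)]))

if __name__ == "__main__":
    for bits in MODES:
        print(bits, round_to_int("2.5", *bits), round_to_int("-2.5", *bits))
```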


Status Output Selection
SELST1-SELST0 choose the status output as shown below.

Table 47. Status Output Selection (Chained Mode)

SELST1-SELST0   STATUS SELECTED
00              Logical OR of ALU and multiplier exceptions (bit by bit)
01              Selects multiplier status
10              Selects ALU status
11              Normal operation (selection based on result source specified by I6 input)
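
The selection rule of Table 47 can be written as a four-way choice. The sketch below treats each status word as an integer bit mask; the mask layout and the name `selected_status` are assumptions made for illustration, since the table only defines which source is reported.

```python
def selected_status(selst, i6, alu_flags, mul_flags):
    """Return the status word reported for a SELST1-SELST0 setting.

    selst is the 2-bit SELST1-SELST0 value, i6 is instruction input I6
    (0 = ALU is the output source, 1 = multiplier), and the flag arguments
    are bit masks with one bit per exception signal.
    """
    if selst == 0b00:
        return alu_flags | mul_flags      # bit-by-bit OR of both sources
    if selst == 0b01:
        return mul_flags                  # multiplier status
    if selst == 0b10:
        return alu_flags                  # ALU status
    return mul_flags if i6 else alu_flags  # 11: follow the output source

if __name__ == "__main__":
    alu, mul = 0b0001, 0b0100
    for selst in range(4):
        print(format(selst, "02b"), format(selected_status(selst, 0, alu, mul), "04b"))
```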

Test Pin Control
Testing is controlled by TP1-TP0 as shown below.

Table 48. Test Pin Control Inputs

TP1   TP0   OPERATION
0     0     All outputs and I/Os are forced low
0     1     All outputs and I/Os are forced high
1     0     All outputs are placed in a high impedance state
1     1     Normal operation

Miscellaneous Control Inputs
The remaining control inputs are shown in Table 49.

Table 49. Miscellaneous Control Inputs

SIGNAL     HIGH                                             LOW
BYTEP      Selects byte parity generation and test          Selects single bit parity generation and test
CLKMODE    Enables temporary input register load on         Enables temporary input register load on
           falling clock edge                               rising clock edge
ENRC       No effect                                        Enables C register load when CLKC goes high
ENRA       If register is not in flowthrough, enables       If register is not in flowthrough, holds
           clocking of RA register                          contents of RA register
ENRB       If register is not in flowthrough, enables       If register is not in flowthrough, holds
           clocking of RB register                          contents of RB register
FAST       Places device in FAST mode                       Places device in IEEE mode
FLOW_C     Causes output value to bypass C register and     No effect
           appear on C register output bus
HALT       No effect                                        Stalls device operation but does not affect
                                                            registers, internal states, or status
OEC        Disables compare pins                            Enables compare pins
OES        Disables status outputs                          Enables status outputs
OEY        Disables Y bus                                   Enables Y bus
RESET      No effect                                        Clears internal states, status, internal
                                                            pipeline registers, and exception disable
                                                            register. Does not affect other data registers.
SELMS/LS   Selects MSH of 64-bit result for output on       Selects LSH of 64-bit result for output on
           the Y bus (no effect on single-precision         the Y bus (no effect on single-precision
           operands)                                        operands)
SRCC       Selects multiplier result for input to C         Selects ALU result for input to C register
           register

Glossary
Biased exponent - The true exponent of a floating point number plus a constant called
the exponent field's excess. In IEEE data format, the excess or bias is 127 for
single-precision numbers and 1023 for double-precision numbers.
Denormalized number (denorm) - A number with an exponent equal to zero and a
nonzero fraction field, with the implicit leading (leftmost) bit of the fraction field being 0.
NaN (not a number) - Data that has no mathematical value. The 'ACT8847 produces
a NaN whenever an invalid operation such as 0 * ∞ is executed. The output format
for a NaN is an exponent field of all ones, a fraction field of all ones, and a zero sign
bit. Any number with an exponent of all ones and a nonzero fraction is treated as a
NaN on the input.
Normalized number - A number in which the exponent field is between 1 and 254
(single precision) or 1 and 2046 (double precision). The implicit leading bit is 1.
Wrapped number - A number created by normalizing a denormalized number's fraction
field and subtracting from the exponent the number of shift positions required to do
so. The exponent is encoded as a two's complement negative number.
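
The glossary definitions translate directly into tests on the exponent and fraction fields. The sketch below classifies a 32-bit single-precision pattern and recovers the true exponent from the biased one. It is a host-side illustration; `classify_sp`, `true_exponent`, `float_to_bits`, and `NAN_OUTPUT_SP` are hypothetical names, and wrapped numbers are not modeled.

```python
import struct

SP_BIAS = 127                 # exponent bias for single precision
NAN_OUTPUT_SP = 0x7FFFFFFF    # sign 0, exponent all ones, fraction all ones

def float_to_bits(x):
    """Return the IEEE single-precision bit pattern of a Python float."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def classify_sp(bits):
    """Classify a single-precision bit pattern using the glossary definitions."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 0xFF:
        return "infinity" if fraction == 0 else "NaN"
    return "normalized"       # exponent field between 1 and 254

def true_exponent(bits):
    """Subtract the bias from the exponent field of a normalized number."""
    return ((bits >> 23) & 0xFF) - SP_BIAS

if __name__ == "__main__":
    for pattern in (float_to_bits(8.0), 0x00000001, 0x7F800000, NAN_OUTPUT_SP):
        print(f"{pattern:08X}: {classify_sp(pattern)}")
```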

SN74ACT8847 Application Notes
Sum of Products and Product of Sums
Performing fully pipelined double-precision operations requires a detailed understanding
of timing constraints imposed by the multiplier. In particular, sum of products and
product of sums operations can be executed very quickly, mostly in chained mode,
assuming that timing relationships between the ALU and the multiplier are coded
properly.
Pseudocode tables for these sequences are provided (Tables 50 and 51), showing
how data and instructions are input in relation to the system clock. The overall patterns
of calculations for an extended sum of products and an extended product of sums
are presented. These examples assume FPU operation in CLKMODE 0, with the CONFIG
setting 10 to load operands by MSH and LSH, all registers enabled
(PIPES2-PIPES0 = 000), and the C register clock tied to the system clock.
In the sum of products timing table, the two initial products are generated in
independent multiplier mode. Several timing relationships should be noted in the table.
The first chained instruction loads and begins to execute following the sixth rising
edge of the clock, after the first product P1 has already been held in the P register
for one clock. For this reason, P1 is loaded into the C register so that P1 will be stable
for two clocks.
On the seventh clock, the ALU pipeline register loads with an unwanted sum, P1 + P1.
However, because the ALU timing is constrained by the multiplier, the S register will
not load until the rising edge of CLK9, when the ALU pipe contains the desired sum,
P1 + P2. The remaining sequence of chained operations then executes in the desired
manner.

Table 50. Pseudocode for Fully Pipelined Double-Precision Sum of Products†
(CLKMODE = 0, CONFIG1-CONFIG0 = 10, PIPES2-PIPES0 = 000)
ClK

I
I
I
I

TEMP

INS

INS

RA

RB

MUl

P

C

ALU

REG

BUS

REG

REG

REG

PIPE

REG

REG

PIPE

S

2

A1 l5H

A1 *B1

A1

B1

3

A2 M5H B2 M5H A2.B2M5H A2*B2 A1 *B1

A1

B1

A1 *B1

4

A2 l5H

A2

B2

A1 *B1

A2

B2

A2*B2

P1

A3

B3

A2*B2

P1

P1

A3

B3

A3*B3

P2

P1

P1 +P1

A4

B4

A3*B3

P2

P1

P1 +P2

A4

B4

A4*B4

P3

P2

51 +P2

51

A5

B5

A4*B4

P3

P2

51 +P3

51

A5

B5

A5*B5

P4

P2

XXXXX

52

P4

P2

B1 l5H A1.B1l5H

B2 l5H A2.B2L5H

A1 *B1

A2*B2 A2*B2
PR+CR

I

5

A3 M5H B3 M5H A3.B3M5H

I

6

A3 L5H

I

7

A4M5H B4 M5H A4.B4M5H

I

8

A4 LSH

I

9

A5 M5H B5 M5H A5.B5M5H

A5 L5H

B3 L5H A3.B3L5H

B4 L5H A4.B4L5H

B5 L5H A5.B5L5H

A6 MSH B6 M5H A6.B6M5H

tpR = Product Register
SR = Sum Register
CR = Constant (C) Register

co
"'"

SN74ACT8847

A3*B3

A2*B2

PR+CR PR+CR.
A3*B3 A3*B3
PR+5R PR+5R.
A4*B4 A3*B3
PR+5R PR+5R.
A4*B4 A4*B4
PR+5R PR+5R.
A5*B5 A4*B4
PR+5R PR+5R.
A5*B5 A5*B5
PR+5R PR+5R,
A6*B6 A5*B5

V

REG BUS

A1 M5H B1 M5H A1.B1M5H A1 *B1

I11
I 12

--

DB
BUS

1

I10

';'I

DA
BUS

52

L V881.::n:fv L NS

...';-I

Table 51. Pseudocode for Fully Pipelined Double-Precision Product of Sums†
(CLKMODE = 0, CONFIG1-CONFIG0 = 10, PIPES2-PIPES0 = 000)

U1

o

CLK

I
I
I
I
I

DA

DB

TEMP

INS

INS

RA

RB

MUL

P

C

ALU

S

V

BUS

BUS

REG

BUS

REG

REG

REG

PIPE

REG

REG

PIPE

REG

BUS

1

A1M5H

B1M5H A1,B1M5H A1 +B1

2

A1L5H

B1L5H A 1 ,B1 L5H

A1 +B1

A1

B1

3

A2M5H

B2M5H A2,B2M5H A2+B2 A1 +B1

A1

B1

4

A2L5H

B2L5H A2,B2L5H

A2

B2

5

A3M5H

B3M5H A3,B3M5H

I

6

A3L5H

B3L5H A3,B3L5H

I

7

XXX

I

8

A4M5H

B4M5H A4,B4M5H

I 9
I10

A4L5H

B4L5H A4,B4L5H

XXX

XXX

XXX

XXX

XXX

I11

A5M5H

B5M5H A5,B5M5H

I12

A5L5H

B5L5H A5,B5L5H

A1 +B1

A2+B2 A2+B2
CR*5R

A2+B2

A2

B2

CR*5R CR*5R
A3+B3 A3+B3

A3

B3

A3

B3

A3+B3

NOP
PR*5R
A4+B4

CR*5R
A3+B3
NOP

PR*5R PR*5R
A4+B4 A4+B4
NOP
PR*5R
A5+B5

PR*5R
A4+B4
NOP

PR*5R PR*5R
A5+B5 A5+B5

ENRA=O ENRB=O

NOP instruction is 011 0000 0000.
Product Register
Sum Register
Constant (C) Register

A1 +B1

51

A2+B2

51

51

A2+B2

52

51 *52

51

A3+B3

52

51 *52

51

ENRC=O
51

XXX

A3

B3

A4

B4

XXX

P1

51

XXX

53

A4

B4

P1 *53

P1

51

A4+B4

53

51

A4+B4 XXX

ENRA=O ENRB=O
A4

B4

A5

B5

----- -------

NOTE:
t PR =
SR =
CR =

A1 +B1

P1 *53 XXX

XXX

P2

------- - -

51
---------

X

54

Matrix Operations
The 'ACT8847 floating point unit can also be used to perform matrix manipulations
involved in graphics processing or digital signal processing. The FPU multiplies and
adds data elements, executing sequences of microprogrammed calculations to form
new matrices.

Representation of Variables
In state representations of control systems, an n-th order linear differential equation
with constant coefficients can be represented as a sequence of n first-order linear
differential equations expressed in terms of state variables:
\[
\frac{dx_1}{dt} = x_2,\ \ldots,\ \frac{dx_{n-1}}{dt} = x_n
\]

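For example, in vector-matrix form the equations of an nth-order system can be
represented as follows:

\[
\frac{d}{dt}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
\vdots &        &        & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
+
\begin{bmatrix}
b_{11} & \cdots & b_{1n} \\
\vdots &        & \vdots \\
b_{n1} & \cdots & b_{nn}
\end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}
\]

or, \(\dot{x} = ax + bu\).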

Expanding the matrix equation for one state variable, dx1/dt, results in the following
expression:

\[
\dot{x}_1 = (a_{11} x_1 + \cdots + a_{1n} x_n) + (b_{11} u_1 + \cdots + b_{1n} u_n)
\]

where \(\dot{x}_1 = dx_1/dt\).
Sequences of multiplications and additions are required when such state space
transformations are performed, and the 'ACT8847 has been designed to support such
sum-of-products operations. An n x n matrix A multiplied by an n x n matrix X yields
an n x n matrix C whose elements cij are given by this equation:

\[
c_{ij} = \sum_{k=1}^{n} a_{ik}\, x_{kj} \qquad \text{for } i = 1, \ldots, n;\ j = 1, \ldots, n \qquad (1)
\]

For the cij elements to be calculated by the 'ACT8847, the corresponding elements
aik and xkj must be stored outside the 'ACT8847 and fed to the 'ACT8847 in the
proper order required to effect a matrix multiplication such as the state space system
representation just discussed.
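
Equation (1) is the familiar triple loop: each output element is a running sum of n products, which is the multiply-then-accumulate pattern the chained mode is meant to serve. The following is a plain Python sketch of that arithmetic only; it says nothing about device timing, and the name `matmul` is hypothetical.

```python
def matmul(a, x):
    """Multiply two n x n matrices given as lists of rows, per equation (1)."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0.0
            for k in range(n):
                acc += a[i][k] * x[k][j]   # one multiply and one add per term
            c[i][j] = acc
    return c

if __name__ == "__main__":
    print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```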

Sample Matrix Transformation
The matrix manipulations commonly performed in graphics systems can be regarded
as geometrical transformations of graphic objects. A matrix operation on another matrix
representing a graphic object may result in scaling, rotating, transforming, distorting,
or generating a perspective view of the image. By performing a matrix operation on
the position vectors which define the vertices of an image surface, the shape and
position of the surface can be manipulated.
The generalized 4 x 4 matrix for transforming a three-dimensional object with
homogeneous coordinates is shown below:

\[
T =
\begin{bmatrix}
a & b & c & d \\
e & f & g & h \\
i & j & k & l \\
m & n & o & p
\end{bmatrix}
\]

The matrix T can be partitioned into four component matrices, each of which produces
a specific effect on the resultant image:

\[
T =
\left[
\begin{array}{c|c}
3 \times 3 & 3 \times 1 \\
\hline
1 \times 3 & 1 \times 1
\end{array}
\right]
\]

The 3 x 3 matrix produces linear transformation in the form of scaling, shearing and
rotation. The 1 x 3 row matrix produces translation, while the 3 x 1 column matrix
produces perspective transformation with multiple vanishing points. The final single
element 1 x 1 produces overall scaling. Overall operation of the transformation matrix
T on the position vectors of a graphic object produces a combination of shearing,
rotation, reflection, translation, perspective, and overall scaling.
The rotation of an object about an arbitrary axis in a three-dimensional space can be
carried out by first translating the object such that the desired axis of rotation passes
through the origin of the coordinate system, then rotating the object about the axis
through the origin, and finally translating the rotated object such that the axis of rotation
resumes its initial position. If the axis of rotation passes through the point P = [a b c 1],
then the transformation matrix is representable in this form:

\[
[x\ y\ z\ h] = [x\ y\ z\ 1]
\underbrace{\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-a & -b & -c & 1
\end{bmatrix}}_{\text{translation to origin}}
\underbrace{\ R\ }_{\text{rotation about origin}}
\underbrace{\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
a & b & c & 1
\end{bmatrix}}_{\text{translation back to initial position}}
\qquad (2)
\]

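where R may be expressed as:

\[
R =
\begin{bmatrix}
n_1^2 + (1 - n_1^2)\cos\phi & n_1 n_2 (1 - \cos\phi) + n_3 \sin\phi & n_1 n_3 (1 - \cos\phi) - n_2 \sin\phi & 0 \\
n_1 n_2 (1 - \cos\phi) - n_3 \sin\phi & n_2^2 + (1 - n_2^2)\cos\phi & n_2 n_3 (1 - \cos\phi) + n_1 \sin\phi & 0 \\
n_1 n_3 (1 - \cos\phi) + n_2 \sin\phi & n_2 n_3 (1 - \cos\phi) - n_1 \sin\phi & n_3^2 + (1 - n_3^2)\cos\phi & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\]

and

\(n_1 = q_1 / (q_1^2 + q_2^2 + q_3^2)^{1/2}\) = direction cosine for x-axis of rotation
\(n_2 = q_2 / (q_1^2 + q_2^2 + q_3^2)^{1/2}\) = direction cosine for y-axis of rotation
\(n_3 = q_3 / (q_1^2 + q_2^2 + q_3^2)^{1/2}\) = direction cosine for z-axis of rotation
\(n = (n_1\ n_2\ n_3)\) = unit vector for Q
\(Q\) = vector defining axis of rotation = \([q_1\ q_2\ q_3]\)
\(\phi\) = the rotation angle about Q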
A general rotation using equation (2) is effected by determining the [x y z] coordinates
of a point A to be rotated on the object, the direction cosines of the axis of rotation
[n1, n2, n3], and the angle φ of rotation about the axis, all of which are needed to
define matrix [R]. Suppose, for example, that a tetrahedron ABCD, represented by
the coordinate matrix below, is to be rotated about an axis of rotation RX which passes
through a point P = [5 -6 3 1] and whose direction cosines are given by unit vector
[n1 = 0.866, n2 = 0.5, n3 = 0.707]. The angle of rotation φ is 90 degrees (see
Figure 72). The coordinate matrix is

\[
\begin{bmatrix}
2 & -3 & 3 & 1 \\
1 & -2 & 2 & 1 \\
2 & -1 & 2 & 1 \\
2 & -2 & 2 & 1
\end{bmatrix}
\begin{matrix} A \\ B \\ C \\ D \end{matrix}
\]

and the rotation matrix [R] becomes

\[
R =
\begin{bmatrix}
0.750 & 1.140 & 0.112 & 0 \\
-0.274 & 0.250 & 1.220 & 0 \\
1.112 & -0.513 & 0.500 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\]
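
The numeric rotation matrix above can be reproduced from the general expression for R. The sketch below evaluates that expression in Python with the direction cosines and angle exactly as printed (they are used without renormalizing) and compares the result with the printed entries to three decimal places. The name `rotation_matrix` is hypothetical.

```python
import math

def rotation_matrix(n1, n2, n3, phi_degrees):
    """Evaluate the 4 x 4 rotation matrix R for direction cosines n1, n2, n3."""
    c = math.cos(math.radians(phi_degrees))
    s = math.sin(math.radians(phi_degrees))
    v = 1.0 - c
    return [
        [n1 * n1 + (1 - n1 * n1) * c, n1 * n2 * v + n3 * s, n1 * n3 * v - n2 * s, 0.0],
        [n1 * n2 * v - n3 * s, n2 * n2 + (1 - n2 * n2) * c, n2 * n3 * v + n1 * s, 0.0],
        [n1 * n3 * v + n2 * s, n2 * n3 * v - n1 * s, n3 * n3 + (1 - n3 * n3) * c, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

# Entries of [R] as printed in the text.
PRINTED_R = [
    [0.750, 1.140, 0.112, 0.0],
    [-0.274, 0.250, 1.220, 0.0],
    [1.112, -0.513, 0.500, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

if __name__ == "__main__":
    for row in rotation_matrix(0.866, 0.5, 0.707, 90.0):
        print(["%.3f" % value for value in row])
```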

[Figure: tetrahedron ABCD and its rotated image, with the axis of rotation through
P (5, -6, 3). Arrow (1) depicts the first translation, arrow (2) depicts the 90° rotation,
and arrow (3) depicts the back translation.]

Figure 72. Sequence of Matrix Operations

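The point transformation equation (2) can be expanded to include all the vertices of
the tetrahedron as follows:

\[
\begin{bmatrix}
x_a & y_a & z_a & h_1 \\
x_b & y_b & z_b & h_2 \\
x_c & y_c & z_c & h_3 \\
x_d & y_d & z_d & h_4
\end{bmatrix}
=
\begin{bmatrix}
2 & -3 & 3 & 1 \\
1 & -2 & 2 & 1 \\
2 & -1 & 2 & 1 \\
2 & -2 & 2 & 1
\end{bmatrix}
\underbrace{\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-5 & 6 & -3 & 1
\end{bmatrix}}_{\text{translation to origin}}
\underbrace{\begin{bmatrix}
0.750 & 1.140 & 0.112 & 0 \\
-0.274 & 0.250 & 1.220 & 0 \\
1.112 & -0.513 & 0.500 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}}_{\text{rotation about origin}}
\underbrace{\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
5 & -6 & 3 & 1
\end{bmatrix}}_{\text{translation back to initial position}}
\qquad (3)
\]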

The 'ACT8847 floating point unit can perform matrix manipulation involving
multiplications and additions such as those represented by equation (1). The matrix
equation (3) can be solved by using the 'ACT8847 to compute, as a first step, the
product matrix of the coordinate matrix and the first translation matrix of the
right-hand side of equation (3) in that order. The second step involves postmultiplying the
rotation matrix by the product matrix. The third step implements the back-translation
by premultiplying the matrix result from the second step by the second translation
matrix of equation (3). Details of the procedure to produce a three-dimensional rotation
about an arbitrary axis are explained in the following steps:

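Step 1
Translate the tetrahedron so that the axis of rotation passes through the origin. This
process can be accomplished by multiplying the coordinate matrix by the translation
matrix as follows:

\[
\begin{bmatrix}
2 & -3 & 3 & 1 \\
1 & -2 & 2 & 1 \\
2 & -1 & 2 & 1 \\
2 & -2 & 2 & 1
\end{bmatrix}
\underbrace{\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-5 & 6 & -3 & 1
\end{bmatrix}}_{\text{translation to origin}}
=
\begin{bmatrix}
(2-5) & (-3+6) & (3-3) & 1 \\
(1-5) & (-2+6) & (2-3) & 1 \\
(2-5) & (-1+6) & (2-3) & 1 \\
(2-5) & (-2+6) & (2-3) & 1
\end{bmatrix}
=
\underbrace{\begin{bmatrix}
-3 & +3 & 0 & 1 \\
-4 & +4 & -1 & 1 \\
-3 & +5 & -1 & 1 \\
-3 & +4 & -1 & 1
\end{bmatrix}}_{\text{vertices of translated tetrahedron}}
\begin{matrix} A_T \\ B_T \\ C_T \\ D_T \end{matrix}
\]

The 'ACT8847 could compute the translated coordinates AT, BT, CT, DT as indicated
above. However, an alternative method resulting in a more compact solution is
presented below.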
Step 2
Rotate the tetrahedron about the axis of rotation which passes through the origin after
the translation of Step 1. To implement the rotation of the tetrahedron, postmultiply
the rotation matrix [R] by the translated coordinate matrix from Step 1. The resultant
matrix represents the rotated coordinates of the tetrahedron about the origin as follows:

\[
\begin{bmatrix}
-3 & 3 & 0 & 1 \\
-4 & 4 & -1 & 1 \\
-3 & 5 & -1 & 1 \\
-3 & 4 & -1 & 1
\end{bmatrix}
\underbrace{\begin{bmatrix}
0.750 & 1.140 & 0.112 & 0 \\
-0.274 & 0.250 & 1.220 & 0 \\
1.112 & -0.513 & 0.500 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}}_{\text{rotation about origin}}
=
\underbrace{\begin{bmatrix}
-3.072 & -2.670 & 3.324 & 1 \\
-5.208 & -3.047 & 3.932 & 1 \\
-4.732 & -1.657 & 5.264 & 1 \\
-4.458 & -1.907 & 4.044 & 1
\end{bmatrix}}_{\text{rotated coordinates}}
\]

Step 3
Translate the rotated tetrahedron back to the original coordinate space. This is done
by premultiplying the resultant matrix of Step 2 by the translation matrix. The following
calculation produces the final coordinate matrix of the transformed object:

\[
\begin{bmatrix}
-3.072 & -2.670 & 3.324 & 1 \\
-5.208 & -3.047 & 3.932 & 1 \\
-4.732 & -1.657 & 5.264 & 1 \\
-4.458 & -1.907 & 4.044 & 1
\end{bmatrix}
\underbrace{\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
5 & -6 & 3 & 1
\end{bmatrix}}_{\text{translate back}}
=
\underbrace{\begin{bmatrix}
1.928 & -8.670 & 6.324 & 1 \\
-0.208 & -9.047 & 6.932 & 1 \\
0.268 & -7.657 & 8.264 & 1 \\
0.542 & -7.907 & 7.044 & 1
\end{bmatrix}}_{\text{final rotated coordinates}}
\]

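A more compact solution to these transformation matrices is a product matrix that
combines the two translation matrices and the rotation matrix in the order shown in
equation (3). Equation (3) will then take the following form:

\[
\begin{bmatrix}
x_a & y_a & z_a & h_1 \\
x_b & y_b & z_b & h_2 \\
x_c & y_c & z_c & h_3 \\
x_d & y_d & z_d & h_4
\end{bmatrix}
=
\begin{bmatrix}
2 & -3 & 3 & 1 \\
1 & -2 & 2 & 1 \\
2 & -1 & 2 & 1 \\
2 & -2 & 2 & 1
\end{bmatrix}
\underbrace{\begin{bmatrix}
0.750 & 1.140 & 0.112 & 0 \\
-0.274 & 0.250 & 1.220 & 0 \\
1.112 & -0.513 & 0.500 & 0 \\
-3.730 & -8.661 & 8.260 & 1
\end{bmatrix}}_{\text{transformation matrix}}
\]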
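
The combined transformation matrix can be checked by multiplying the two translation matrices and the rotation matrix in the order of equation (3), then applying the product to the tetrahedron's coordinate matrix. The sketch below does this in Python with the printed three-decimal values; the names `matmul`, `TO_ORIGIN`, `ROTATE`, `BACK`, `COMBINED`, and `FINAL` are hypothetical.

```python
def matmul(a, b):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Vertices A, B, C, D of the tetrahedron in homogeneous coordinates.
COORDS = [[2, -3, 3, 1], [1, -2, 2, 1], [2, -1, 2, 1], [2, -2, 2, 1]]

TO_ORIGIN = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [-5, 6, -3, 1]]
ROTATE = [[0.750, 1.140, 0.112, 0], [-0.274, 0.250, 1.220, 0],
          [1.112, -0.513, 0.500, 0], [0, 0, 0, 1]]
BACK = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [5, -6, 3, 1]]

# Single matrix equivalent to translate, rotate, translate back.
COMBINED = matmul(matmul(TO_ORIGIN, ROTATE), BACK)

# Transformed vertices A', B', C', D'.
FINAL = matmul(COORDS, COMBINED)

if __name__ == "__main__":
    print("last row of combined matrix:", ["%.3f" % v for v in COMBINED[3]])
    for name, row in zip("ABCD", FINAL):
        print(name + "'", ["%.3f" % v for v in row])
```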


The newly transformed coordinates resulting from the postmultiplication of the
transformation matrix by the coordinate matrix of the tetrahedron can be computed
using equation (1) which was cited previously:

\[
c_{ij} = \sum_{k=1}^{n} a_{ik}\, x_{kj} \qquad \text{for } i = 1, \ldots, n;\ j = 1, \ldots, n \qquad (1)
\]

For example, the coordinates may be computed as follows:

xa = c11 = a11 * x11 + a12 * x21 + a13 * x31 + a14 * x41
         = 2 * 0.750 + (-3) * (-0.274) + 3 * 1.112 + 1 * (-3.730)
         = 1.5 + 0.822 + 3.336 - 3.730
         = 1.928

ya = c12 = a11 * x12 + a12 * x22 + a13 * x32 + a14 * x42
         = 2 * 1.140 + (-3) * 0.250 + 3 * (-0.513) + 1 * (-8.661)
         = 2.28 - 0.75 - 1.539 - 8.661
         = -8.670

za = c13 = a11 * x13 + a12 * x23 + a13 * x33 + a14 * x43
         = 2 * 0.112 + (-3) * 1.220 + 3 * 0.500 + 1 * 8.260
         = 0.224 - 3.66 + 1.5 + 8.260
         = 6.324

h1 = c14 = a11 * x14 + a12 * x24 + a13 * x34 + a14 * x44
         = 2 * 0 + (-3) * 0 + 3 * 0 + 1 * 1
         = 0 + 0 + 0 + 1
         = 1

A' = [1.928 -8.670 6.324 1]
The other transformed vertices are computed in a similar manner:
B' = [-0.208 -9.047 6.932 1]
C' = [0.268 -7.657 8.264 1]
D' = [0.542 -7.907 7.044 1]
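The three-step transformation and its compact single-matrix form can be checked numerically. The following is a minimal sketch in plain Python, not 'ACT8847 microcode; the helper names `matmul` and `translation` are illustrative assumptions, while the matrix entries are the ones printed above.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows (equation (1))."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def translation(tx, ty, tz):
    """Homogeneous translation matrix for row-vector coordinates."""
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [tx, ty, tz, 1.0]]


# Rotation about the origin, as printed in Step 2.
ROTATION = [[0.750, 1.140, 0.112, 0.0],
            [-0.274, 0.250, 1.220, 0.0],
            [1.112, -0.513, 0.500, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

# Original vertices A, B, C, D of the tetrahedron (one row per vertex).
VERTICES = [[2.0, -3.0, 3.0, 1.0],
            [1.0, -2.0, 2.0, 1.0],
            [2.0, -1.0, 2.0, 1.0],
            [2.0, -2.0, 2.0, 1.0]]

# Translate P (5, -6, 3) to the origin, rotate, then translate back.
transform = matmul(matmul(translation(-5.0, 6.0, -3.0), ROTATION),
                   translation(5.0, -6.0, 3.0))
transformed = matmul(VERTICES, transform)

for name, row in zip("ABCD", transformed):
    print(name + "'", [round(value, 3) for value in row])
```

The last row of `transform` reproduces the bottom row of the transformation matrix, so one matrix multiplication per vertex replaces the three separate steps.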

Microinstructions for Sample Matrix Manipulation
The 'ACT8847 FPU can compute the coordinates for graphic objects over a broad
dynamic range. Also, the homogeneous scalar factors h1, h2, h3, and h4 may be made
unity due to the availability of large dynamic range. In the example presented below,
some of the calculations pertaining to vertex A' are shown, but the same approach
can be applied to any number of points and any vector space.

The calculations below show the sequence of operations for generating two
coordinates, xa and ya, of the vertex A' after rotation. The same sequence could be
continued to generate the remaining two coordinates for A' (za and h1). The other
vertices of the tetrahedron, B', C', and D', can be calculated in a similar way.
Table 52 presents a pseudocode description of the operations, clock cycles, and register
contents for a single-precision matrix multiplication using the sum-of-products sequence
presented in an earlier section. Registers used include the RA and RB input registers
and the product (P) and sum (S) registers.
Table 52. Single-Precision Matrix Multiplication (PIPES2-PIPES0 = 010)

CLOCK   MULTIPLIER/ALU
CYCLE   OPERATIONS          PSEUDOCODE

  1     Load a11, x11       a11 → RA, x11 → RB
        SP Multiply         p1 = a11 * x11

  2     Load a12, x21       a12 → RA, x21 → RB
        SP Multiply         p2 = a12 * x21
        Pass P to S         p1 → P(p1)

  3     Load a13, x31       a13 → RA, x31 → RB
        SP Multiply         p3 = a13 * x31, p2 → P(p2)
        Add P to S          P(p1) + 0 → S(p1)

  4     Load a14, x41       a14 → RA, x41 → RB
        SP Multiply         p4 = a14 * x41, p3 → P(p3)
        Add P to S          P(p2) + S(p1) → S(p1 + p2)

  5     Load a11, x12       a11 → RA, x12 → RB
        SP Multiply         p5 = a11 * x12, p4 → P(p4)
        Add P to S          P(p3) + S(p1 + p2) → S(p1 + p2 + p3)

  6     Load a12, x22       a12 → RA, x22 → RB
        SP Multiply         p6 = a12 * x22, p5 → P(p5)
        Pass P to S         P(p4) + S(p1 + p2 + p3) → S(p1 + p2 + p3 + p4)
        Output S

  7     Load a13, x32       a13 → RA, x32 → RB
        SP Multiply         p7 = a13 * x32, p6 → P(p6)
        Add P to S          P(p5) + 0 → S(p5)

  8     Load a14, x42       a14 → RA, x42 → RB
        SP Multiply         p8 = a14 * x42, p7 → P(p7)
        Add P to S          P(p6) + S(p5) → S(p5 + p6)

  9     Next operands       A → RA, B → RB
        Next instruction    pi = A * B, p8 → P(p8)
        Add P to S          P(p7) + S(p5 + p6) → S(p5 + p6 + p7)

 10     Next operands       C → RA, D → RB
        Next instruction    pj = C * D, pi → P(pi)
        Output S            P(p8) + S(p5 + p6 + p7) → S(p5 + p6 + p7 + p8)
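The register flow in Table 52 can be modeled in a few lines of Python. This is only an illustrative sketch of the timing implied by the table (a product reaches the P register one cycle after its operands are loaded and is accumulated into S one cycle later); the function name `sum_of_products` and its bookkeeping variables are assumptions made for the example, not part of the 'ACT8847 instruction set.

```python
def sum_of_products(a_row, x_matrix):
    """Return [(cycle, value)] for each dot product of a_row with a column
    of x_matrix, following the P/S register timing shown in Table 52."""
    n = len(a_row)
    # Operand pairs in load order: column 1 first, then column 2, and so on.
    operands = [(a_row[k], x_matrix[k][j])
                for j in range(len(x_matrix[0])) for k in range(n)]
    results = []
    product = None    # product formed by the multiplier in the previous cycle
    p_reg = None      # contents of the P register
    s_reg = 0.0       # contents of the S register
    accumulated = 0   # number of products already folded into S
    cycle = 0
    while accumulated < len(operands):
        cycle += 1
        # ALU: uses the P register as it stood at the start of this cycle.
        if p_reg is not None:
            # The first product of each coordinate is passed; the rest are added.
            s_reg = p_reg if accumulated % n == 0 else s_reg + p_reg
            accumulated += 1
            if accumulated % n == 0:
                results.append((cycle, s_reg))
        # P register: latches the product formed in the previous cycle.
        p_reg = product
        # Multiplier: operates on the operands loaded in this cycle, if any.
        product = None
        if cycle <= len(operands):
            a, x = operands[cycle - 1]
            product = a * x
    return results


vertex_a = [2.0, -3.0, 3.0, 1.0]
transformation = [[0.750, 1.140, 0.112, 0.0],
                  [-0.274, 0.250, 1.220, 0.0],
                  [1.112, -0.513, 0.500, 0.0],
                  [-3.730, -8.661, 8.260, 1.0]]
timeline = sum_of_products(vertex_a, transformation)
for cycle, value in timeline:
    print(cycle, round(value, 3))
```

Under this model the four coordinates of A' appear in S at cycles 6, 10, 14, and 18.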

A microcode sequence to generate this matrix multiplication is shown in Table 53.
Table 53. Microinstructions for Sample Matrix Multiplication

I I
10-0

C CC
L 00 P P
K NN I I
M FF PP
0 II EE
DGGSS
E 1-02-0

SS
EE
LL
00
PP
7-0

RR
NN
DD
1-0

S
E
L
M
S S
BEE R
S
Y L L EH
FEE S /
ANNR
OOOTSSSATT
SRRCLEEEETTELPP
TAB C S Y C S P 1 -0 T T 1-0

000 0100 0000
10001100000
100 0000 0000
100 0000 0000
100 0000 0000

o 01
o 01
o 01
o 01
o 01

0101111 xxxx
01 0 1111 xxxx
01011111010
01011111010
01011111010

00
00
00
00
00

o1
o1
o1
o1
o1

x
x
x
x
x

x
x
x
x
x

x
x
x
x
x

x
x
x
x
x

x
x
x
x
x

x
x
x
x
x

xx
xx
xx
xx
xx

11
11
11
11
11

100 0110 0000
100 0000 0000
100 0000 0000
100 0000 0000
10001100000

o 01
o 01
o 01
o 01
o 01

01 0 1111 xxxx
01011111010
01011111010
01011111010
01 0 1111 xxxx

00
00
00
00
00

o1
o1
o1
o1
o1

x
x
x
x
x

x
x
x
x
x

x
x
x
x
x

x
x
x
x
x

x
x
x
x
x

x
x
x
x
x

xx
xx
xx
xx
xx

11
11
11
11
11

Six cycles are required to complete calculation of xa, the first coordinate, and after
four more cycles the second coordinate ya is output. Each subsequent coordinate can
be calculated in four cycles, so the 4-tuple for vertex A' requires a total of 18 cycles
to complete.
Calculations for vertices B', C', and D' can be executed in 48 cycles, 16 cycles for
each vertex. Processing time improves when the transformation matrix is reduced,
i.e., when the last column has the form shown below:

0
0
0
1

The h-scalars h1, h2, h3, and h4 are then equal to 1, and the number of clock cycles to generate
each 4-tuple can be decreased from 16 to 13 cycles. The total number of clock cycles
to calculate all four vertices is reduced from 66 to 54 clocks. Figure 73 summarizes
the overall matrix transformation.

[Figure: X, Y, and Z axes with the point P (5, -6, 3), the 90° rotation, and the vertices of the original and transformed tetrahedron (A', B', C', D').]

Figure 73. Resultant Matrix Transformation
This microprogram can also be written to calculate sums of products with all pipeline
registers enabled so that the FPU can operate in its fastest mode. Because of timing
relationships, the C register is used in some steps to hold the intermediate sum of
products. Latency due to pipelining and chained data manipulation is 11 cycles for
calculation of the first coordinate, and four cycles each for the other three coordinates.
After calculation of the first vertex, 16 cycles are required to calculate the four
coordinates of each subsequent vertex. Table 54 presents the sequence of calculations
for the first two coordinates, xA and yA.
Products in Table 54 are numbered according to the clock cycle in which the operands
and instruction were loaded into the RA, RB, and I registers, and execution of the
instruction began. Sums indicated in Table 54 are listed below:
s1 = p1 + 0       s4 = p5 + 0       s7 = p9 + 0
s2 = p1 + p3      s5 = p5 + p7      s8 = p9 + p11
s3 = p2 + p4      s6 = p6 + p8      s9 = p10 + p12
xA = p1 + p2 + p3 + p4
yA = p5 + p6 + p7 + p8

Table 54. Fully Pipelined Single-Precision Sum of Products (PIPES2-PIPESO = 000)

CLOCK
CYCLE
0
1
2

I
BUS

3

Mul

4
5
6
7
8
9
10
11
12
13
14
15

Chn
Chn
Chn
Chn
Chn
Chn
Chn
Chn
Chn
Chn
Chn
Chn

Mul
Mul

Chn

DA
BUS
x11
x21
x31
x41
x12
x22
x32
x42
x13
x23
x33
x43
x14
x24
x34
x44

DB
BUS
a11
a12
a13
a14
a 11
a12
a13
a14
a11
a12
a13
a14
a11
a12
a13
a14

I
REG

RA
REG

RB
REG

MUL
PIPE

Mul
Mul

x11
x21
x31
x41
x12
x22
x32
x42
x13
x23
x33
x43
x14
x24
x34

a11
a12
a13
a14
a11
a12
a13
a14
a11
a12
a13
a14
a11
a12
a13

p1
p2
p3
p4
p5
p6
p7
p8
p9
p10
p11
p12
p13
p14

Chn
Mul

Chn
Chn
Chn
Chn
Chn
Chn
Chn
Chn
Chn
Chn
Chn

ALU
PIPE

51
t

52
53
54
xA
55
56
57
VA
58
59

P
REG

p1
p2
p3
p4
p5
p6
p7
p8
p9
p10
p11
p12
p13

S
REG

C
Y
REG BUS

51

p2
p2
p2
52
p6
p6 xA
p6
55
p10
p10 VA
p10

t

52
53
54
xA
55
56
57
VA
58

†Contents of this register are not valid during this cycle.

Chebyshev Routines for the SN74ACT8847 FPU
Introduction
Using the SN74ACT8847, very efficient routines can be developed for the
implementation of transcendental functions. A high degree of accuracy can be achieved
by taking advantage of the 'ACT8847's ability to perform calculations using
double-precision floating point operands.
This application note describes how to use the 'ACT8847 to implement seven different
transcendental functions. TIM (Texas Instruments Meta-Macro Assembler) assembly
files have been written for all seven functions and these files are available upon request
from Texas Instruments. The algorithm chosen to implement these functions is the
Chebyshev expansion method [1]. Table 55 lists the functions that have been
implemented, along with the number of cycles and the time required to perform
the calculations. Also listed in the table are the cycle count and time required to perform
the same calculation using the Motorola MC68881 Floating Point Coprocessor and
the Intel 80387 Numeric Processor Extension.
The Chebyshev expansion method was chosen rather than some of the better
known methods, such as the Taylor series and Newton-Raphson approximation, for
a variety of reasons. The primary advantage of Chebyshev's method is that it provides
a uniform convergence rate in the number of terms required to achieve the desired
accuracy. Thus the range of the input value will have little effect on the accuracy of
the result. Another advantage is that the number of terms required to calculate the
approximation is relatively small. This provides for faster execution. Also, Chebyshev's
method can be applied to any function which is continuous and of bounded variation.
Lastly, tables are available which contain the constants necessary to implement
Chebyshev's method.
In order that this application note be useful to the largest audience, only those
instructions and features common to all 'ACT8847 versions have been used to
implement the routines.
Contact Texas Instruments VLSI Logic applications group at (214) 997-3970 for a
copy of the seven TIM assembly files.
Table 55. Cycle Count and Execution Speed for the Seven Chebyshev Functions

                          CYCLE COUNT†                  EXECUTION SPEED‡ IN MICROSECONDS
FUNCTION          'ACT8847   MC68881   80387            'ACT8847   MC68881   80387
Sine                  51       416     122 to 771         1.53      25.0     7.32 to 46.3
Cosine                51       416     123 to 772         1.53      25.0     7.38 to 46.3
Tangent               84       498     191 to 497         2.52      29.9     11.5 to 29.8
ArcSine               68       606     Not Avail.         2.04      36.4     Not Avail.
ArcCosine             68       650     Not Avail.         2.04      39.0     Not Avail.
ArcTangent           104       428     314 to 487         3.12      25.7     18.8 to 29.2
Exponentiation        52       522     Not Avail.         1.56      31.3     Not Avail.

†For MC68881 cycle count refer to 'MC68881 Floating Point Coprocessor User's Manual', Document No.
MC68881UM/AD, Page 6-13. For 80387 cycle count refer to '80387 Programmer's Reference Manual',
Document No. 231917-001, Page E-36.
‡'ACT8847 cycle speed is 30 ns, 33 MHz
MC68881 cycle speed is 60 ns, 16.6 MHz
80387 cycle speed is 40 ns, 25 MHz
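The execution speeds in Table 55 follow from multiplying the cycle count by the cycle time given in the footnote. The short Python sketch below reproduces that arithmetic for the 'ACT8847 and MC68881 columns; the dictionary layout and the function name `execution_time_us` are illustrative assumptions.

```python
# Cycle times in nanoseconds, from the table footnote.
CYCLE_NS = {"'ACT8847": 30, "MC68881": 60}

# Cycle counts from Table 55: function -> ('ACT8847, MC68881).
CYCLES = {
    "Sine": (51, 416),
    "Cosine": (51, 416),
    "Tangent": (84, 498),
    "ArcSine": (68, 606),
    "ArcCosine": (68, 650),
    "ArcTangent": (104, 428),
    "Exponentiation": (52, 522),
}


def execution_time_us(cycles, cycle_ns):
    """Execution time in microseconds for a given cycle count."""
    return cycles * cycle_ns / 1000.0


for function, (act, mc) in CYCLES.items():
    print(function,
          execution_time_us(act, CYCLE_NS["'ACT8847"]),
          execution_time_us(mc, CYCLE_NS["MC68881"]))
```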

Overview of Chebyshev's Expansion Method
If f(x) is continuous and of bounded variation over the interval -1 ≤ x ≤ 1, then
f(x) may be approximated by the following equation:

$$
f(x) = {\sum_{r=0}^{\infty}}' \, a_r T_r(x)
$$

where the prime on the summation indicates that the r = 0 term is halved.

Note that the range for x is between -1 and 1. For most functions, this restriction
requires that the input, x, be range reduced before the calculation begins. Range
reducing an argument means scaling the argument down to a certain range. In the
case of Chebyshev approximations, the range is usually -1 ≤ x ≤ 1, or 0 ≤ x ≤ 1.
In the equation for f(x) above, the constants represented by a_r are known as Chebyshev
coefficients. The variables represented by T_r are known as Chebyshev polynomials
and can be derived from the following relationship and values:

$$
T_{r+1}(x) - 2x\,T_r(x) + T_{r-1}(x) = 0, \qquad T_0(x) = 1, \qquad T_1(x) = x
$$
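The recurrence can be used directly to generate the polynomial coefficients. The following Python sketch (the function name `chebyshev_polynomials` is an assumption for illustration) builds T_0 through T_6 as lists of coefficients, lowest power first, by rewriting the relationship as T_{r+1}(x) = 2x T_r(x) - T_{r-1}(x).

```python
def chebyshev_polynomials(count):
    """Coefficient lists (lowest power first) of T_0 .. T_{count-1}."""
    polys = [[1], [0, 1]]                  # T_0(x) = 1, T_1(x) = x
    while len(polys) < count:
        prev, cur = polys[-2], polys[-1]
        nxt = [0] + [2 * c for c in cur]   # 2x * T_r(x)
        for i, c in enumerate(prev):
            nxt[i] -= c                    # ... - T_{r-1}(x)
        polys.append(nxt)
    return polys[:count]


T = chebyshev_polynomials(7)
for r, coeffs in enumerate(T):
    print("T%d:" % r, coeffs)
```

The entry for T_6, for instance, is `[-1, 0, 18, 0, -48, 0, 32]`, i.e. 32x^6 - 48x^4 + 18x^2 - 1.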
To illustrate Chebyshev's expansion method, the procedure to approximate function
f(x) using the first seven polynomials is now covered. Let

$$
f(x) = \tfrac{1}{2}a_0 + a_1 T_1(x) + a_2 T_2(x) + a_3 T_3(x) + a_4 T_4(x) + a_5 T_5(x) + a_6 T_6(x)
$$

Substituting in the expressions for the polynomials,

$$
\begin{aligned}
f(x) = {} & \tfrac{1}{2}a_0 + a_1 x + a_2(2x^2 - 1) + a_3(4x^3 - 3x) + a_4(8x^4 - 8x^2 + 1) \\
          & + a_5(16x^5 - 20x^3 + 5x) + a_6(32x^6 - 48x^4 + 18x^2 - 1)
\end{aligned}
$$

Rearranging the expression, by grouping powers of x,

$$
\begin{aligned}
f(x) = {} & x^0\left(\tfrac{1}{2}a_0 - a_2 + a_4 - a_6\right) + x^1(a_1 - 3a_3 + 5a_5) + x^2(2a_2 - 8a_4 + 18a_6) \\
          & + x^3(4a_3 - 20a_5) + x^4(8a_4 - 48a_6) + x^5(16a_5) + x^6(32a_6)
\end{aligned}
$$


Next make the following substitutions:

$$
\begin{aligned}
c_0 &= \tfrac{1}{2}a_0 - a_2 + a_4 - a_6 \\
c_1 &= a_1 - 3a_3 + 5a_5 \\
c_2 &= 2a_2 - 8a_4 + 18a_6 \\
c_3 &= 4a_3 - 20a_5 \\
c_4 &= 8a_4 - 48a_6 \\
c_5 &= 16a_5 \\
c_6 &= 32a_6
\end{aligned}
$$

Substituting the c's into the last equation for f(x),

$$
f(x) = c_0 x^0 + c_1 x^1 + c_2 x^2 + c_3 x^3 + c_4 x^4 + c_5 x^5 + c_6 x^6
$$

Applying Horner's Rule yields

$$
f(x) = (((((c_6 x + c_5)x + c_4)x + c_3)x + c_2)x + c_1)x + c_0
$$

In the remainder of this application note, the above equation will be referred to as Cseries.
Therefore,

$$
\mathrm{Cseries\_f}(x) = (((((c_6 x + c_5)x + c_4)x + c_3)x + c_2)x + c_1)x + c_0
$$

The last step prior to approximating f(x) is to calculate the c's by substituting the values
for the Chebyshev coefficients into the equations for c0 through c6.
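The substitution and Horner evaluation can be checked against a direct evaluation of the Chebyshev sum. The Python sketch below is a minimal illustration; the function names and the sample coefficients `a` are assumptions chosen only to exercise the formulas, not coefficients of any of the seven functions.

```python
def power_coefficients(a):
    """Convert Chebyshev coefficients a0..a6 into the constants c0..c6."""
    a0, a1, a2, a3, a4, a5, a6 = a
    return [0.5 * a0 - a2 + a4 - a6,
            a1 - 3 * a3 + 5 * a5,
            2 * a2 - 8 * a4 + 18 * a6,
            4 * a3 - 20 * a5,
            8 * a4 - 48 * a6,
            16 * a5,
            32 * a6]


def cseries(c, x):
    """Evaluate c0 + c1*x + ... using Horner's Rule."""
    acc = 0.0
    for coeff in reversed(c):
        acc = acc * x + coeff
    return acc


def chebyshev_sum(a, x):
    """Evaluate 1/2*a0 + a1*T1(x) + ... directly from the recurrence."""
    t_prev, t_cur = 1.0, x
    total = 0.5 * a[0] + a[1] * x
    for r in range(2, len(a)):
        t_prev, t_cur = t_cur, 2.0 * x * t_cur - t_prev
        total += a[r] * t_cur
    return total


# Arbitrary sample coefficients, used only to compare the two evaluations.
a = [1.0, -0.5, 0.25, 0.125, -0.0625, 0.03125, 0.015625]
c = power_coefficients(a)
for x in (-1.0, -0.3, 0.0, 0.7, 1.0):
    print(x, cseries(c, x), chebyshev_sum(a, x))
```

Horner's form needs one multiply and one add per coefficient, which maps naturally onto a chained multiply-add sequence.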

Format for the Remainder of the Application Note
Each of the seven functions will be covered in a separate section. Each section will
include the following information:
1. General steps required to perform the calculation, including a description of
any preprocessing and/or postprocessing
2. An algorithm for each of the above steps
3. What system intervention, if any, is required; this intervention may take the
form of branching based on comparison status generated by the 'ACT8847,
or storing and then later retrieving intermediate results
4. The number of 'ACT8847 cycles required to calculate f(x)
5. A listing of the c's
6. Pseudocode table showing how the calculation is accomplished. The
pseudocode tables list the contents of all the relevant 'ACT8847 registers
and buses for each instruction.
7. Microcode table listing the instructions


References
[1] C. W. Clenshaw, G. F. Miller, and M. Woodger, "Algorithms for Special
Functions I," Numerische Mathematik, Vol 4, 1963, pages 403 through 419.
[2] C. W. Clenshaw, "Chebyshev Series for Mathematical Functions," Vol 5
of the Mathematical Tables of the National Physical Laboratory, Department
of Scientific Industrial Research, England, 1960.

Cosine Routine Using Chebyshev's Method
All floating point inputs and outputs are double precision. The input is in radians.

Steps Required to Perform the Calculation
STEP 1 - Preprocessing; range-reduce the input, X, to a range of [-1,1]. Next
square this range-reduced value, multiply it by 2.0, and finally subtract
1.0. X3 is the range-reduced input value; it must be stored externally.
'TRUNC' means to truncate.
X1 ← X*(2.0/pi)
X2 ← (4(TRUNC(0.25(X1 + 2.0)))) - X1 + 1.0
If X2 > 1.0
Then X3 ← 2.0 - X2
Else X3 ← X2
X4 ← 2.0*(X3*X3) - 1.0

STEP 2 - Core Calculation; X4 in Step 1 will be referred to as 'x' in the core
calculation.
X5 ← Cseries_cos
   ← (((((((c8*x + c7)*x + c6)*x + c5)*x +
      c4)*x + c3)*x + c2)*x + c1)*x + c0

STEP 3 - Postprocessing; multiply the output of the core calculation times X3.
Cosine(X) ← X5*X3

Algorithms for the Three Steps
Step 1 perform the preprocessing:
T1 ← X*(2.0/pi)               2.0/pi entered as a constant
T2 ← T1 + 2.0                 CREG ← T1
T3 ← 0.25*T2 and              T3 and T4 result from a chained instruction
T4 ← 1.0 - CREG
T5 ← INT(T3)                  round controls set to truncate
T6 ← 4*T5                     CREG ← T4
T7 ← DOUBLE(T6)               convert from integer to double
T8 ← T7 + CREG
CMP (1.0,T8)                  CREG ← T8
If (T8 > 1.0)
Then T9 ← 2.0 - CREG          T9 is X3 in Step 1, must be stored externally
Else T9 ← CREG
T10 ← CREG*CREG               CREG ← T9
T11 ← T10*2.0
T12 ← T11 - 1.0               T12 is X4 in Step 1, the input to the core routine

Step 2 perform the core calculation:
T13 ← c8*CREG                 CREG ← T12
T14 ← T13 + c7
T15 ← T14*CREG
T16 ← T15 + c6
T17 ← T16*CREG
T18 ← T17 + c5
T19 ← T18*CREG
T20 ← T19 + c4
T21 ← T20*CREG
T22 ← T21 + c3
T23 ← T22*CREG
T24 ← T23 + c2
T25 ← T24*CREG
T26 ← T25 + c1
T27 ← T26*CREG
T28 ← T27 + c0

Step 3 perform the postprocessing:
Cosine(X) ← T28*T9
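The three steps can be prototyped in software before committing them to microcode. The Python sketch below is an illustration only: it follows the X1 through X5 formulas of the step description, decodes the nine constants from the hexadecimal listing given in this section for the cosine routine, and compares the result with the library cosine. The helper names are assumptions, the hexadecimal digits are taken as printed, and only non-negative inputs are exercised here.

```python
import math
import struct


def from_hex(text):
    """Decode a 16-digit hexadecimal string as an IEEE double."""
    return struct.unpack(">d", bytes.fromhex(text))[0]


# c0 .. c8, as printed in the listing of the Chebyshev constants.
C = [from_hex(h) for h in (
    "3FF4464BCC8CBA1F", "BFD23B03366AA0C9", "3F92A9FB40C119ED",
    "BF41E5FDEF25C403", "3EE3E0AF61F7677F", "BE7CC930FD0ADA9D",
    "3E0D53517735F927", "BD962909C5C01ED6", "3D19D46B7D4C8F32")]


def cseries_cos(x):
    """Step 2: Horner evaluation (((c8*x + c7)*x + ...)*x + c0."""
    acc = C[8]
    for coeff in reversed(C[:8]):
        acc = acc * x + coeff
    return acc


def chebyshev_cosine(x_in):
    """Steps 1 to 3 of the cosine routine (input in radians)."""
    x1 = x_in * (2.0 / math.pi)
    x2 = 4.0 * math.trunc(0.25 * (x1 + 2.0)) - x1 + 1.0
    x3 = 2.0 - x2 if x2 > 1.0 else x2      # range-reduced value in [-1, 1]
    x4 = 2.0 * (x3 * x3) - 1.0             # 'x' of the core calculation
    return cseries_cos(x4) * x3            # Step 3: X5*X3


for angle in (0.0, 1.0, 2.5, 10.0):
    print(angle, chebyshev_cosine(angle), math.cos(angle))
```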


Required System Intervention
As seen in the algorithm for Step 1, the 'ACT8847 performs a compare. The results
of this compare determine which one of two calculations is to be performed. The
system of which the 'ACT8847 is a part must decide which of the
two calculations is to be performed. In addition, the system must store X3 and then
later furnish X3 as an input to the 'ACT8847.

Number of 'ACT8847 Cycles Required to Calculate Cosine(x)
Calculation of Cosine(x) requires 46 cycles. In addition, it is assumed that five additional
cycles are required due to the compare instruction and resulting system intervention.
Therefore, the total number of cycles to perform the Cosine(x) calculation is 51.

Listing of the Chebyshev Constants (c's)
The constants are represented in IEEE double-precision floating point format.
c8 = 3D19D46B7D4C8F32
c7 = BD962909C5C01ED6
c6 = 3E0D53517735F927
c5 = BE7CC930FD0ADA9D
c4 = 3EE3E0AF61F7677F
c3 = BF41E5FDEF25C403
c2 = 3F92A9FB40C119ED
c1 = BFD23B03366AA0C9
c0 = 3FF4464BCC8CBA1F

Pseudocode Table for the Cosine(x) Calculation
Table 56. Pseudocode for Chebyshev Cosine Routine (PIPES2-0 = 010, RND1-0 =00)
ClK

DA
BUS

DB
BUS

RB
REG

0

RA2.RB2

X is the input
2DIVPI is a constant
representing 2.0/pi

X l5H

2

2DIVPI
M5H

2DIVPI
l5H

X

2DIVPI

0

RA2.RB2

3

1.0 M5H

1.0 l5H

X

2DIVPI

0

PR4+RB4

4

2.0 M5H

2.0 l5H

1.0

2.0

0

PR4+RB4

1.0

0.25

1

5R5.RB5
RA5-CR5

6

1.0

0.25

0

DP2I(PR7)

7

1.0

0.25

0

DP2I(PR7)

0.25 l5H

ALU
PIPE

P
C
REG REG

V
S
REG BUS

INSTR

X M5H

0.25 M5H

MUl
PIPE

ClK
MODE

1

5

Pl
Pl
5R5.RB5

51

RA5-CR5

Double precision - integer
P2

4

0

5R8.RB8

1.0

4

1

12DP(PR9)

10

1.0

4

1

CR10+5Rl0

54

11

1.0

4

1

COMPARE
RA11,5Rll

55

1.0

4

0

NOP

1.0

2.0

1

RB13-CR13

1.0

4

1

PA5(CR13)

12
13a

2.0 M5H

2.0 l5H

13b

14

15
16

2.0 M5H

2.0 l5H

1.0

2.0
or 4

1

CR14.CR14

1.0

2.0
or 4

0

RA16.PR16

2.0

2.0
or 4

0

RA16.PR16

-'
0)
(0

SN74ACT8847

Cycles 6,7 set RND1, 0 = 01

52

1.0

4

COMMENT

Preload RA with 1.0 for
use in cycles 5 and 11

RA2.RB2

9

8

';'I

RA
REG

52

53

P3

Integer - double-precision
If 5Rll > RAll then 13a
If 5Rll s RA 11 then 13b
Wait for system response

55

Execute 13a or 13b
Pass contents of CREG

56

CR14.CR14

56
P4

56

56 is either RB13-CR13 or
CR13 from PA55 CR13, and
must be stored externally
for use in cycle 43

56

Output 56 in cycles 14 and
15

L1788.l:l"17LNS
";-I

Table 56. Pseudocode for Chebyshev Cosine Routine (PIPES2-0 .. 010. RND1-0 .. 00) (Continued)

~

o

CLK

DA
BUS

DB
BUS

17

RB
REG

CLK
MODE

INSTR

MUL
PIPE

2.0

2.0
or 4

0

PR18+RB18

RA16.PR16

RA

REG

18

-1.0 MSH

-1.0 LSH

2.0

-1.0

0

PR18+RB18

19

c8 MSH

c8 LSH

2.0

c8

1

SR19.RB19

2.0

c8

0

PR21 +RB21

c7 MSH

c7 LSH

2.0

c7

0

PR21 +RB21

22

2.0

c7

1

SR22.CR22

23

2.0

PR24+RB24 SR22.CR22

20
21

SR19.RB19

c7
c6

0

PR24+ RB24

25

2.0

c6

1

SR25.CR25

26

2.0

c6

0

PR27+RB27 SR25.CR25

2.0

c5

0

PR27+RB27

28

2.0

c5

1

SR28.CR28

29

2.0

c5

0

PR30+RB30 SR28.CR28

2.0

c4

0

PR30+ RB30

2.0

c4

1

SR31.CR31

2.0

c4

0

PR33+RB33 SR31.CR31

2.0

c3

0

PR33+RB33
SR34.CR34
PR36+RB36 SR34.CR34

27

30

c5 MSH

c4 MSH

c6 LSH

c5 LSH

c4 LSH

31
32
33

c3 MSH

c3 LSH

34

2.0

c3

1

35

2.0

c3

0

2.0

c2

36

c2 MSH

c2 LSH

PR36+RB36

0
_._--

-

-----

C

S

S7

2.0

c6 MSH

P

V

REG REG REG BUS

COMMENT

!

P5

0

24

ALU
PIPE

S7

Start core calculation
S7 is input to core calc.

P6
S8

P7
S9

P8
S10
,

P9
S11

Pl0
S12

I
I

Pll

----~

Table 56. Pseudocode for Chebyshev Cosine Routine (PIPES2-0 .. 010. RND1-0 == 001 (Concluded I
ClK

RA
REG

RB
REG

37

2.0

c2

1

5R37.CR37

38

2.0

c2

0

PR39+RB39

2.0

c1

0

PR39+RB39

40

2.0

c1

1

5R40.CR40

41

2.0

c1

0

PR42+RB42

39

DA
BUS

c1 M5H

DB
BUS

c1 l5H

ClK
MODE

INSTR

MUl
PIPE

ALU
PIPE

P

C

S

Y

REG REG REG BUS

COMMENT

513
5R37.CR37
P12
514
5R40.CR40

42

co M5H

co l5H

2.0

co

0

PR42+RB42

43

56 M5H

56 l5H

2.0

56

1

5R43.RB43

515

44

2.0

56

0

DUMMY

5R43.RB43

45

2.0

56

0

NOP

P14

P14 Output MSH of answer

46

2.0

56

0

NOP

P14

P14 Output L5H of answer

~
-..I

SN74ACT8847

P13
Begin postprocessing
Instruction is double·
precision RA + RB, allows
time for answer to
propagate to the Y bus

-

LV88.L::l'VvLNS
....';'I

Microcode Table for the Cosine(x) Calculation

N

All numbers are in hex. Any field with a length that is not a multiple of 4 is right justified and zero filled. For the microcode
table, the value of X has been chosen to be 1/2 pi.
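The first two rows of the microcode table place the most significant half of each double-precision operand on the DA bus and the least significant half on the DB bus. As a quick check, the Python sketch below reassembles those halves and compares them with pi/2 and 2.0/pi; the helper name `double_from_halves` is an assumption for illustration.

```python
import math
import struct


def double_from_halves(msh, lsh):
    """Combine two 32-bit bus words (MSH, LSH) into an IEEE double."""
    return struct.unpack(">d", struct.pack(">II", msh, lsh))[0]


x_input = double_from_halves(0x3FF921FB, 0x54442D18)     # X = 1/2 pi
two_div_pi = double_from_halves(0x3FE45F30, 0x6DC9C883)  # 2DIVPI constant

print(x_input, math.pi / 2.0)
print(two_div_pi, 2.0 / math.pi)
```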

-...J

P
A

D
A

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

3FF921FB
3FE45F30
3FFOOOOO
40000000
3FDOOOOO
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
40000000
00000000
00000000
BFFOOOOO
3D19D46B
00000000
BD962909

D
B

54442D18
6DC9C883
00000000
00000000
00000000
00000000
00000000
00000004
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
7D4C8F32
00000000
C5C01 ED6

PEE C P C C
B N N L I L 0
A B K P K N
C EMF
SOl
D G
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

00_
1 1 _
00_
1 1 _
0 1J
00_
00_
0 1 S
00_
0 0 _
00_
0 0 S
00_
00_
0 0 S
1 0 _
00_
0 1 _
0 1 _
0 0 S
0 1 _

2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2

o
o
o
o
1

o
o
o
1
1

o
1

o
o
o
o
1

o
o

3
3
3
3
3
3
3
1
3
3
3
3
3
3
3
3
3
3
3
3
3

s R H
E
L
0
P

E F
E A N L
S L C 0
E T
W
T
C

FF
FF
FB
FB
1
BD
o
FB
1
FB
o
BF
FB
F6
1
FE
1
FF
o
F7
1
5F
1
EF
o
EF
1
FB
1
FB
BF
1
FB
1 0
FB 1 1 1

o
o
o
o
0

o
o
0

o
o
o
0

o
o
0

o
o
o
o
0

o

N
S
T
R

R F S B S T S 555
N A RYE E E E E E
D S C T L SLY S C
T C EST V
P T

1CO o 0 0 0 3 3
1CO o 0 0 0 3 3
180 o 0 0 0 3 3
180 o 0 0 0 3 3
581 0 0 1 0 3 3
1A3
00033
1A3 1 000 3 3
240 o 0 0 0 3 3
1A2 o 0 0 0 3 3
180 o 0 0 0 3 3
182 o 0 0 0 3 3
300 o 0 0 0 3 3
1AO o 0 0 033
1CO o 0 0 0 3 3
1CO o 0 0 0 3 3
1CO o 0 0 0 3 3
180 o 0 0 0 3 3
180 o 0 0 0 3 3
1CO o 0 003 3
180 o 0 0 0 3 3
180 0 0 0 0 3 3

000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000

Microcode Table for the Cosine(x) Calculation (Continued)
p

';'I
~

-.J

w

A

D
A

D
B

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

00000000
00000000
3EOD5351
00000000
00000000
BE7CC930
00000000
00000000
3EE3EOAF
00000000
00000000
BF41E5FD
00000000
00000000
3F92A9FB
00000000
00000000
BFD23B03
00000000
00000000
3FF4464B
00000000
00000000
00000000
00000000

00000000
00000000
7735F927
00000000
00000000
FDOADA9D
00000000
00000000
61F7677F
00000000
00000000
EF25C403
00000000
00000000
40C119ED
00000000
00000000
366AAOC9
00000000
00000000
CC8CBA 1F
00000000
00000000
00000000
00000000

SN74ACT8847

PEE C
B N N L
A B K
C

P C C s
I L 0 E
P K N L
EMF 0
SOl P
D G

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2

00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
00_
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
0 1 _
0 0 _
0 0 _
0 0 _

II

1 3 9F
0 3 FB
3 FB
1 3 9F
0 3 FB
0 3 FB
1 3 9F
0 3 FB
0 3 FB
1 3 9F
3 FB
0 3 FB
1 3 9F
0 3 FB
0 3 FB
1 3 9F
0 3 FB
0 3 FB
1 3 9F
0 3 FB
0 3 FB
1 3 BF
0 -3 FF
0 3 FF
0 3 FF

R H

E F
E A N L
S L C 0
E T
W
T
C
1 1
1
1
1 1

o

1

o

1 1 1
1 1
1

1

o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o

o
o

I
N
S
T
R

1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
300
300

aaa

R F S B S T S
N A RYE E E E E E
D S C T L SLY S C
T C EST Y
P T

o
o
o
o
o
o
o
o
o
o
o
o
o
o

0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
00003
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
o0 0 0 3
o0 0 0 3
o0 0 0 3
o0 0 0 3
o0 0 0 3

o
o
o
o
o

3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3

1

1

1

1
0

000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
0 0 0

Sine Routine Using Chebyshev's Method
All floating point inputs and outputs are double precision. The input is in radians.

Steps Required to Perform the Calculation
STEP 1 - Preprocessing; range reduce the input, X, to a range of [-1,1]. Next
square this range-reduced value, multiply it by 2.0, and finally subtract
1.0. X3 is the range-reduced input value; it must be stored externally.
'TRUNC' means to truncate.
X1 ← X*(2.0/pi)
X2 ← X1 - (4(TRUNC(0.25(X1 + 1.0))))
If X2 > 1.0
Then X3 ← 2.0 - X2
Else X3 ← X2
X4 ← 2.0*(X3*X3) - 1.0

STEP 2 - Core calculation; X4 in Step 1 will be referred to as 'x' in the core
calculation.
X5 ← Cseries_sin
   ← (((((((c8*x + c7)*x + c6)*x + c5)*x +
      c4)*x + c3)*x + c2)*x + c1)*x + c0

STEP 3 - Postprocessing; multiply the output of the core calculation times X3.
Sine(X) ← X5*X3

Algorithms for the Three Steps
Step 1 perform the preprocessing:
T1 ← X*(2.0/pi)               2.0/pi entered as a constant
T2 ← T1 + 1.0                 CREG ← T1
T3 ← 0.25*T2
T4 ← INT(T3)                  round controls set to truncate
T5 ← 4*T4
T6 ← DOUBLE(T5)               convert from integer to double
T7 ← CREG - T6
CMP (1.0,T7)                  compare 1.0 to T7; CREG ← T7
If (T7 > 1.0)
Then T8 ← 2.0 - CREG          T8 is X3 in Step 1, must be stored externally
Else T8 ← CREG
T9 ← CREG*CREG                CREG ← T8
T10 ← T9*2.0
T11 ← T10 - 1.0               T11 is X4 in Step 1 above, the input to
                              the core routine
                              T11 = 'x' from Step 2 above

Step 2 perform the core calculation:
T12 ← c8*CREG                 CREG ← T11
T13 ← T12 + c7
T14 ← T13*CREG
T15 ← T14 + c6
T16 ← T15*CREG
T17 ← T16 + c5
T18 ← T17*CREG
T19 ← T18 + c4
T20 ← T19*CREG
T21 ← T20 + c3
T22 ← T21*CREG
T23 ← T22 + c2
T24 ← T23*CREG
T25 ← T24 + c1
T26 ← T25*CREG
T27 ← T26 + c0

Step 3 perform the postprocessing:
Sine(X) ← T27*T8
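The preprocessing step is the part of the routine most easily gotten wrong, so it is worth checking on its own. The Python sketch below is illustrative only: it applies the X1 through X3 formulas of Step 1 and confirms that the sine of pi/2 times the reduced value equals the sine of the original angle, with the library sine standing in for the core calculation. The function name is an assumption, and only non-negative inputs are exercised.

```python
import math


def reduce_sine_argument(x_in):
    """Step 1 of the sine routine: return X3, the value reduced to [-1, 1]."""
    x1 = x_in * (2.0 / math.pi)
    x2 = x1 - 4.0 * math.trunc(0.25 * (x1 + 1.0))
    return 2.0 - x2 if x2 > 1.0 else x2


for angle in (0.0, 0.5, 2.0, 4.0, 7.5, 20.0):
    x3 = reduce_sine_argument(angle)
    # sin(pi/2 * X3) should reproduce sin(X) for the reduced argument.
    print(angle, x3, math.sin(0.5 * math.pi * x3), math.sin(angle))
```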

Required System Intervention
As seen in the algorithm for Step 1, the 'ACT8847 performs a compare. The results
of this compare determine which one of two calculations is to be performed. The
system of which the 'ACT8847 is a part must decide which of the
two calculations is to be performed. In addition, the system must store X3 and then
later furnish X3 as an input to the 'ACT8847.

Number of 'ACT8847 Cycles Required to Calculate Sine(x)
Calculation of Sine(x) requires 46 cycles. In addition, it is assumed that five additional
cycles are required due to the compare instruction and resulting system intervention.
Therefore, the total number of cycles to perform the Sine(x) calculation is 51.

Listing of the Chebyshev Constants (c's)
The constants are represented in IEEE double-precision floating point format.
c8 = 3D19D46B7D4C8F32
c7 = BD962909C5C01ED6
c6 = 3E0D53517735F927
c5 = BE7CC930FD0ADA9D
c4 = 3EE3E0AF61F7677F
c3 = BF41E5FDEF25C403
c2 = 3F92A9FB40C119ED
c1 = BFD23B03366AA0C9
c0 = 3FF4464BCC8CBA1F

Pseudocode Table for the Sine(x) Calculation


Table 57. Pseudocode for Chebyshev Sine Routine (PIPES2-0 ... 010, RND1-0 - 00)
ClK

DA
BUS

DB
BUS

1

X MSH

X lSH

2

2DIVPI
MSH

2DIVPI
lSH

3

RA
REG

RB
REG

ClK
MODE

INSTR

0

RA2.RB2

X is the input

i

2DIVPI is a constant
representing 2.0/pi

i

X

2DIVPI

0

RA2.RB2

X

2DIVPI

0

PR4+RB4

4

1.0 MSH

1.0 lSH

X

1.0

0

PR4+RB4

5

0.25 MSH

0.25 lSH

X

0.25

1

SR5.RB5

6

1.0 MSH

1.0 lSH

X

0.25

0

DP2I(PR7)

1.0

0.25

0

DP2I(PR7)

1.0

4

0

SR8.RB8

7
4

8

MUl
PIPE

ALU
PIPE

P
C
REG REG

S
Y
REG BUS

RA2.RB2
P1
P1

S1
Double precision

P2

4

1

12DP(PR9)

1.0

4

1

CR10-SR10

S3

11

1.0

4

1

COMPARE
RA11,SR11

S4

12

1.0

4

0

NOP

1.0

2.0

1

RB13-CR13

1

PAS(CR13)

2.0 lSH

1.0

4

1.0

2.0
or 4

1

CR14.CR14

1.0

2.0
or 4

0

RA16.PR16

16

2.0

2.0
or 4

0

RA16.PR16

17

2.0

2.0
or 4

0

PR18+RB18

2.0

-1.0

0

PR18+RB18

13b

14

15

2.0 MSH

18 -1.0 MSH

2.0 lSH

-1.0 lSH

integer

Cycles 6,7 set RND1,O = 01

1.0

-

-+

I

S2

9

2.0 MSH

!

SR5.RB5

10

13a

COMMENT

Integer

P3

-+

double precision
I
I
•

If SR11 -+ RA 11 then 13a
If SR11 :s RA11 then 13b

S4

Execute 13a or 1 3b
Pass contents of CREG

S5

CR14.CR14

S5
P4

RA16.PR16
P5

I

Wait for system response

S5

S5 is either RB13-CR13 or
CR13 from PASS CR13, and
must be stored externally
for use in cycle 43

S5

Output S5 in cycles 14 and
15

.

Table 57. Pseudocode for Chebyshev Sine Routine (PIPES2-0
CLK

DA
BUS

DB
BUS

RA
REG

RB
REG

c8 MSH

c8 LSH

2.0

CLK
MODE

INSTR

MUL
PIPE

c8

1

SR19.RB19

2.0

c8

0

PR21 +RB21

2.0

c7

0

PR21 +RB21

22

2.0

C7

1

SR22.CR22

23

2.0

c7

0

PR24+RB24 SR22.CR22

2.0

19
20
21

24

c7 MSH

c7 LSH

c6

0

PR24+RB24

2.0

c6

1

5R25.CR25

2.0

c6

0

PR27+RB27

2.0

c5

0

PR27+RB27

2.0

c5

1

SR28.CR28

2.0

c5

0

PR30+RB30

2.0

c4

0

PR30+RB30

31

2.0

c4

1

SR31.CR31

32

2.0

c4

0

PR33+RB33

2.0

c3

0

PR33+RB33

34

2.0

c3

1

5R34.CR34

35

2.0

c3

0

PR36+RB36

2.0

c6 MSH

c6 L5H

25
26
27

c5 M5H

c5 L5H

28
29
30

33

36

c4 MSH

c3 MSH

c2 M5H

c4 L5H

c3 L5H

c2 L5H

37
38
39
40

c1 M5H

c1 L5H

c2

0

PR36+RB36

2.0

c2

1

5R37.CR37

2.0

c2

0

PR39+RB39

2.0

c1

0

PR39+RB39

2.0

c,

1

5R40.CR40

~
-.J
-.J

SN74ACT8847

ALU
PIPE

010, RND1-0
P

C

S

00) (Continued)
y

REG REG REG BUS
S6

SR19.RB19

COMMENT
Start core calculation
S7 is input to core calc.

S6
P6
S7

P7
S8
5R25.CR25
P8
59
5R28.CR28
P9
510
5R31.CR31
P10
511
5R34·CR34
P11
S12
SR37.CR37
P12
513

L t7BB.i::n1t7LNS
Table 57. Pseudocode for Chebyshev Sine Routine (PIPES2-0

';'l
-.J

co

ClK

DA
BUS

DB
BUS

41

RA
REG

RB
REG

ClK
MODE

INSTR

MUl
PIPE

2.0

c1

0

PR42+RB42

SR40.CR40

42

Co MSH

co LSH

2.0

cO

0

PR42+RB42

43

S5 MSH

S5 LSH

2.0

S5

1

SR43.RB43

AlU
PIPE

010, RND1-0
P

C

S

00) (Concluded)
y

REG REG REG BUS

COMMENT

P13
S14

Begin postprocessing
Instruction is doubleprecision RA + RB, allows
time for answer to

44

2.0

S5

0

DUMMY

SR43.RB43

45

2.0

S5

0

NOP

P14

P14

Output MSH of answer

46

2.0

S5

0

NOP

P14

P14

Output LSH of answer

propagate to the Y bus

Microcode Table for the Sine(x) Calculation
All numbers are in hex. Any field with a length that is not a multiple of 4 is right justified and zero filled. For the microcode
table, the value of X has been chosen to be 1/2 pi.
p

A

0
A

0
B

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

3FF921FB
3FE45F30
00000000
3FFOOOOO
3FOOOOOO
3FFOOOOO
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
40000000
00000000
00000000
BFFOOOOO
3D 19046B
00000000
B0962909

54442018
60C9C883
00000000
00000000
00000000
00000000
00000000
00000004
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
704C8F32
00000000
C5C01 E06

PEE C
B N N L
A B K
C

P C C s
I L 0 E
P K N L
EMF 0
SOl P

o

~
~

-..J

co

SN74ACT8847

o

E F
E A N L
S L C 0
E T
W
T
C

I
N
S
T
R

R F S B S T S 555
N A RYE E E E E E
o S C T L SLY S C
T C EST Y
P T

G

F 0 0 _ 2 0 3
F 1 1 _ 2 0 3
F 0 0 _ 2 0 3
F 0 1 _ 2 0 3
F01.r2 1 3
F 0 0 _ 2 0 3
F 1 0 _ 2 0 3
F01_201
F 00_ 2 1 3
F 00_ 2 1 3
F 00_ 2 1 3
F 0
2 0 3
F 00_ 2 1 3
F 00_ 2 1 3
FOO.I2
3
F 1 0 _ 2 0 3
F 0 0 _ 2 0 3
F 0 1 _ 2 0 3
F 0 1 _ 2 1 3
F 0 0 I
2 0 3
F 0 1 _ 2 0 3

o.r

R H

FF
FF
FB
FB
BF
FB
FB
BF
FB
F6
FE
FF
F7
5F
EF
EF 1
FB
FB
BF
FB
FB

1
1

o

o
o
o
o
0

1
1
1
1

o
o
o
o
o
o

1
1

o
o

1

1

o
o
o
o

1

o

o
o

o

0

0

0

1CO
1CO
180
180
1CO
1A3
1A3
240
1A2
181
182
300
1AO
1CO
1CO
1CO
180
180
1CO
180
180

o
o
o
o

0 0 0 3 3
000
0 0 0 3 3
000
0 0 0 3 3
000
0 0 0 3 3
000
001 033
000
1 000 3 3
000
1 000 3 3
000
0 0 0 3 3
000
0 0 0 3 3
000
0 0 0 3 3
000
0 0 0 3 3
000
0 0 0 3 3
000
0 0 0 3 3
000
0 0 0 3 3
000
000 3 3
000
0 0 0 3 3 1 000
0 0 0 3 3
000
0 0 0 3 3
000
000 0 3 3
000
0 0 0 3 3
000
0 0 0 3 3
000

o
o
o
o
o
o
o
o
o
o
o
o
o

LV88.1:lVvLNS
;J
00

0

-

Microcode Table for the Sine(x) Calculation (Continued)

~

A

,0
A

0
B

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

00000000
00000000
3E005351
00000000
00000000
BE7CC930
00000000
00000000
3EE3EOAF
00000000
00000000
BF41E5FO
00000000
00000000
3F92A9FB
00000000
00000000
BF023B03
00000000
00000000
3FF4464B
3FFOOOOO
00000000
00000000
00000000

00000000
00000000
7735F927
00000000
00000000
FDOADA90
00000000
00000000
61F7677F
00000000
00000000
EF25C403
00000000
00000000
40C119EO
00000000
00000000
366AAOC9
00000000
00000000
CC8CBA 1 F
00000000
00000000
00000000
00000000

p

PEE C
B N N L
A B K
C

P C C S
I L 0 E
P K N L
EMF 0
SOl P

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2

o
00_
00_
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
00_
0 1 _
00_
0 0 _
0 1 _
00_
00_
0 1 _
0 1 _
0 0 _
0 0 _
0 0 _

o

o

o

E F
E A N L
S L C 0
E T
W
T
C

I
N
S
T
R

R F S B S T S 555
N A RYE E E E E E
o S C T L SLY S C
T C EST Y
P T

G

1 3
3
0 3
1 3
0 3
3
1 3
0 3
0 3
1 3
0 3
0 3
1 3
3
0 3
1 3
0 3
0 3
1 3
3
0 3
1 3
0 3
0 3
0 3

o

R H

9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
BF
FF
FF
FF

1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1

1 1
1
1
1
1
1
1
1
1 1
1 1
1 1
1 1
1 1
1
1
1
1
1
1
1
1
1
1
1 1

o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o

1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
300
300

o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o

0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0

0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
003
0 0 3
0 0 3
0 0 3
0 0 3
003
0 0 3
0 0 3
0 0 3
0 0 3

3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3 1 000
3 1 000
3 1 000
3 1 000
3 1 000
3 1 000
3 1 000
3 1 000
3 1 000
3 1 000
3 1 000
300 0 0

Tangent Routine Using Chebyshev's Method
All floating point inputs and outputs are double precision. The input is in radians.

Steps Required to Perform the Calculation
STEP 1 - Preprocessing; range reduce the input, X, to a range of [-1,1]. Next
square this range-reduced value, multiply it by 2.0, and finally subtract
1.0. X3 is the range-reduced input value; it must be stored externally.
'TRUNC' means to truncate. If X2 > 1.0, then in the postprocessing
part of the routine, the answer is the reciprocal of X5*X3.
X1 ← X*(4.0/pi)
X2 ← X1 - (4(TRUNC(0.25(X1 + 1.0))))
If X2 > 1.0
Then X3 ← 2.0 - X2
Else X3 ← X2
X4 ← 2.0*(X3*X3) - 1.0
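
The Step 1 preprocessing can be sketched in Python as a quick check of the range reduction. This is a minimal sketch and not part of the 'ACT8847 microcode: the function name reduce_tangent_argument is hypothetical, TRUNC is modeled with math.trunc, and a non-negative input in radians is assumed.

```python
import math

def reduce_tangent_argument(x):
    """Model of Step 1 for a non-negative input x in radians (assumption)."""
    x1 = x * (4.0 / math.pi)
    # TRUNC drops the fractional part of 0.25*(X1 + 1.0).
    x2 = x1 - 4.0 * math.trunc(0.25 * (x1 + 1.0))
    # The system must remember this decision for the postprocessing step.
    take_reciprocal = x2 > 1.0
    x3 = 2.0 - x2 if take_reciprocal else x2
    x4 = 2.0 * (x3 * x3) - 1.0
    return x3, x4, take_reciprocal
```

For X = pi/3, X1 is 4/3, so X2 exceeds 1.0 and the routine works with X3 = 2/3 and flags the result for a final reciprocal.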

STEP 2 - Core Calculation; X4 in Step 1 will be referred to as 'x' in the core
calculation.

X5 ← ((((((((((((((c14)*x + c13)*x + c12)*x + c11)*x + c10)*x +
c9)*x + c8)*x + c7)*x + c6)*x + c5)*x + c4)*x + c3)*x +
c2)*x + c1)*x + c0
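
The nested form of the core calculation is Horner's rule: each coefficient costs one multiply by x followed by one add. A minimal Python sketch of that evaluation order is given below; the function name horner and the list-based interface are assumptions made for illustration only.

```python
def horner(coeffs_high_to_low, x):
    """Evaluate a polynomial whose coefficients are ordered c_n, ..., c1, c0."""
    acc = coeffs_high_to_low[0]
    for c in coeffs_high_to_low[1:]:
        acc = acc * x  # multiplier step, for example c14*x
        acc = acc + c  # ALU step, for example + c13
    return acc
```

The multiply and add steps are kept separate in the sketch to mirror the alternating multiplier and ALU operations of the routine.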

STEP 3 - Postprocessing; multiply the output of the core calculation times
X3. If X2 > 1.0, then the reciprocal of X5*X3 is the answer; if
X2 ≤ 1.0, then X5*X3 is the answer.
Tangent(X) ← X5*X3 (or reciprocal of X5*X3)

Algorithms for the Three Steps
Step 1 perform the preprocessing:
T1  ← X*(4.0/pi)              4.0/pi entered as a constant
T2  ← T1 + 1.0                CREG ← T1
T3  ← 0.25*T2
T4  ← INT(T3)                 round controls set to truncate
T5  ← 4*T4
T6  ← DOUBLE(T5)              convert from integer to double
T7  ← CREG - T6
CMP (1.0,T7)                  CREG ← T7
If (T7 > 1.0)
Then T8 ← 2.0 - CREG          T8 is X3 in Step 1, must
Else T8 ← CREG                be stored externally
T9  ← CREG*CREG               CREG ← T8
T10 ← T9*2.0
T11 ← T10 - 1.0               T11 is X4 in Step 1, the
                              input to the core routine

Step 2 perform the core calculation:
T12 ← c14*CREG                CREG ← T11
T13 ← T12 + c13
T14 ← T13*CREG
T15 ← T14 + c12
T16 ← T15*CREG
T17 ← T16 + c11
T18 ← T17*CREG
T19 ← T18 + c10
T20 ← T19*CREG
T21 ← T20 + c9
T22 ← T21*CREG
T23 ← T22 + c8
T24 ← T23*CREG
T25 ← T24 + c7
T26 ← T25*CREG
T27 ← T26 + c6
T28 ← T27*CREG
T29 ← T28 + c5
T30 ← T29*CREG
T31 ← T30 + c4
T32 ← T31*CREG
T33 ← T32 + c3
T34 ← T33*CREG
T35 ← T34 + c2
T36 ← T35*CREG
T37 ← T36 + c1
T38 ← T37*CREG
T39 ← T38 + c0
Step 3 perform the postprocessing:
T40 ← T39*T8
If X2 (in Step 1) > 1.0
Then Tangent(X) ← 1.0/T40
Else Tangent(X) ← T40

Required System Intervention
As seen in the algorithm for Step 1, the 'ACT8847 performs a compare. The results
of this compare determine which one of two calculations is to be performed. The
system of which the 'ACT8847 is a part must make the decision as to which of the
two calculations is to be performed. In addition, the system must store X3 and then
later furnish X3 as an input to the 'ACT8847. Finally, the system will have to determine
if it is necessary to take the reciprocal of the final product (T40 in the Algorithm for
Step 3) to yield the answer. If it is necessary to take the reciprocal, then the system
will be required to direct the variable T40 from the 'ACT8847's output bus to the input
buses. This is because operands for division instructions must be provided by the RA
and RB registers; feedback is not an option.

Number of 'ACT8847 Cycles Required to Calculate Tangent(x)
Calculation of Tangent(x) requires 79 cycles. In addition, it is assumed that five
additional cycles are required for system intervention due to the compare instruction.
Therefore, the total number of cycles required to perform the Tangent(x) calculation
is 84.

Listing of the Chebyshev Constants (c's)
The constants are represented in IEEE double-precision floating point format.
c14 = 3D747D842210CC35
c13 = 3DA1D66636043991
c12 = 3DCCD078F52B3A73
c11 = 3DF938F9CDDFF864
c10 = 3E2620430E99B5B7
c9  = 3E535C2C953CE515
c8  = 3E80F07AFC099D7F
c7  = 3EADA4D789EB45C4
c6  = 3ED9F03D4C51A771
c5  = 3F06B236DE4D014C
c4  = 3F33DBFB01B3F415
c3  = 3F6160DE701F3A53
c2  = 3F8E70A18736FC10
c1  = 3FBAEA2653199611
c0  = 3FEC14B2675B10BA
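
The constants listed above can be decoded on a host and used to model the core calculation and postprocessing. The following Python sketch is offered only as a cross-check under stated assumptions: the names TAN_HEX, hex_to_double and tangent_from_reduced are hypothetical, the inputs X3 and the reciprocal decision are assumed to come from Step 1, and host double-precision arithmetic stands in for the 'ACT8847 pipeline.

```python
import math
import struct

# Constants c14 .. c0 as listed above (IEEE double precision, hex).
TAN_HEX = [
    "3D747D842210CC35", "3DA1D66636043991", "3DCCD078F52B3A73",
    "3DF938F9CDDFF864", "3E2620430E99B5B7", "3E535C2C953CE515",
    "3E80F07AFC099D7F", "3EADA4D789EB45C4", "3ED9F03D4C51A771",
    "3F06B236DE4D014C", "3F33DBFB01B3F415", "3F6160DE701F3A53",
    "3F8E70A18736FC10", "3FBAEA2653199611", "3FEC14B2675B10BA",
]

def hex_to_double(hex_word):
    """Interpret 16 hex digits as a big-endian IEEE double."""
    return struct.unpack(">d", bytes.fromhex(hex_word))[0]

TAN_COEFFS = [hex_to_double(h) for h in TAN_HEX]

def tangent_from_reduced(x3, take_reciprocal):
    """Steps 2 and 3, given X3 and the X2 > 1.0 decision from Step 1."""
    x4 = 2.0 * (x3 * x3) - 1.0      # input 'x' of the core calculation
    x5 = 0.0
    for c in TAN_COEFFS:            # Horner evaluation, c14 first
        x5 = x5 * x4 + c
    t40 = x5 * x3
    return 1.0 / t40 if take_reciprocal else t40
```

For X = pi/3, Step 1 gives X3 = 2/3 with X2 > 1.0, so tangent_from_reduced(2.0 / 3.0, True) can be compared against math.tan(math.pi / 3).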


Pseudocode Table for the Tangent(x) Calculation

00

"'"

Table 58. Pseudocode for Chebyshev Tangent Routine (PIPES2-0 = 010, RND1-0 = 00)
CLK

DA
BUS

DB
BUS

RA
REG

RB
REG

X

4DIV
PI

INSTR

0

RA2.RB2

X is the input

0

RA2.RB2

4DIVPI is a constant
representing 4.0/pi

X MSH

X LSH

2

4DIVPI
MSH

4DIVPI
LSH

X

4DIVPI

0

PR4+RB4

4

1.0 MSH

1.0 LSH

X

1.0

0

PR4+RB4

5

0.25 MSH

0.25 LSH

X

0.25

1

SR5*RB5

6

1.0 MSH

1.0 LSH

X

0.25

0

DP2I(PR7)

1.0

0.25

0

DP21(PR7)

1.0

4

0

SR8.RB8

7
4

8

ALU
PIPE

P

C

S

RA2.RB2
P1
P1

S1
Double precision

SR5·RB5
P2

1.0

4

1

12DP(PR9)

4

1

CR10-SR10

S3

11

1.0

4

1

COMPARE
RA11,SR11

S4

1.0

4

0

NOP

1.0

2.0

1

RB13-CR13

1.0

4

1

PAS(CR13)

1.0

2.0
or 4

1

CR14.CR14

1.0

2.0
or 4

0

RA16·PR16

16

2.0

2.0
or 4

0

RA16.PR16

17

2.0

2.0
or 4

0

PR18 + RB18

2.0

-1.0

0

PR18 + RB18

2.0 LSH

13b

integer

= 01

S2

1.0

2.0 MSH

~

Cycles 6,7 set RND1,0

9

13a

COMMENT

REG REG REG BUS

10

12

y

CLK
MODE

1

3

MUL
PIPE

010, RND1-0 = 0)

P3

Integer

~

double precision

If SR 11 > RA 11 then 13a
If SR 11 ,,; RA 11 then 13b

S4

Wait for system response
Execute 13a or 13b
Pass contents of Creg
S5 is either RB13-CR13 or

14

15

18

2.0 MSH

-1.0 MSH

2.0 LSH

-1.0 LSH

S5

CR14·CR14

S5
P4

RA16·PR16
P5

S5

S5

CR13 from PASS CR13, and
must be stored externally
for use in cycle 61
Output S5 in cycles 14 and
15

Table 58. Pseudocode for Chebyshev Tangent Routine (PIPES2-0 = 010, RND1-0 = 00) (Continued)
CLK
19

DA
BUS

DB
BUS

RA
REG

RB
REG

c14 MSH

c14 LSH

2.0

c14

1

SR19.RB19

2.0

c14

0

PR21 +RB21

2.0

20
21

MUL
PIPE

0

PR21 + RB21

c13

1

SR22.CR22

23

2.0

c13

0

PR24+RB24 SR22.CR22

2.0

c12

0

PR24+RB24

c12

1

SR25.CR25

c12 MSH

c13 LSH

c12 LSH

25

2.0

26

2.0

c12

0

PR27+RB27

2.0

c"

0

PR27+RB27

2S

2.0

cll

1

SR2S.CR2S

29

2.0

cll

0

PR30+RB30

2.0

cl0

0

PR30+RB30

31

2.0

clO

1

SR31.CR31

32

2.0

clO

0

PR33+RB33

2.0

c9

0

PR33+RB33

30

33

cll MSH

clO MSH

c9 MSH

cl1 LSH

clO LSH

c9 LSH

34

2.0

c9

1

SR34.CR34

35

2.0

c9

0

PR36+RB36

2.0

36

Cs

0

PR36+RB36

37

2.0

cs

1

SR37.CR37

3S

2.0

Cs

0

PR39+RB39

2.0

c7

0

PR39+RB39

40

2.0

c7

1

SR40.CR40

41

2.0

c7

0

PR42+RB42

2.0

c6

0

PR42+RB42

39

42

cs MSH

c7 MSH

c6 MSH

Cs LSH

c7 LSH

c6 LSH

U1

SN74ACT8847

PIPE

P

C

S

y

REG REG REG BUS
S6

c13

c13 MSH

ALU

SR19.RB19

2.0

27

";'l

INSTR

22

24

IX)

CLK
MODE

S6

0) (Continued)
COMMENT
Start core calculation
S7 is input to core calc.

P6
S7

P7
SB
SR25.CR25
PS
S9
SR2S.CR2S
P9
S10
SR31.CR31
Pl0
Sll
SR34.CR34
Pll

I

S12
SR37·CR37
P12
S13
SR40.CR40
P13

,

Lv88.l:>"vLNS
Table 58. Pseudocode for Chebyshev Tangent Routine (PIPES2-0

;J

.....

00
Ol

RA
REG

RB
REG

43

2.0

c6

1

5R43.CR43

44

2.0

PR45+ RB45

CLK

DA
BUS

DB
BUS

CLK

MODE

INSTR

c6

0

2.0

c5

0

PR45+ RB45

46

2.0

c5

1

5R46.CR46

47

2.0

c5

0

PR48+RB48

2.0

c4

0

PR48+ RB48

49

2.0

c4

1

5R49.CR49

50

2.0

c4

0

PR51 +RB51

2.0

c3

0

PR51 +RB51

1

5R52.CR52

45

48

51

c5 L5H

c5 M5H

c4 L5H

c4 M5H

c3 M5H

c3 L5H

52
53
54

c2 M5H

c2 L5H

2.0

c3

2.0

c3

0

PR54+RB54

2.0

c2

0

PR54+RB54

c2

1

5R55.CR55

2.0

55

2.0

c2

0

PR57+RB57

2.0

c1

0

PR57+RB57

58

2.0

c1

1

5R58.CR58

59

2.0

PR60+RB60

56
57

c1 M5H

c1 L5H

c1

0

60

co M5H

Co L5H

2.0

co

0

PR60+RB60

61

55 M5H

55 L5H

2.0

55

1

5R61>RB61

2.0

62
-

55

0

DUMMY

MUL
PIPE

ALU
PIPE

010. RND1-0
P

C

S

V

REG REG REG BUS

0) (Concluded)
COMMENT

514
5R43.CR43
P14
515
5R46.CR46
P15
516
5R49.CR49
P16
517
5R52.CR52
P17
51B
5R55.CR55
P18
519
5R5B.CR5B
P19
520
5R61.RB61

Begin postprocessing
Instruction is RA + RB, used
to allow time for result
to propagate to Y bus

Table 58. Pseudocode for Chebyshev Tangent Routine (PIPES2-0 .. 010. RND1-0
ClK

DA
BUS

DB
BUS

63

RA
REG

2.0

RB
REG

S5

ClK
MODE

0

INSTR

MUl
PIPE

AlU
PIPE

NOP

P

C

S

Y

REG REG REG BUS

P20

0) (Continued)
COMMENT

Output MSH. if cycle 13b
was executed then P20 is
the answer; if cycle 13a
P20
was executed then the
answer is 1.0/P20. which
is calculated next

64

1.0 M5H

1.0 l5H

2.0

55

0

DIV

65

P20 M5H

P20 L5H

1.0

P20

0

DIV

Operands for Division must
come from RA and RB.
feedback is not an option

P20 Output l5H

66

1.0

P20

0

NOP

Wait for Division result

67

1.0

P20

0

NOP

Wait for Division result

68

1.0

P20

0

NOP

Wait for Division result

69

1.0

P20

0

NOP

Wait for Division result

70

1.0

P20

0

NOP

Wait for Division result

71

1.0

P20

0

NOP

Wait for Division result

72

1.0

P20

0

NOP

Wait for Division result

73

1.0

P20

0

NOP

Wait for Division result

74

1.0

P20

0

NOP

Wait for Division result

75

1.0

P20

0

NOP

Wait for Division result

76

1.0

P20

0

NOP

Wait for Division result

77

1.0

P20

0

NOP

78

1.0

P20

0

NOP

79

1.0

P20

0

NOP

...~

CO
-.J

SN74ACT8847

Wait for Division result

-----------

P21

P21

Output M5H of answer

P21

P21

Output L5H of answer


Microcode Table for the Tangent(x) Calculation
All numbers are in hex. Any field with a length that is not a multiple of 4 is right justified and zero filled. For the microcode
table, the value of X has been chosen to be 1/3 pi.
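
Each double-precision operand in the table appears as two 32-bit hex words. As a hedged illustration, the Python sketch below joins a most significant half (MSH) and a least significant half (LSH) into a host double. The function name words_to_double is hypothetical, and the pairing of the first two DA and DB entries as X and the 4.0/pi constant is an assumption based on the pseudocode table.

```python
import math
import struct

def words_to_double(msh, lsh):
    """Combine two 32-bit words (MSH, LSH) into an IEEE double."""
    return struct.unpack(">d", struct.pack(">II", msh, lsh))[0]

# First two operand rows of the table (assumed to be X and 4.0/pi).
x_input = words_to_double(0x3FF0C152, 0x382D7365)
four_div_pi = words_to_double(0x3FF45F30, 0x6DC9C883)
```

Decoding the first row this way gives a value close to pi/3, which matches the stated choice of X for the table.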
P
A

D
A

D
B

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

3FFOC152
3FF45F30
00000000
3FFOOOOO
3FDOOOOO
3FFOOOOO
00000000
00000000
00000000
00000000
00000000
00000000
40000000
00000000
40000000
00000000
00000000
BFFOOOOO
3D747D84
00000000
3DA1D666

382D7365
6DC9C883
00000000
00000000
00000000
00000000
00000000
00000004
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
2210CC35
00000000
36043991

.p EE C
B N N L
A B K
C

P C C s
I L 0 E
P K N L
£ M F 0
SOl P
D G

F 0 0 _ 2 0 3
F 1 1 _ 2 0 3
F 0 0 _ 2 0 3
F 0 1
2 0 3
F 0 1 S
2 1 3
F 0 0 _ 2 0 3
F 1 0 _ 2 0 3
F01_201
F 00_ 2 1 3
F 00_ 2 1 3
F 00_ 2 1 3
F 0 0 J
2 0 3
F 0 1
2 1 3
F 00_ 2 1 3
F 0 0 I
2 0 3
F 1 0 _ 2 o 3
F 0 0 _ 2 0 3
F 0 1 _ 2 0 3
2 1 3
F 0 1
F 0 O..f" 2 0 3
F 0 1 _ 2 0 3

FF
FF
FB
FB
BF
FB
FB
BF
FB
F6
FE
FF
F7
5F
EF
EF
FB
FB
BF
FB
FB

R H

E F
E A N L
S L C 0
E T
W
T
C

1
1
1
1

o

o
o
o
o
0

1
1
1
1
1
1

o
o
o
o
o
o

1
1

o
o

o

0

1
o 0
1
1 o
·1
1 o
1 1 1 o
1
1 o
o 0
1
1
1 o

I
N
S
T
R

RF
N A
D S
T

S B S T S 0 0 0
RYE E E E E E
C T L SLY S C
C EST Y
P T

1CO o 0 0 0 3 3
1CO o 0 0 0 3 3
180 o 0 0 0 3 3
180 o 0 0 0 3 3
1CO 0 0 1 0 3 3
1A3 1 000 3 3
1A3 1 000 3 3
240 o 0 0 0 3 3
1A2 o 0 0 0 3 3
181 .0 0 0 0 3 3
182 o 0 0 0 3 3
300 o 0 0 0 3 3
183 o 0 0 0 3 3
1CO o 0 0 0 3 3
1CO o 0 0 0 3 3
1CO o 0 0 0 3 3
180 o 0 0 0 3 3
180 o 0 0 03 3
1CO o 0 0 0 3 3
180 o 0 0 0 3 3
180 o 0 0 0 3 3 1

000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000

Microcode Table for the Tangent(x) Calculation (Continued)
p
A

~
.....

co

(l)

D
A

F 00000000
F 00000000
F 3DCCD078
F 00000000
F 00000000
F 3DF938F9
F 00000000
F 00000000
F 3E262043
F 00000000
F 00000000
F 3E535C2C
F 00000000
F 00000000
F 3E80F07A
F 00000000
F 00000000
F3EADA4D7
F 00000000
F 00000000
F 3ED9F03D
F 00000000
F 00000000
F 3F06B236

D
B

00000000
00000000
F52B3A73
00000000
00000000
CDDFF864
00000000
00000000
OE99B5B7
00000000
00000000
953CE515
00000000
00000000
FC099D7F
00000000
00000000
89EB45C4
00000000
00000000
4C51 A 771
00000000
00000000
DE4D014C

PEE C
B N N L
A B K
C

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

SN74ACT8847

00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _

s RH

P C C
I L 0 E
P K N L
EMF 0
SOl P
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2

o

G

1
0
0
1
0
0
1
0
0
1
0
0
1
0
0
1
0
0
1
0
0
1
0

3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3

o

E F
E A N L
S L C 0
E T
W
T
C

9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB 1
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB

1
1
1
1 1

o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o

I
N
S
T
R

1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180

R F S B S T S 000
N A RYE E E E E E
o S C T L SLY S C
·T C EST Y
P T

o
o
o
o
o
o
o
o
o
o
o
o
o

0 003 3
0 0 0 3 3
0 0 0 3 3
0 003 3
0 0 0 3 3
0 0 0 3 3
0 0 0 3 3 1
0 0 0 3 3
0 0 0 3 3
0 0 0 3 3
0 0 0 3 3
0 003 3
0 0 0 3 3
00003 3
0 0 0 3 3
0 0 0 3 3
0 0 0 3 3
00003 3
0 003 3
0000331
0 003 3
0 0 0 3 3
0 0 0 3 3
0 0 0 3 3

o
o
o
o

o
o
o
o

000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000

Microcode Table for the Tangent(x) Calculation (Continued)

<0

P
A

0
A

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

00000000
00000000
3F330BFB
00000000
00000000
3F61600E
00000000
00000000
3F8E70A1
00000000
00000000
3FBAEA26
00000000
00000000
3FEC14B2
3FE55555
00000000
00000000
3FFOOOOO
3FE279A7
00000000
00000000
00000000
00000000

0
B

00000000
00000000
01 B3F415
00000000
00000000
701F3A53
00000000
00000000
8736FC10
00000000
00000000
53199611
00000000
00000000
675B10BA
55555555
00000000
00000000
00000000
45903310
00000000
00000000
00000000
00000000

PEE C
B N N L
A B K
C

P C C s
I L 0 E
P K N L
EMF 0
SOl P
o G

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2

00_
0 0 _
0 1
00_
0 0 _
0 1
00_
0 0
0 1
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
0 1
0 0 _
0 0
0 0 _
1 1 _
0 0 _
0 0 _
0 0 _
0 0 _

1
0
0
1
0
0
1
0
0
1
0
0
1
0
0
1
0
0
0
0
0
0
0
0

3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3

9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
BF
FF
FF
FF
FF
FF
FF
FF
FF

R H

E F
E A N L
S L C 0
E T
W
T
C

1
1
1
1
1
1
1
1
1
1
1
1
1
1

1

1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1

1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1

o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o

I
N
S
T
R

1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
300
1EO
1EO
0300
300
300
300

o
o
o

R F S B S T S 000
N A RYE EE E E E
o S C T L SLY S C
T C EST Y
P T
000 0 3 3
000
0 0 0 3 3
000
0 0 0 3 3 1 000
000 0 3 3 1 000
0 0 0 331 000
0 0 0 0 3 3 1 000
0 0 0 3 3 1 000
0 0 0 3 3 1 000
0 0 0 3 3 1 000
0 0 0 3 3 1 000
0 0 0 3 3 1 000
0 0 0 3 3 1 000
0 0 0 3 3 1 000
0 0 0 3 3 1 000
000 3 3 1 000
0 0 0 3 3 1 000
0 0 0 3 3 1 000
0 0 0 3 3 1 000
0 0 0 3 300 0 0
0 0 0 3 3 1 000
0 0 0 3 3 1 000
0 0 0 3 3 1 000
0 0 0 3 3 1 000
0 0 0 3 3 1 000

o
o
o

o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o

Microcode Table for the Tangent(x) Calculation (Concluded)
p
A

D
A

F 00000000
F 00000000
F 00000000
F 00000000
F 00000000
F 00000000
FOOOOOOOO
F 00000000
F 00000000
F 00000000

D
B

00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000

PEE C
B N N l
A B K
C

pee
I l 0
P K N
EMF
SOl
D G

s
E
l
0
P

R H

F
F
F
F
F
F
F
F
F
F

2
2
2
2
2
2
2
2
2
2

FF
FF
FF
FF
FF
FF
FF
FF
FF
FF

1
1
1
1
1

~
~

CD

SN74ACT8847

0
0
0
0
0
0
0
0
0
0

0
0
0
0
0
0
0
0
0
0

_
_
_
_
_
_
_
_
_
_

0
0
0
0
0
0
0
0
0
0

3
3
3
3
3
3
3
3
3
3

E F
E A N l
S leo
E T
W
T
C

1
1

o
o
o
o
o
o
o
o
o
o

I
N
S
T
R

300
300
300
300
300
300
300
300
300
300

R F S B S T S 000
N A RYE E E E E E
D seT l SLY S C
TeE STY
P T

o
o
o
o
o
o
o
o
o
o

0
0
0
0
0
0
0
0
0
0

003
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3

3
000
3
000
3
000
3
000
3 1 000
3
000
3
000
3
000
3 1 000
3 0 0 0 0

ArcSine & ArcCosine Routine Using Chebyshev's Method
All floating point inputs and outputs are double precision. The output is in radians.

Steps Required to Perform the Calculation
STEP 1 - Preprocessing; range reduction is not needed, because an input, X,
outside the range of [-1,1] indicates an error. This routine requires
that the square of X be less than or equal to 1/2. The first operation to be
performed is to square X, then multiply it by 4.0, and finally subtract
1.0.
X1 ← 4.0*(X*X) - 1.0

STEP 2 - Core Calculation; X1 in Step 1 will be referred to as 'x' in the core
calculation.
X2 ← Cseries_asin&acos
← (((((((((((((((((c18*x + c17)*x + c16)*x +
c15)*x + c14)*x + c13)*x + c12)*x + c11)*x + c10)*x +
c9)*x + c8)*x + c7)*x + c6)*x + c5)*x + c4)*x + c3)*x +
c2)*x + c1)*x + c0

STEP 3 - Postprocessing; multiply the output of the core calculation times
SQRT(2.0), then multiply this product by X, the original input. This
yields ArcSine(X). To calculate ArcCosine(X), the following identity
is used:
ArcCosine(X) = pi/2 - ArcSine(X)
X3 ← X2*SQRT(2.0)
ArcSine(X) ← X3*X
ArcCosine(X) ← pi/2 - ArcSine(X)

Algorithms for the Three Steps
Step 1 perform the preprocessing:
T1 ← X*X
T2 ← 4.0*T1
T3 ← T2 - 1                   T3 is X1 in Step 1, the input to the core
                              routine

Step 2 perform the core calculation:
T4  ← c18*CREG                CREG ← T3
T5  ← T4 + c17
T6  ← T5*CREG
T7  ← T6 + c16
T8  ← T7*CREG
T9  ← T8 + c15
T10 ← T9*CREG
T11 ← T10 + c14
T12 ← T11*CREG
T13 ← T12 + c13
T14 ← T13*CREG
T15 ← T14 + c12
T16 ← T15*CREG
T17 ← T16 + c11
T18 ← T17*CREG
T19 ← T18 + c10
T20 ← T19*CREG
T21 ← T20 + c9
T22 ← T21*CREG
T23 ← T22 + c8
T24 ← T23*CREG
T25 ← T24 + c7
T26 ← T25*CREG
T27 ← T26 + c6
T28 ← T27*CREG
T29 ← T28 + c5
T30 ← T29*CREG
T31 ← T30 + c4
T32 ← T31*CREG
T33 ← T32 + c3
T34 ← T33*CREG
T35 ← T34 + c2
T36 ← T35*CREG
T37 ← T36 + c1
T38 ← T37*CREG
T39 ← T38 + c0

Step 3 perform the postprocessing:
T40 ← X*T39
ArcSine(X) ← T40*SQRT(2.0)    SQRT(2.0) entered as a constant
ArcCosine(X) ← pi/2 - ArcSine(X)

Required System Intervention
There is no system intervention required to calculate ArcSine(X) and ArcCosine(X).

Number of 'ACT8847 Cycles Required to Calculate ArcSine(x) and
ArcCosine(x)
The total number of cycles required to perform the ArcSine(x) and ArcCosine(x)
calculation is 68.

Listing of the Chebyshev Constants (c's)
The constants are represented in IEEE double-precision floating point format.
c18 = 3DA4A49F8CCD9E73
c17 = 3DC05DFE52AAD200
c16 = 3DCCF31E26F94C8D
c15 = 3DE86CDA3C8CAEB0
c14 = 3E0768D9F4E950EA
c13 = 3E2383A37598FC80
c12 = 3E403E4B2F65F0DE
c11 = 3E5BAFC8245ABDF8
c10 = 3E77E3333AFF1AB4
c9  = 3E94E3A4D4220C9C
c8  = 3EB296DD4C084ACB
c7  = 3ED0E913F5F9D496
c6  = 3EEFA74E896F8FA8
c5  = 3F0EC76B7832DBB6
c4  = 3F2F978698C8B2E4
c3  = 3F519B1087542073
c2  = 3F7696895FFC05A0
c1  = 3FA375CA61D2988C
c0  = 3FE7B20423D1D930
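
As a host-side cross-check, the three steps of the ArcSine and ArcCosine routine can be modeled with the constants listed above. The Python sketch below is an illustration under stated assumptions, not the microcoded routine: the names ASIN_HEX, ASIN_COEFFS and chebyshev_asin_acos are hypothetical, math.sqrt(2.0) stands in for the SQRT(2.0) constant, and inputs whose square exceeds 1/2 are rejected with an exception rather than handled.

```python
import math
import struct

# Constants c18 .. c0 as listed above (IEEE double precision, hex).
ASIN_HEX = [
    "3DA4A49F8CCD9E73", "3DC05DFE52AAD200", "3DCCF31E26F94C8D",
    "3DE86CDA3C8CAEB0", "3E0768D9F4E950EA", "3E2383A37598FC80",
    "3E403E4B2F65F0DE", "3E5BAFC8245ABDF8", "3E77E3333AFF1AB4",
    "3E94E3A4D4220C9C", "3EB296DD4C084ACB", "3ED0E913F5F9D496",
    "3EEFA74E896F8FA8", "3F0EC76B7832DBB6", "3F2F978698C8B2E4",
    "3F519B1087542073", "3F7696895FFC05A0", "3FA375CA61D2988C",
    "3FE7B20423D1D930",
]
ASIN_COEFFS = [struct.unpack(">d", bytes.fromhex(h))[0] for h in ASIN_HEX]

def chebyshev_asin_acos(x):
    """Return (ArcSine(x), ArcCosine(x)) in radians for x*x <= 1/2."""
    if x * x > 0.5:
        raise ValueError("this routine requires X*X <= 1/2")
    t3 = 4.0 * (x * x) - 1.0        # Step 1: input to the core routine
    t39 = 0.0
    for c in ASIN_COEFFS:           # Step 2: Horner evaluation, c18 first
        t39 = t39 * t3 + c
    arcsine = (x * t39) * math.sqrt(2.0)   # Step 3
    return arcsine, math.pi / 2.0 - arcsine
```

A call such as chebyshev_asin_acos(0.5) can be compared against math.asin(0.5) and math.acos(0.5) to confirm that the constants were transcribed correctly.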

Pseudocode Table for the ArcSine(x) and ArcCosine(x) Calculation
Table 59. Pseudocode for Chebyshev ArcSine and ArcCosine Routine (PIPES2-0 = 010, RND1-0 = 00)
ClK

DA
BUS

DB
BUS

1

X MSH

X LSH

2

X MSH

X LSH

3

4.0 MSH

4.0 LSH

X

X

ClK
MODE

INSTR

0

RA2.RB2

0

RA2.RB2

X

X

0

RA4.PR4

4.0

X

0

RA4.PR4

5

4.0

X

0

PR6+RB6

6

-1.0 MSH

-1.0 LSH

4.0

-1.0

0

PR6+RB6

7

c18 MSH

c18 LSH

4.0

c18

1

SR7.RB7

4.0

c18

0

PR9+RB9

9

4.0

c17

0

PR9+RB9

10

4.0

c17

1

SR10·CR10

11

4.0

c17

0

PR12+RB12

4.0

c16

0

PR12+RB12

13

4.0

c16

1

SR13·CR13

14

4.0

c16

0

PR15+RB15

4.0

c15

0

PR15+RB15

16

4.0

c15

1

SR16.CR16

17

4.0

c15

0

PR18+RB18

4.0

c14

0

PR18+RB18

c14

1

SR19.CR19

12

15

18

c17 MSH

c16 MSH

c15 MSH

c14 MSH

c17 LSH

c16 LSH

c15 LSH

c14 LSH

MUl
PIPE

y

COMMENT

P2
Sl
SR7.RB7

S2
SR10.CR10
P4
S3
SR13.CR13
P5
S4
SR16·CR16
P6

c14

0

4.0

c13

0

PR21 +RB21

22

4.0

c13

1

SR22.CR22

23

4.0

c13

0

PR24+RB24 SR22.CR22

Start core calculation
S 1 is input to core calc.

Sl
P3

20

SN74ACT8847

S

P1

PR21 +RB21

(11

C

RA4.PR4

4.0
c13 LSH

P

REG REG REG BUS

RA2.RB2

4.0

c13 MSH

ALU
PIPE

X is the input

19

21

co

RB
REG

4

8

i"

RA
REG

00)

S5
SR19.CR19
P7
S6

Lv88.L~nfvLNS

....~
co
0)

Table 59. Pseudocode for Chebyshev ArcSine and ArcCosine Routine (PIPES2-0 = 010, RND1-0 = 00) (Continued)
ClK
24

DA
BUS
c12 MSH

DB
BUS

RA
REG

c12 lSH

4.0

RB
REG

ClK
MODE

INSTR

c12

0

PR24+RB24

25

4.0

c12

1

SR25.CR25

26

4.0

cl2

0

PR27+RB27

4.0

27

c11 MSH

cll LSH

2S
29
30

clO MSH

clO LSH

c11

0

PR27+RB27

4.0

cll

1

SR2S.CR2S

4.0

cll

0

PR30+RS30

4.0

ClO

0

PR30+RS30
SR31·CR3'1

31

4.0

ClO

1

32

4.0

ClO

0

PR33+RB33

4.0

C9

0

PR33+RB33

34

4.0

c9

1

SR34.CR34

35

4.0

c9

0

PR36+RB36

4.0

cs

0

PR36+RB36

37

4.0

cs

1

SR37.CR37

3S

4.0

cs

0

PR39+RS39

4.0

33

36

c9 MSH

Cs MSH

c9 LSH

Cs LSH

c7

0

PR39+RB39

4.0

c7

1

SR40·CR40

4.0

c7

0

PR42+RS42

4.0

c6

0

PR42+RB42

43

4.0

c6

1

SR43.CR43

44

4.0

c6

0

PR45+RB45

4.0

c5

0

PR45+RB45

46

4.0

c5

1

SR46.CR46

47

4.0

c5

0

PR4S+RB4S

4.0

c4

0

PR4S+RB4S

39

c7 MSH

c7 LSH

40
41
42

45

4S

c6 MSH

c5 MSH

c4 MSH

c6 LSH

c5 LSH

c4 LSH

MUl
PIPE

ALU
PIPE

P

C

S

y

REG REG REG BUS
PS
S7

SR25.CR25
P9
SS
SR2S.CR2S
P1Q
S9
SR31.CR31
Pll
SIO
SR34.CR34
P12
S11
SR37.CR37
P13
S12
SR40.CR40
P14
S13
SR43.CR43
P15
S14
SR46.CR46
P16

COMMENT

Table 59. Pseudocode for Chebyshev ArcSine and ArcCosine Routine (PIPES2-0 = 010, RND1-0 = 00) (Concluded)
ClK

DA
BUS

DB
BUS

RB
ClK
REG . MODE

. INSTR

49

4.0

c4

1

SR49*CR49

50

4.0

c4

0

PR51 +RB51

4.0

c3

0

PR51 +RB51

52

4.0

c3

1

SR52.CR52

53

4.0

c3

0

PR54+RB54

4.0

51

54

c3 MSH

c3 lSH

c2

0

PR54+RB54

55

4.0

c2

1

SR55.CR55

56

4.0

c2

0

PR57+RB57

4.0

c1

0

PR57+RB57

4.0

c1

1

SR58.CR58

57

c2 MSH

c1 MSH

c2 LSH

c1 LSH

58
59

4.0

c1

0

PR60+RB60.

LSH

4.0

cO

0

PR60+RB60

X MSH

X LSH

4.0

X

1

SR61.RB61

SORT(2)
MSH

SORT(2)
LSH

4.0

X

0

RA63.PR63

63

SORT
2

X

0

RA63.PR63

64

SORT
2

X

0

DUMMY

SORT
2

pi/2

1

RB66-PR66

67

SORT
2

pi/2

0

NOP

68

SORT
2

pi/2

0

NOP

60

Co

61
62

66

':"
CD
-.J

RA
REG

MSH

pi/2 MSH

Co

pi/2 LSH

MUl
PIPE

AlU
PIPE

P
REG

C
REG

S
REG

Y
BUS

COMMENT

S15
SR49.CR49
P17
S16
SR52.CR52
P18
S17
SR55.CR55
P19
S18
SR58"CR58
P20
Begin postprocessing

S19

SORT(2) is the real value
of square root of 2.0

SR61.RB61
P21

Instruction is doubleprecision RA + RB, prevents
ArcCosine from overwriting ArcSine result

RA63.PR63

P22

P22

Output LSH of ArcSine

S20

S20

Output MSH of ArcCosine

S20

S20

Output LSH of ArcCosine

Microcode Table for the ArcSine(x) and ArcCosine(x) Calculation

00

All numbers are in hex. Any field with a length that is not a multiple of 4 is right justified and zero filled. For the microcode
table, the value of X has been chosen to be 1/(SQRT(2.0)).

(I)

p

A

D
A

D
B

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

3FE6A09E
3FE6A09E
40100000
00000000
00000000
BFFOOOoo
3DA4A49F
00000000
3DC05DFE
00000000
00000000
3DCCF31 E
00000000
00000000
3DE86CDA
00000000
00000000
3E0768D9
00000000
00000000

667F3BCD
667F3BCD
00000000
00000000
00000000
00000000
8CCD9E73
00000000
52AAD200
00000000
00000000
26F94C8D
00000000
00000000
3C8CAEBO
00000000
00000000
F4E950EA
00000000
00000000

PEE C
B N N L
A B K
C

P C C S
I L 0 E
P K N L
EMF 0
SOl P
D G

F 0 0 _ 2
F 1 1 _ 2
F 00_ 2
F 1 0 _ 2
F 00_ 2
F 0 1 _ 2
F 0 1 _ 2
FOOI2
F 0 1 _ 2
F 00_ 2
F 00_ 2
F 0 1 _ 2
F 00_ 2
F 00_ 2
F 0 1 _ 2
F 00_ 2
F 00_ 2
F 0 1 _ 2
F 00_ 2
F 00_ 2

o
o
o
o
o
o
1

o
o
1

o
o
1

o
o
1

o
o
1

o

3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3

FF
FF
EF
EF
FB
FB
BF
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB

R H

E F
E A N L
S L C 0
E T
W
T
C

o
o
o
o
o
o
o

I
N
S
T
R

1CO
1CO
1CO
1CO
180
1
180
1
1CO
0 180
180
1
1
1CO
1
180
180
1
1CO
1
1
180
1
180
1
1eO
180
1
180
1CO
180

o

o
o
o
o
o
o
o
o
o
o
o
o

R F S B S T S 555
N A RYE E E E E E
D S C T L SLY S.C
T C EST Y
P T
000 0
0 0 0
000
0 0 0
000 0
0 0 0
0 0 0
000 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
000 0
0 0 0
000 0
0 0 0
0 0 0
0 0 0

o
o
o
o
o
o
o
o
o
o
o
o

o
o
o

3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3

3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3

000
000
000
000
000
000
000
000
000
000
0' 0 0
0 0 .
000
000
000
000
000
000
000
000

o

Microcode Table for the ArcSine(x) and ArcCosine(x) Calculation (Continued)
p
A

...';'I

(0
(0

D
A

F 3E2383A3
F 00000000
F 00000000
F 3E403E4B
F 00000000
F 00000000
F 3E5BAFC8
F 00000000
F 00000000
F 3E77E333
F 00000000
F 00000000
F 3E94E3A4
F.OOOOOOOO
F 00000000
F 3EB296DD
F 00000000
F 00000000
F 3EDOE913
F 00000000
F 00000000
F 3EEFA74E
F 00000000
F 00000000

D
B

7598FC80
00000000
00000000
2F65FODE
00000000
00000000
245ABDF8
00000000
00000000
3AFF1AB4
00000000
00000000
D4220C9C
00000000
00000000
4C084ACB
00000000
00000000
F5F9D496
00000000
00000000
896F8FA8
00000000·
00000000

PEE C
B N N L
A B K
C

P C C s
I L 0 E
P K N L
EMF 0
SOl P
D G

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2

SN74ACT8847

0 1
00_
0 0 _
0 1 _
00_
00_
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
0 0 _
0 0 ~

o
1
0
0
1

o
o
1
0
0
1
0
0
1
0
0
1
0
0
1
0
0
1
0

3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3

FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB

aa

R H

E F
E A N L
S L C 0
E T
W
T
C

1 1
1
1
1
1

o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o

N
S
T
R
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180

R F S B S T SO
N A RYE E E E E E
D S C T L SLY S C
T C EST Y
P T

o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o

0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0

0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
0 0 3
003
0 0 3
0 0 3
0 0 3
0 0 3

3
000
3
000
3
000
3
000
3
000
3 1 000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000

Microcode Table for the ArcSine(x) and ArcCosine(x) Calculation (Concluded)
p
A

0
A

0
B

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

3FOEC76B
00000000
00000000
3F2F9786
00000000
00000000
3F519B10
00000000
00000000
3F769689
00000000
00000000
3FA375CA
00000000
00000000
3FE7B204
3FE6A09E
3FF6A09E
00000000
00000000
00000000
3FF921FB
00000000
00000000

78320BB6
00000000
00000000
98C8B2E4·
00000000
00000000
87542073
00000000
00000000
5FFC05AO
00000000
00000000
6102988C
00000000
00000000
2301 0930
667F3BCO
667F3BCO
00000000
00000000
00000000
54442018
00000000·
00000000

PEE C
B N N L
A B K
C

P C C s
I L 0 E
P K N L
EMF 0
SOl P
o G

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2

0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
0 1 _
0 0 _
1 0 _
0 0 _
0 0 --.:.
0 1 ~
0 0 _
0 0 _

0
1
0
0
1
0
0
1
0
0
1
0
0
1
0
0
1
0
0
0
0
1
0
0

3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3

R H

E F
E A N L
S L C 0
E T
W
T
C

FB
9F
FB
FB
9F
FB
FB 1
9F 1
FB 1
FB 1
9F 1
FB 1
FB
9F
FB
FB
BF
EF
EF
FF
FF1
FB 1
FF 1
FF 1

1
1
1
1
1
1
1
1
1
1

1
1
1
,

1
1
1
1
1
1
1

1
1
1
1

o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o

I
N
S
T
R
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
1CO
lCO
180
300
183
300
300

R F S B S T S 000
N A RYE E E E E E
D S C T L SLY S C
T C EST Y
P T

o
o

0 0 0 3
0 0 0 3
000 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 003
0 0 0 3
0 003
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3

o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o

3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3 1 000
3 1 000
3 1 000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3 1 000
3 000 0
3 1 000
30 0 0 0

ArcTangent Routine Using Chebyshev's Method
All floating point inputs and outputs are double precision. The output is in radians.

Steps Required to Perform the Calculation
STEP 1 - Preprocessing; If the magnitude of the input, X, is greater than 1.0,
then the reciprocal must be taken. If the magnitude of X is not greater
than 1.0, then pass X. Let this number (either X or 1.0/X) be referred
to as X1. Next multiply X1 times 2.0, then multiply this resulting
number by X1. Finally, subtract 1.0 from this last product.
If |X| > 1.0
Then X1 ← 1.0/X
Else X1 ← X
X2 ← X1*2.0*X1 - 1.0

STEP 2 - Core Calculation; X2 in Step 1 will be referred to as 'x' in the core
calculation.
X3 ← Cseries_atan
← ((((((((((((((((((c19*x + c18)*x + c17)*x + c16)*x + c15)*x +
c14)*x + c13)*x + c12)*x + c11)*x + c10)*x + c9)*x
+ c8)*x + c7)*x + c6)*x + c5)*x + c4)*x + c3)*x + c2)*x
+ c1)*x + c0

STEP 3 - Postprocessing; multiply the output of the core calculation times X1.
Let this number be referred to as X4. The next computation will yield
the answer. If X was greater than 1.0, then subtract X4 from pi/2.
If X was less than -1.0, then subtract X4 from -pi/2. If neither of
the two conditions above is true, then X4 is the answer.
X4 ← X3*X1
If X > 1.0
Then ArcTangent(X) ← pi/2 - X4
Else If X < -1.0
Then ArcTangent(X) ← -pi/2 - X4
Else ArcTangent(X) ← X4
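
The preprocessing and postprocessing around the core series can be sketched in Python to show how the three cases fit together. This is only an illustration under stated assumptions: the names core_series_stand_in and arctangent are hypothetical, and the core series is replaced by a stand-in computed from math.atan, since the sketch does not use the Chebyshev constants for this routine.

```python
import math

def core_series_stand_in(x2):
    """Hypothetical stand-in for Cseries_atan: atan(X1)/X1 with X2 = 2*X1*X1 - 1."""
    x1 = math.sqrt(max(0.0, (x2 + 1.0) / 2.0))
    return 1.0 if x1 == 0.0 else math.atan(x1) / x1

def arctangent(x, core=core_series_stand_in):
    """Model of Steps 1 to 3; 'core' plays the role of the Step 2 series."""
    # Step 1: take the reciprocal when the magnitude of X exceeds 1.0.
    x1 = 1.0 / x if abs(x) > 1.0 else x
    x2 = x1 * 2.0 * x1 - 1.0
    # Step 2: core calculation.
    x3 = core(x2)
    # Step 3: select the result according to the original input.
    x4 = x3 * x1
    if x > 1.0:
        return math.pi / 2.0 - x4
    if x < -1.0:
        return -math.pi / 2.0 - x4
    return x4
```

Passing the core series as a parameter keeps the decision logic, which the system must supply, separate from the polynomial evaluation performed by the 'ACT8847.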


Algorithms for the Three Steps
Step 1 perform the preprocessing:
If |X| > 1.0
Then T1 ← 1.0/X               T1 is X1 in Step 1, must be stored
     T2 ← T1*2.0              externally
     T3 ← T2*CREG             CREG ← T1
     T4 ← T3 - 1.0
Else T1 ← X
     T2 ← T1*2.0
     T3 ← T2*T1
     T4 ← T3 - 1.0

Step 2 perform the core calculation:
T5  ← c19*CREG                CREG ← T4
T6  ← T5 + c18
T7  ← T6*CREG
T8  ← T7 + c17
T9  ← T8*CREG
T10 ← T9 + c16
T11 ← T10*CREG
T12 ← T11 + c15
T13 ← T12*CREG
T14 ← T13 + c14
T15 ← T14*CREG
T16 ← T15 + c13
T17 ← T16*CREG
T18 ← T17 + c12
T19 ← T18*CREG
T20 ← T19 + c11
T21 ← T20*CREG
T22 ← T21 + c10
T23 ← T22*CREG
T24 ← T23 + c9
T25 ← T24*CREG
T26 ← T25 + c8
T27 ← T26*CREG
T28 ← T27 + c7
T29 ← T28*CREG
T30 ← T29 + c6
T31 ← T30*CREG
T32 ← T31 + c5
T33 ← T32*CREG
T34 ← T33 + c4
T35 ← T34*CREG
T36 ← T35 + c3
T37 ← T36*CREG
T38 ← T37 + c2
T39 ← T38*CREG
T40 ← T39 + c1
T41 ← T40*CREG
T42 ← T41 + c0
Step 3 perform the postprocessing:
T43 ← T42*T1
If X > 1.0                    CREG ← T43
Then ArcTangent(X) ← pi/2 - CREG
Return
If X < -1.0
Then ArcTangent(X) ← -pi/2 - CREG
Return
ArcTangent(X) ← CREG

Required System Intervention
As seen in the algorithm for Step 1, the 'ACT8847 performs a compare. The results
of this compare determine what kind of preprocessing is to be performed. In Step 3,
there are two more compare operations. The system must therefore perform additional
decision making. In addition, the system must store T1, and later (in the postprocessing)
provide this value to the 'ACT8847.

Number of 'ACT8847 Cycles Required to Calculate ArcTangent(x)
Calculation of ArcTangent(x) requires at most 89 cycles (including the divide
instruction). In addition, it is assumed that 15 additional cycles are required due to
the compare instructions, and resulting system intervention. Therefore, the total number
of cycles to perform the ArcTangent(x) calculation is 104.

 1.0 then execute

80

X MSH

X LSH

20rX

Tl

0

83 through 86, otherwise
skip to 83b. In either case

COMPARE
X,1.0
--

--

-----

--'--

,_execute 80_through 82

Table 60. Pseudocode for Chebyshev ArcTangent Routine (PIPES2-0 = 010, RND1-0 = 00) (Concluded)
ClK

C

S

y

REG

REG

BUS

REG

RB
REG

ClK
MODE

1.0 MSH

1.0 LSH

X

1.0

0

82

X

1.0

0

NOP

83

X

1.0

0

RB84-CR84

X

pi/2

0

RB84-CR84

85

X

pi/2

0

NOP

S21a

S21a

86

X

pi/2

0

NOP

S21a

S21a

0

COMPARE
-1.0,X

81

84

pi/2 MSH

pi/2 LSH

INSTR
COMPARE

AlU
PIPE

P

DB
BUS

RA

MUl
PIPE

REG

DA
BUS

COMMENT

P23

X,l.0

Wait for system response

P23

Execute if X

Output MSH of answer
Output LSH of answer
The calculation is done
Execute if X

83b -1.0 MSH -1.0 lSH

X

1.0

0

NOP

P23

0

RB87-CR87

-pi/2

0

RB87-CR87

-1.0

-pi/2

0

NOP

S21b

89b

-1.0

pi/2

0

NOP

S21b

86c

-1.0

X

1

PASS(CR86)

87c

-1.0

X

0

NOP

88c

-1.0

X

0

NOP

0

85b

-1.0

X

86b

-1.0

X

-1.0

88b

87b

X LSH

-pi/2

-pi/2

MSH

LSH

1.0.

skip to 86c. In either case
execute 83b thru 85b
P23

X

X MSH

s

If - 1.0 > X then execute
86b through 89b, otherwise

COMPARE
-1.0,X

-1.0

84b

> 1.0

Wait for system response
Execute if - 1.0

> X

S21b Output MSH of answer
S21b

Output LSIo-l of answer.
The calculation is done.
Execute if X is within the
range [- 1 ,11, Pass CREG

"r:..,

oCO

SN74ACT8847

_._-

S21c

S21c

Output MSH of answer

S21c

S21c

Output LSH of answer


Microcode Table for the ArcTangent(x) Calculation

~

0

All numbers are in hex. Any field with a length that is not a multiple of 4 is right justified and zero filled. For the microcode
table, the value of X has been chosen to be SQRT(3.0).
p

A

D
A

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

3FFOOOOO
3FFBB67 A
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
40000000
00000000
00000000
00000000
00000000
BFFOOOOO
BDC4D6CC

D
B

00000000
E8584CAB
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
6308553F

PEE C
B N N L
A B K
C

P C C S
I L 0 E
P K N L
EMF 0
SOl P
D G

F 0 0 _
F 1 1 _
F 0 0 _
F 00_
F 0 0 _
F 0 0 _
F 0 0 _
F 0 0 _
F 0 0 _
F 0 0 _
F 0 0 _
F 0 0 _
F 0 0 _
F 0 0 _
F 0 0 _
F 0 0 _
FlO _
F 0 O..r
F 0 0 _
F 0 0 _
F 0 1
F 0 1 _

2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2

0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1

3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3

FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
EF
EF
6F
6F
FB
FB
BF

R H

E F
E A N L
S L C 0
E T
W
T
C

o
o
o
1
1

1
1
1
1
1
1
1
1
1
1
1
1
1
1
1

1
1
1
1
1
1
1
0
1
1
1
1

N
S
T
R

18A
18A
300
OlEO
300
300
300
300
300
300
300
300
300
300
300
lCO
lCO
0 lCO
lCO
180
180
lCO

o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o

R F S B S T S 555
N A RYE E E E E E
D S C T L SLY S C
T C EST Y
P T

o
o
o
o
o

0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
00003
000 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 003
0 0 0 3
00003
0 0 0 3
00 0 3
0 0 0 3
00003
00103
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3

o
o
o
o
o
o
o
o
o
o
o
o

3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3

000
000
000
000
000
000
000
000
000
000
000
000
000
1 000
1'0 0 0
1 000
0 000
000
000
000
000
000

Microcode Table for the ArcTangent(x) Calculation (Continued)


p
A

D
A

D
B

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

00000000
3DDFFD56
00000000
00000000
BDE88078
00000000
00000000
3E040967
00000000
00000000
BE237C82
00000000
00000000
3E3F1358
00000000
00000000
BE587CD2
00000000
00000000
3E73D238
00000000
00000000
BE9028E9

00000000
FCFD2315
00000000
00000000
2D99D071
00000000
00000000
OCB71218
00000000
00000000
39249B77
00000000
00000000
EC1D6ACO
00000000
00000000
5F4AFBED
00000000
00000000
8BOB8A86
00000000
00000000
21 CA6A94

PEE C
B N N l
A B K
C

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F


0 O.r
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _

P C C s
I l 0 E
P K N l
EMF 0
SOl P
D G
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2

0
0
1
0
0
1
0

o
1
0

o
1
0
0
1
0
0
1
0
0
1
0
0

3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3

R H

E F
E A N l
S l C 0
E T
W
C
T

FB
FB
9F
FB 1
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB 1

o
1

I
N
S
T
R

0 180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180

o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o

R F S B S T S 555
N A RYE E E E E E
D S C T l SLY S C
T C EST Y
P T

o
o
o
o
o
o
o
o
o
o
o

0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 003
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
000 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3
0 003
0 003
0 0 0 3
0 0 0 3
0 0 0 3
0 0 0 3

o
o
o
o
o
o
o
o
o
o
o

3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3

000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000


Microcode Table for the ArcTangent(x) Calculation (Continued)

N

P
A

D
A

F 00000000
F 00000000
F 3EAA8149
F 00000000
F 00000000
F BEC5EDAD
F 00000000
F 00000000
F 3EE256E5
F . 00000000
F 00000000
F BEFF171F
F 00000000
F 00000000
F 3F1 ACFA9
F 00000000
F 00000000
F BF37A846
F 00000000
F 00000000
F 3F558DF7
F 00000000
F 00000000
F BF749B3E

D
B

00000000
00000000
97A38D4E
00000000
00000000
9A21FE5F
00000000
00000000
7BA07FAE
00000000
00000000
48FDF707
00000000
00000000
F95CAODF
00000000
00000000
4221 D994
00000000
00000000
. A83283C9
00000000
00000000
2E433683

PEE C
B N N L
A B K
C

P C C s
I L 0 E
P K N L
EMF 0
SOl P
D G

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2

00_
00_
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _
00_
0 0 _
0 1 _

1 3
3
0 3
1 3
0 3
0 3
1 3
0 3
0 3
1 3
0 3
0 3
1 3
0 3
0 3
1 3
0 3
0 3
1 3
0 3
0 3
1 3
0 3
o3

o

R H

E F
E A N L
S L C 0
E T
W
T
C

9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB 1

o
o
o
o
o
o

1
1
1
1
1 1
1 1
1
1
1
1
1 1

o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o

I
N
S
T
R

1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180
1CO
180
180

aaa

R F S B S T S
N A RYE E E E E E
D S C T L SLY S C
T C EST Y
P T

o
o
o
o
o
o
o

0 0 0 3
0 0 0 3
0 003
0 0 0 3
0 0 0 3
0 003
0 0 0 3
000 0 3
o0 0 0 3
o0 0 0 3
o 0 003
00003
o 0 003
o0 0 0 3
o 0 003
0 0 0 3
0 0 0 3
000 3
o 0 003
o0 0 0 3
0 0 0 3
o0 0 0 3
0 0 0 3
o0 0 0 3

o
o
o
o
o

3
000
3 1 000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3
000
3 1 000
3 1 000
3
000
3
000
3
000
3
000
3
000

Microcode Table for the ArcTangent(x) Calculation (Concluded)
p

A

D
A

D
B

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

00000000
00000000
3F955A30
00000000
00000000
BFBA 1494
00000000
00000000
3FEBDA 7A
3FE279A7
3FFBB67A
3FFOOOOO
00000000
00000000
3FF921FB
00000000
00000000

00000000
00000000
OBFB8078
00000000
00000000
C19FADD4
00000000
00000000
85BD40CB
4590331C
E8584CAB
00000000
00000000
00000000
54442D18
00000000
00000000

PEE C P C C
B N N L
L 0
A B K P K N
C EMF
SOl
D G
F 0 0 _ 2 1 3
F 0 0 _ 2 o 3
F 0 1
2 o 3
F 0 0 _ 2 1 3
F 0 0
2 o 3
F 0 1
2 o 3
F 0 0 _ 2 1 3
F 0 0 _ 2 o 3
2 o 3
F 0
F 0 1 _ 2 1 3
F 0 0 _ 2 o 3
F 1 1 _ 2 o 3
FOO.I2 o 3
F 0 0
2 o 3
F 0 1 _ 2 o 3
F 0 0 _ 2 o 3
F 0 0 _ 2 o 3


s RH

E F
E E A N L
L S L C 0
0 E T
W
P T
C
9F
FB
FB
9F
FB
FB
9F
FB
FB
BF
FF
FF
FF
F7
F7
FF
FF

o
o
o
o
o
o
o
o
o
o
o
o

I

N
S
T
R

R F S B S T S 000
N A RYE E E E E E
D S C T L SLY S C
T C EST Y
P T

1CO o 0 003
180 o 0 0 0 3
180 o 0 0 0 3
1CO o 0 0 0 3
180 o 0 0 0 3
180 o 0 0 0 3
1CO o 0 0 0 3
180 o 0 0 0 3
180 o 0 0 0 3
1CO o 0 0 0 3
182 o 0 0 0 3
1
182 o 0 0 0 3
o 0 300 0 0 1 0 3
o 183 o 0 003
o 183 o 0 0 0 3
o 300 o 0 0 0 3
o 300 o 0 0 0 3

3
000
3
000
3
000
3
000
3
000
3
000
000
3
3
000
3
000
3
000
000
3
3
000
3
000
3
000
3 1 000
3 1 000
300 0 0

Exponential Routine Using Chebyshev's Method
All floating point inputs and outputs are double precision.

Steps Required to Perform the Calculation
STEP 1 - Preprocessing; first multiply the input, X, by log2e (yielding X1). Next,
convert this product to an integer, using truncate mode (yielding X2).
Form the variable EX by adding 1024 to X2. EX is used in the
postprocessing part of the routine. Subtract 1023 from EX to find
the variable N (N is actually X2 incremented by 1). Convert N to a
floating point number (yielding X3). Subtract X1 from X3, multiply
this difference by 2.0, and then finally subtract 1.0. This last
computation is the input to the core routine.

X1 ← X*log2e
X2 ← TRUNC(X1)
EX ← 1024 + X2
N  ← EX - 1023
X3 ← DOUBLE(N)
X4 ← 2.0*(X3 - X1) - 1.0

STEP 2 - Core Calculation; X4 in Step 1 will be referred to as 'x' in the core
calculation.

X5 ← Cseries_exp = ((((((((((c11*x + c10)*x + c9)*x + c8)*x + c7)*x + c6)*x +
                   c5)*x + c4)*x + c3)*x + c2)*x + c1)*x + c0

STEP 3 - Postprocessing; multiply the output of the core calculation by 2^N.
To generate 2^N, perform the following: shift the variable EX (which
was calculated in Step 1) left logical 20 positions (bits). The resulting
bit pattern will be the double precision floating point representation
of 2^N. However, the 'ACT8847 will not at this point recognize the
bit pattern as a floating point number. So this number must be output
from the Y bus, and then input (declaring the input to be a double
precision floating point number) on the input bus. Now the 'ACT8847
will process 2^N as a double float, and so the core output, X5, can
be multiplied by 2^N to produce the final result. 'SLL' means to shift
left logical.

X6     ← EX SLL by 20 bits
Y bus  ← X6
DA bus ← Y bus
Exp(X) ← X5 * X6

Algorithms for the Three Steps
Step 1 perform the preprocessing:
T1 ← X*log2e          log2e entered as a constant
T2 ← INT(T1)          round controls set to truncate
T3 ← 1024 + T2        T3 is EX in Step 1, must be
                      stored externally, CREG ← T1
T4 ← T3 - 1023
T5 ← 1*T4             makes T4 available to A2 MUX
T6 ← DOUBLE(T5)       convert from integer to double
T7 ← T6 - CREG
T8 ← 2.0*T7
T9 ← T8 - 1.0         T9 is X4 in Step 1, the
                      input to the core routine

Step 2 perform the core calculation:
T10 ← c11*CREG        CREG ← T9
T11 ← T10 + c10
T12 ← T11*CREG
T13 ← T12 + c9
T14 ← T13*CREG
T15 ← T14 + c8
T16 ← T15*CREG
T17 ← T16 + c7
T18 ← T17*CREG
T19 ← T18 + c6
T20 ← T19*CREG
T21 ← T20 + c5
T22 ← T21*CREG
T23 ← T22 + c4
T24 ← T23*CREG
T25 ← T24 + c3
T26 ← T25*CREG
T27 ← T26 + c2
T28 ← T27*CREG
T29 ← T28 + c1
T30 ← T29*CREG
T31 ← T30 + c0

Step 3 perform the postprocessing:
T32    ← T3 SLL by 20 bits     Shift T3 20 bits left
Y bus  ← T32                   Output and then input T32
DA bus ← Y bus (= T32)         CREG ← T31
Exp(X) ← T32*CREG              Two cycles required to
                               input both halves of T32

Required System Intervention
The system is required to store the variable EX, and then later provide this variable.
In addition, the system is required to route the variable T32 (in Step 3) from the Y
bus to the DA bus.

Number of 'ACT8847 Cycles Required to Calculate Exp(x)
Calculation of Exp(x) requires 52 cycles. Since there are no decisions which the system
is required to perform, the total number of cycles to perform the Exp(x) calculation is 52.

Listing of the Chebyshev Constants (c's)
The constants are represented in IEEE double-precision floating point format.
c11 = BD45A7FC05D3B501
c10 = 3D957BFD2DBF487C
c9  = BDE351B821AC16D5
c8  = 3E2F5B0E17440879
c7  = BE769E51EE631E87
c6  = 3EBC8D7530548DD5
c5  = BEFEE4FD234A4926
c4  = 3F3BDB696E8987AC
c3  = BF741839EB88156E
c2  = 3FA5BE298ADF0369
c1  = BFCF5E46537AB906
c0  = 3FE6A09E667F3BCC
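
The three steps and the constant listing can be checked in software. The following Python sketch is not part of the original routine; it is a host-side model that assumes ordinary IEEE double arithmetic, and the names COEFF_HEX, two_to_the_n, and chebyshev_exp are hypothetical. It decodes the hex constants, follows Steps 1 through 3, and builds 2^N by placing EX in the exponent field (the "SLL by 20 bits" step). It is intended only for positive inputs whose EX stays within the normal exponent range.

```python
import math
import struct

# Chebyshev coefficients c11..c0 as listed above (IEEE double, hex).
COEFF_HEX = [
    "BD45A7FC05D3B501",  # c11
    "3D957BFD2DBF487C",  # c10
    "BDE351B821AC16D5",  # c9
    "3E2F5B0E17440879",  # c8
    "BE769E51EE631E87",  # c7
    "3EBC8D7530548DD5",  # c6
    "BEFEE4FD234A4926",  # c5
    "3F3BDB696E8987AC",  # c4
    "BF741839EB88156E",  # c3
    "3FA5BE298ADF0369",  # c2
    "BFCF5E46537AB906",  # c1
    "3FE6A09E667F3BCC",  # c0
]


def hex_to_double(h):
    """Decode a 16-digit hex string as a big-endian IEEE double."""
    return struct.unpack(">d", bytes.fromhex(h))[0]


COEFFS = [hex_to_double(h) for h in COEFF_HEX]
LOG2E = hex_to_double("3FF71547652B82FE")  # log2(e), as in the microcode table


def two_to_the_n(ex):
    """Step 3: shift EX left 20 bits into the most significant half."""
    msh = (ex << 20) & 0xFFFFFFFF
    return struct.unpack(">d", struct.pack(">II", msh, 0))[0]


def chebyshev_exp(x):
    # Step 1: preprocessing.
    x1 = x * LOG2E
    x2 = int(x1)              # truncate mode
    ex = 1024 + x2            # EX, kept for the postprocessing
    n = ex - 1023
    x3 = float(n)
    x4 = 2.0 * (x3 - x1) - 1.0
    # Step 2: core calculation (Horner evaluation, c11 first).
    acc = COEFFS[0]
    for c in COEFFS[1:]:
        acc = acc * x4 + c
    # Step 3: postprocessing, multiply by 2^N.
    return acc * two_to_the_n(ex)


if __name__ == "__main__":
    print(chebyshev_exp(6.25), math.exp(6.25))
```

For X = 6.25 the sketch yields EX = 1033 (hex 409), which agrees with the value 00000409 shown on the DA bus in the concluding microcode table.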

Pseudocode Table for the Exp(x) Calculation
Table 61. Pseudocode for Chebyshev Exponential Routine (PIPES2-0
ClK

DA
BUS

DB
BUS

1

X MSH

X LSH

2

L092e
MSH

Lo92e
LSH

X

ClK
MODE

INSTR

0

RA2.RB2

Lo92e

0

RA2.RB2

RB
REG

3

X

Lo92e

0

DP2I(PR4)

4

X

Lo92e

0

DP2I(PR4)

1024

Lo92e

0

RA5+SR5

-1023 Lo92e

0

RA6+SR6

5

1024

6

-1023

-1023

1

0

SR7.RB7

8

-1023

1

1

12DP(PR8)

9

-1023

1

1

SR9-CR9

-1023

2.0

1

SR10.RB10

-1023

2.0

0

PR12+RB12

0

PR12+RB12
SR13.RB13

7

10

1

2.0 MSH

2.0 LSH

11
12

-1.0 MSH -1.0 LSH -1023 -1.0

13

cll MSH

-1023

cll

1

14

-1023

cll

0

PR15 + RB15

15

-1023

cl0

0

PR15+RB15

16

-1023

cl0

1

SR16.CR16

17

-1023

cl0

0

PR18+RB18

-1023

cg

0

PR18+RB18

18
'-I

RA
REG

cl0 MSH

c9 MSH

cll LSH

cl0 LSH

Cg LSH

r:,

19

-1023

'-I

20

-1023

c9

1

SR19.CR19

c9

0

PR21 +RB21

MUl
PIPE

AlU
PIPE

010. RND1-01

P

C

S

y

REG

REG

REG

BUS

COMMENT
X is the input

RA2.RB2

Double-precision - integer
Pl
Pl

Sl
S2

S2

Store S2. which is the
variable EX. for use in
cycle 46

S3
P2

Integer - double-precision
S4
S5

SR10.RB10
P3
S6
SR13.RB13

S6
P4
S7

SR16.CR16
P5
S8
SR19.CR19

Start core calculation.
S6 is the input to the
core calculation


Table 61. Pseudocode for Chebyshev Exponential Routine (PIPES2-0 ... 010. RND1-0) (Continued)


ClK

DA
BUS

DB
BUS

RA
REG

RB
REG

ClK
MODE

INSTR

Cs MSH

cs LSH

-1023

Cs

0

PR21 +RB21

22

-1023

cs

1

SR22.CR22

23

-1023

cs

0

PR24+RB24

-1023

c7

0

PR24+RB24

25

-1023

c7

1

SR25.CR25

26

-1023

c7

0

PR27+RB27

-1023

c6

0

PR27+RB27

2S

-1023

C6

1

SR2S.CR2S

29

-1023

c6

0

PR30+RB30

-1023

21

24

27

c7 MSH

c6 MSH

c7 LSH

c6 LSH

c5

0

PR30+RB30

-1023

c5

1

SR31.CR31

-1023

c5

0

PR33+RB33

-1023

c4

0

PR33+RB33

34

-1023

c4

1

SR34.CR34

35

-1023

c4

0

PR36+RB36

-1023

c3

0

PR36+RB36

37

-1023

c3

1

SR37.CR37

3S

-1023

c3

0

PR39+ RB39

-1023

c2

0

PR39+RB39

40

-1023

c2

1

SR40·CR40

41

-1023

c2

0

PR42+RB42

-1023

cl

0

PR42+RB42

cl

1

SR43.CR43

30

c5 MSH

c5 LSH

31
32
33

36

39

42
43

c4 MSH

c3 MSH

c2 MSH

cl MSH

c4 LSH

c3 LSH

c2 LSH

cl LSH

-1023

MUl
PIPE

ALU
PIPE

P

C

S

y

REG

REG

REG

BUS

P6
S9
SR22.CR22
P7
S10
SR25.CR25
PS
Sll
SR2S.CR2S
P9
S12
SR31.CR31
Pl0
S13
SR34.CR34
Pll
S14
SR37.CR37
P12
S15
SR40.CR40
P13
S16

COMMENT

Table 61. Pseudocode for Chebyshev Exponential Routine (PIPES2-0 - 010, RND1-0) (Concluded)
ClK

DA
BUS

DB
BUS

44
45

46

co MSH

S2

47
48

49

S18

20

RB
REG

ClK
MODE

INSTR

MUl
PIPE

-1023

c1

0

PR45+RB45

SR43*CR43

-1023

Co

0

PR45+ RB45

S2

20

0

SLL
RA46,RB46

S2

20

0

NOP

S2

20

0

RA48.CR48

AlU
PIPE

P
REG

C
REG

S
REG

y

BUS

COMMENT

P14
Begin post processing.
S2 is the variable EX, and
was calculated in cycle 5.
Shift left logical S2
20 bit positions
S17

S18

S18

Allows time for S18 to be
output from the Y bus and
input to the DA bus
RA holds S18', which is
the double precision
floating point equivalent
of 2 N, where N was
calculated in cycle 6

S18'

20

0

RA48.CR48

S18'

20

0

DUMMY

51

S18'

20

0

NOP

P15

P15

Output MSH of answer

52

S18'

20

0

NOP

P15

P14

Output LSH of answer

50

0

Co lSH

RA
REG


Instruction is RA + RB, used
to allow time for result
to propagate to Y bus

RA48.CR48


Microcode Table for the Exp(x) Calculation
All numbers are in hex. Any field with a length that is not a multiple of 4 is right justified and zero filled. For the microcode
table, the value of X has been chosen to be 6.25.
p
A

D
A

D
B

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

40190000
3FF71547
00000000
00000000
00000400
FFFFFC01
00000000
00000000
00000000
40000000
00000000
BFFOOOOO
BD45A7FC
00000000
3D957BFD
00000000
00000000
BDE351B8
00000000
00000000
3E2F5BOE

00000000
652B82FE
00000000
00000000
00000000
00000000
00000001
00000000
00000000
00000000
00000000
00000000
05D3B501
00000000
2DBF487C
00000000
00000000
21AC16D5
00000000
00000000
17440879

PEE C
B N N L
A B K
C

P C C s
I L 0 E
P K N L
EMF 0
SOl P
D G

F 0 0 _ 2 0 3
F 1 1
2 0 3
F 0 0 _ 2 0 3
F 0 0 _ 2 0 3
F10...r201
F10_201
F01_201
F 00_ 2 1 3
F 00_ 2 1 3
F 0 1 _ 2 1 3
F 0 0 _ 2 0 3
F 0 1 _ 2 0 3
F 0 1 _ 2 1 3
F 0 OS 2 0 3
2 0 3
F 0 1
F 00_ 2 1 3
F 00_ 2
3
F 0 1 _ 2 0 3
F 00_ 2 1 3
F 0 0 _ 2 0 3
F 0 1 _ 2 0 3

o

FF
FF
FB
FB
FE
FE
BF
FB
F6
BF
FB
FB
BF
FB
FB
9F
FB
FB
9F
FB
FB

R H

E F
E A N L
S L C 0
E T
W
T
C

o
o
o
o

1
1
1
1
1
1
1
1
1

N
S
T
R

1CO
1CO
1 1
1A3
1A3
1 1
1 0 0 200
1 1
200
1 1
240
1A2
1
183
1CO
180
180
1
1CO
0 180
1
180
1CO
180
180
1CO
1
1
180
1
1
180
1

o

o
o
o
o
o
o
o
o

o
o
o
o
o
o
o

R F S B S T S 000
N A RYE E E E E E
D S C T L SLY S C
T C EST Y
P T

o
o

0 003 3
0 003 3
0 0 3 3
1 0 0 0 3 3 1
0010331
0 0 0 3 3 1
0 0 0 3 3 1
0 0 0 3 3 1
0 0 0 3 3 1
0 0 0 3 3
0 0 0 3 3
0 0 0 3 3
0 0 0 3 3
0 0 0 3 3
0 0 0 3 3
0 0 0 3 3
0 0 0 3 3
0 003 3
0 003 3
0 0 0 3 3
00003 3

o
o
o
o
o
o
o
o
o
o
o
o
o
o
o

o

000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000
000

Microcode Table for the Exp(x) Calculation (Continued)
p


A

D
A

F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F

00000000
00000000
BE769E51
00000000
00000000
3EBC8D75
00000000
00000000
BEFEE4FD
00000000
00000000
3F3BDB69
00000000
00000000
BF741839
00000000
00000000
3FA5BE29
00000000
00000000
BFCF5E46
00000000
00000000
3FE6A09E

D
B

00000000
00000000
EE631E87
00000000
00000000
30548DD5
00000000
00000000
234A4926
00000000
00000000
6E8987 AC
00000000
00000000
EB88156E
00000000
00000000
8ADF0369
00000000
00000000
537 AB906
00000000
00000000
667F3BCC

PEE C P C C
B N N L I L 0
A B K P K N
C EMF
SOl
D G
F 00_ 2 1 3
F 00_ 2 o 3
F 0 1
2 0 3
F 00_ 2 1 3
F 0 0 _ 2 0 3
F 0 1 _ 2 o 3
F 00_ 2 1 3
F 00_ 2 o 3
F 0 1 _ 2 o 3
F 00_ 2 1 3
F 0 0 _ 2 0 3
F 0 1 _ 2 o 3
F 00_ 2 1 3
F 00_ 2 o 3
F 0 1
2 o 3
F 00_ 2 1 3
F 0 0 _ 2 0 3
F 0 1 _ 2 0 3
F 00_ 2 1 3
F 0 0 _ 2 0 3
F 0 1 _ 2 0 3
F 00_ 2 1 3
F 00_ 2 o 3
F 0 1 _ 2 o 3


s R H E F
E E A N L
L S L C 0
0 E T
W
P T
C
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB
9F
FB
FB

1

1
1 1
1
1
1
1 1 1
1
1 1 1
1
1
1
1 1

o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o

N
S
T
R

R F S B S T S 000
N A RYE E E E E E
D S C T L SLY S C
T C EST Y
P T

1CO o 0 0 0 3 3
000
180 000 0 3 3
000
180 o 0 0 0 3 3 1 000
1CO o 0 0 0 3 3 1 000
180 o 0 0 0 3 3
000
180 o 0 0 0 3 3
000
1CO o 0 0 0 3 3
000
180 o 0 0 0 3 3
000
180 o 0 0 0 3 3
000
1CO o 0 0 0 3 3
000
180 o 0 0 0 3 3
000
180 o 0 0 0 3 3 1 000
1CO o 0 0 0 3 3
000
180 o 0 0 0 3 3
000
180 o 0 0 0 3 3
000
1CO o 0 0 0 3 3
000
180 o 0 0 0 3 3
000
180 o 0 0 0 3 3 1 000
1CO o 0 0 0 3 3
000
000
180 o 0 0 0 3 3
180 o 0 0 0 3 3
000
1CO o 0 0 0 3 3 1 000
000
180 o 0 0 0 3 3
000
180 o 0 0 0 3 3


Microcode Table for the Exp(x) Calculation (Concluded)
P
A

D
A

F
F
F
F
F
F
F

00000409
00000000
40900000
00000000
00000000
00000000
00000000

D
B

00000014
00000000
00000000
00000000
00000000
00000000
00000000

PEE C
B N N L
A B K
C

F
F
F
F
F
F
F

P C C s
I L 0 E
P K N L
EMF 0
SOl P
D G

11_201
0 1 .J 2 0 3
0 0 _
2 ·0 2
1 0 _
2 0 2
0 0 _
2 0 3
0 0 _
2 0 3
0 0 _
2 0 3

R H

E F
E A N L
S L C 0
E T
W
T
C

FF
FF
DF
DF 1
FF
FF
FF 1

1

o

1 1
1
1
1 1
1 1

o
o
o
o
o

o

I
N
S
T
R

228
0 300
1CO
1CO
180
300
300

R F S B S T S 000
N A RYE E E E E E
D S C T L SLY S C
T C EST Y
P T

o
o
o
o
o
o
o

0
0
0
0
0
0
0

0
0
0
0
0
0
0

0
0
0
0
0
0
0

3
3
3
3
3
3
3

3
000
3
000
3
000
3
000
3 1 000
3 1 000
3 0 000

High-Speed Vector Math and 3-D Graphics
Introduction
Texas Instruments SN74ACT8837 and SN74ACT8847 floating point units (FPU) are
designed to execute high-speed, high-accuracy mathematical computations. The
devices are especially suited for matrix manipulations such as those used in graphics or
digital signal processing. These FPUs multiply and add data elements by executing
sequences of microprogrammed calculations to form new matrices. Each device may be
configured for either single- or double-precision operation. Single-precision operation is
assumed throughout this report.
The 'ACT8847 is a functional superset of the 'ACT8837 and operates at higher clock
rates (up to 33 MHz) than the 16-MHz '8837. Unlike the 'ACT8837, the 'ACT8847 can
perform integer and logical operations and has built-in, hardwired algorithms for division
and square root operations.
This application report outlines the timing, data flow, and programming for several
common data vector calculations and matrix transformations. Further, it illustrates some
of the programming "tricks" resulting in fastest operation. Throughout, this document
compares the timing schemes for programs in which all registers, including the ALU and
multiplier internal pipeline registers, are enabled ("pipelined" mode) with those for
equivalent programs in which the internal pipeline registers are disabled ("unpiped"
mode). Equations are provided to help the programmer select the more efficient mode,
and performance figures are included for both devices, with times given for 15-MHz and
30-MHz operations.
This report begins by covering simple vector arithmetic operations, which are
categorized as "computational" or "compare" functions for convenience. This document
then compares these operations as they are used in graphics applications to perform
three-dimensional coordinate transformations, perspective viewing, and clipping.

SN74ACT8837 and SN74ACT8847 Floating Point Units


Both the 'ACT8837 and 'ACT8847 floating point units (FPU) combine a multiplier and an
arithmetic-logic unit (ALU) in a single microprogrammable VLSI device. These devices
are implemented in TI's advanced one-micron CMOS technology and are fully
compatible with the IEEE standard for binary floating point arithmetic, STD 754-1985, for
either single- or double-precision operation.


Instruction inputs can select independent ALU operation, independent multiplier
operation, or simultaneous ALU/multiplier operation. Each FPU can handle three types
of data input formats. The ALU accepts data operands in integer format or IEEE floating
point format. In the 'ACT8837, integers are converted to normalized floating point
numbers with biased exponents prior to further processing. A third type of operand,
denormalized numbers, can also be processed after the ALU has converted them to
"wrapped" numbers, which are explained in detail in the SN74ACT8800 Family Data
Manual. The 'ACT8837 multiplier operates only on normalized floating point numbers or
wrapped numbers. The 'ACT8847 multiplier also operates on integer operands.
Data enters the 'ACT8837 or 'ACT8847 through two 32-bit data buses, DA and DB (see
Figures 74 and 75), which can be configured to operate as a single 64-bit data bus for
double-precision operations. Data can be latched in a 64-bit temporary register or
loaded directly into the input registers, RA and RB, which pass data to the multiplier and
ALU.
A clock-mode control allows the temporary register to be clocked on the rising or falling
edge of the clock to support double-precision ALU operations at the same rate as
single-precision operations. Using the temporary register, double-precision numbers on a
single 32-bit input bus can be loaded in one clock cycle.
The input registers RA and RB are the first of three levels of internal data registers.
Additionally, the ALU and multiplier each have an internal pipeline register and an output
register. The ALU's output register is denoted by "S" (sum), and the multiplier's output
register is denoted by "P" (product). Any or all of these internal registers may be
bypassed.
A 64-bit constant register (C) with a separate clock is provided for temporary storage of a
multiplier result, ALU result, or constant for feedback to the multiplier and ALU. An
instruction register and a status register are also included.
Four multiplexers select the multiplier and ALU operands from the input, C, S, or
P registers. Results are output on the 32-bit Y bus; a Y output multiplexer selects the
most or least significant half of the result for output.
In addition to add, subtract, and multiply functions, the 'ACT8837 can be programmed to
perform floating point division using a Newton-Raphson algorithm. Absolute value
conversions, floating point-to-integer and integer-to-floating point conversions, and a
compare instruction are also available.


The 'ACT8847 FPU is fully compatible with IEEE Standard 754-1985 for addition,
subtraction, multiplication, division, square root, and comparison. The 'ACT8847 FPU
also performs integer arithmetic, logical operations, and logical shifts. Additionally,
absolute value conversions and floating point-to-integer and integer-to-floating point
conversions are available.

Figure 74. SN74ACT8837 Floating Point Unit
[Block diagram: DA31-DA0 and DB31-DB0 input buses with parity check, temporary
register, RA and RB input registers, multiplier core and ALU with pipeline registers,
product (P) and sum (S) registers, C register, instruction and status registers,
master/slave compare, and Y31-Y0 output with parity generate.]

Figure 75. SN74ACT8847 Floating Point Unit
[Block diagram: DA31-DA0 and DB31-DB0 input buses with parity check, temporary
register, RA and RB input registers, multiplier core and ALU with pipeline registers,
product (P) and sum (S) registers, C register, instruction and status registers,
master/slave compare, and Y31-Y0 output with parity generate.]

For both the 'ACT8837 and 'ACT8847, the ALU and multiplier can operate in parallel to
perform sums of products and products of sums. Detailed information regarding the
instruction inputs for the various 'ACT8837 and 'ACT8847 configurations and operations
is given in the SN74ACT8800 Family Data Manual.

Mathematical Processing Applications
TI's SN74ACT8837 and SN74ACT8847 high-speed floating point units (FPU) are
designed to perform high-accuracy, computationally-intensive mathematical operations.
In particular, these FPUs can meet the computational demands of high-end graphics
workstations and advanced signal processing. Both applications involve repetitive
computations on arrays of data typically expressed as vector arithmetic operations.
For example, the calculation of the sum of products, or multiply-accumulate function, is
frequently used in both signal and graphics processing. In general form, the sum of
products equation is:
S = \sum_{i=1}^{n} k_i x_i, for coefficients k_i and data x_i.

This sum of products is the central function involved in multiplying matrices. Such
matrices might represent a system of linear differential equations or the geometrical
transformation of a graphic object. Specifically, an n x n matrix A multiplied by an n x m
matrix B yields an n x m matrix C whose elements Cij are given by:

c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}, for i = 1, ..., n and j = 1, ..., m.

The 'ACT8837 and 'ACT8847 are designed to handle efficiently this kind of parallel
multiplication and addition.
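
The relationship between the sum of products and matrix multiplication can be sketched in software. The Python fragment below is only an illustration with hypothetical function names (sum_of_products, matmul); it models the arithmetic, not the FPU's chained multiplier/ALU timing.

```python
def sum_of_products(k, x):
    """Multiply-accumulate: S = sum of k[i] * x[i]."""
    acc = 0.0
    for ki, xi in zip(k, x):
        acc += ki * xi   # one multiply and one add per element
    return acc


def matmul(a, b):
    """Multiply an n x n matrix a by an n x m matrix b.

    Each element c[i][j] is the sum of products of row i of a
    with column j of b.
    """
    n = len(a)
    m = len(b[0])
    return [
        [sum_of_products(a[i], [b[k][j] for k in range(n)]) for j in range(m)]
        for i in range(n)
    ]


if __name__ == "__main__":
    print(matmul([[1, 2], [3, 4]], [[5, 6, 7], [8, 9, 10]]))
```

Every output element costs n multiplies and n additions, which is why a device that overlaps the multiply and the add suits this workload.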

Graphics Applications
The basic principle of graphics processing is that any object can be reduced to a
combination of points, lines, and polygons and then defined as a collection of points in
three-dimensional space. Because points, planes, transformation matrices and other
common data structures are vectors, most of the computations involved in graphics
processing are vector operations.


Computations for a 3-D graphics display are highly involved due to the complexity
introduced by the z-axis. Viewing an object from a particular perspective involves
transforming the object's world coordinates, or its coordinates in the model space, into
viewing, or eyepoint, coordinates. A series of translations and rotations map the viewing
system axes onto the world coordinate axes. Each individual point must be translated,
rotated and, if necessary, scaled in a proper order. Once the coordinate transformation is
complete, the coordinates are clipped to a viewing volume. Clipping algorithms employ
arithmetic operations to determine whether an object, or part of an object, is inside or
outside a pyramidal volume. Hidden surface routines may then be employed to delete
surfaces that fall behind a "nearer" surface from the viewer's perspective.
Matrix arithmetic is required for scaling, rotating, translating, or shearing an object, as
well as for the final process of projecting its visible parts to a two-dimensional frame
buffer. Any sequence of these transformations can be represented as a single matrix
formed by concatenating the matrices for the individual operations. The generalized
4 x 4 matrix for transforming a three-dimensional object is shown below, partitioned into
four component matrices, each of which produces a specific effect on the image. The
3 x 3 matrix produces linear transformation in the form of scaling, shearing, and rotation.
The 1 x 3 row matrix produces translation, while the 3 x 1 column matrix produces
perspective transformation with multiple vanishing points. The final single-element 1 x 1
matrix produces overall scaling.

Overall operation of the matrix T on the position vectors of a graphics object produces a
combination of shearing, rotation, reflection, translation, perspective, and overall
scaling.
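
The partitioning described above can be illustrated with a short Python sketch. It assumes a row-vector convention, in which the homogeneous position vector [x y z 1] is multiplied on the right by T, so the 1 x 3 translation row is the bottom row and the 3 x 1 perspective column is the right-hand column. That convention, the function names, and the final division by the fourth coordinate are assumptions made for illustration, not details taken from this report.

```python
def make_transform(linear, translation, perspective, scale):
    """Assemble a 4 x 4 matrix from its four partitions (assumed layout).

    linear:      3 x 3 list of rows (scaling, shearing, rotation)
    translation: 3 values forming the bottom 1 x 3 row
    perspective: 3 values forming the right-hand 3 x 1 column
    scale:       single overall-scaling element (bottom-right)
    """
    t = [list(linear[r]) + [perspective[r]] for r in range(3)]
    t.append(list(translation) + [scale])
    return t


def transform_point(point, t):
    """Apply T to the row vector [x, y, z, 1] and normalize the result."""
    h = list(point) + [1.0]
    out = [sum(h[r] * t[r][c] for r in range(4)) for c in range(4)]
    w = out[3]                      # homogeneous coordinate
    return [v / w for v in out[:3]]


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    t = make_transform(identity, [1.0, 2.0, 3.0], [0.0, 0.0, 0.0], 1.0)
    print(transform_point([1.0, 1.0, 1.0], t))
```

Each transformed point costs four sums of products of length four, so the work maps directly onto the multiply-accumulate operations discussed in the following sections.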

Vector Arithmetic
Programs that require repetitive computations on multiple sets of operands lend
themselves to vector-processing algorithms, in which the operands are viewed as
succeeding elements of long "data vectors." The next two sections outline the
programming for commonly-used vector operations. Most of these examples conclude
with a comparison of program timing for pipelined (internal pipeline registers enabled)
and unpiped (internal pipeline registers disabled) operation. For convenience, the
operations are labeled "computational," which includes simple and compounded adds,
multiplies, and divides, or "compare," which can be used to select maximum or minimum
values from succeeding pairs of numbers or from a list.

7-228

Computational Operations on Data Vectors
This section covers the following vector operations: vector add, vector multiply, vector
divide, sum of products (also called inner, scalar, or dot product), and product of sums.
Since matrix multiplication is composed of a sequence of sum of products operations,
these two functions are discussed in the same section. In some cases, a whole class of
operations is covered under one heading. For example, the vector add operation
includes sums and differences of Ai, Bi, |Ai|, and |Bi| in all combinations.

Vector Add
The vector add operation adds corresponding components of data vectors to obtain the
components of the output vector. Hence, for input vectors A and B and output vector Y,
each with N components,

Yi = Ai + Bi,    1 ≤ i ≤ N.

The 'ACT8837 and 'ACT8847 perform this calculation in unchained, independent ALU
mode.
Table 62 shows the contents of the data registers at successive clock cycles for N = 6
with the FPU operating in pipelined mode. Since the data travels by way of the internal
pipeline register, two cycles pass before the first sum appears in the S register. The
contents of the internal pipeline register are not given in the flow.

Table 62. Data Flow for Pipelined Single-Precision Vector Add, N = 6

CLK   1     2     3       4       5       6       7       8       9
RA    A1    A2    A3      A4      A5      A6
RB    B1    B2    B3      B4      B5      B6
S                 A1+B1   A2+B2   A3+B3   A4+B4   A5+B5   A6+B6
P
C
Y                 Y1      Y2      Y3      Y4      Y5      Y6
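
The two-cycle latency described for pipelined mode can be modeled with a short Python sketch. This is a simplified three-stage model (input registers, internal pipeline register, S register) written for illustration; the function name pipelined_vector_add is hypothetical, and the model ignores instruction inputs, status outputs, and the Y bus multiplexer.

```python
def pipelined_vector_add(a, b):
    """Return (clock, S register contents) pairs for a 3-stage model."""
    assert len(a) == len(b)
    n = len(a)
    ra = rb = None   # RA and RB input registers
    pipe = None      # internal ALU pipeline register
    s = None         # sum (S) output register
    trace = []
    for clk in range(1, n + 3):
        # On each clock edge, data advances one stage (back to front).
        s = pipe
        pipe = None if ra is None else ra + rb
        if clk <= n:
            ra, rb = a[clk - 1], b[clk - 1]
        else:
            ra = rb = None
        trace.append((clk, s))
    return trace


if __name__ == "__main__":
    for clk, s in pipelined_vector_add([1, 2, 3, 4, 5, 6],
                                       [10, 20, 30, 40, 50, 60]):
        print(clk, s)
```

In this model the first sum reaches S at clock 3 and the last at clock N + 2, so a vector of N elements occupies N + 2 clock cycles.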

Data transfers and operations for each clock cycle are summarized in the program listing
in Table 63. Detailed information on the instruction inputs required to perform each
operation is included in sections 5 and 7. Note that the selection of the output source (in
this case, the S register), which is determined by the I6 instruction bit, is programmed
along with the ALU or multiplier operation that generates the output.


Table 63. Program Listing for Pipelined Single-Precision Vector Add, N = 6

      REGISTER TRANSFERS        ALU OPERATION    MULTIPLIER OPERATION
1.    LOAD RA, RB;  Y ← S       ADD(RA,RB)
2.    LOAD RA, RB;  Y ← S       ADD(RA,RB)
3.    LOAD RA, RB;  Y ← S       ADD(RA,RB)
...
6.    LOAD RA, RB;  Y ← S       ADD(RA,RB)

Timing and programming are similar for other independent ALU operations involving two
operands, such as (A - B), (B - A), and compare (A,B). However, when the compare
function is used, two status bits must be generated before numeric values can be output
(see "Compare Operations on Data Vectors").
Because the vector add program closely parallels that for vector multiplication, pipelined
and unpiped modes for both vector add and multiply are compared in the next section.

Vector Multiply
The vector multiply operation multiplies corresponding elements of data vectors to
obtain the components of the output vector. Hence, for input vectors A and B and output
vector Y, each with N components,
Yi = Ai x Bi,    1 ≤ i ≤ N.

The 'ACT8837 and 'ACT8847 perform this calculation in unchained, independent
multiplier mode.

Pipelined Mode

Table 64 shows the contents of the data registers at successive clock cycles for N = 6
with the FPU operating in pipelined mode. The product may be replaced by a variety of
other independent multiplier operations, such as -(A x B), A x |B|, -(A x |B|), |A| x |B|,
and -(|A| x |B|). Data transfers and operations for each clock cycle are
summarized in the program listing in Table 65.

Table 64. Data Flow for Pipelined Single-Precision Vector Multiply, N = 6

RA    A1     A2     A3     A4     A5     A6     -      -      -
RB    B1     B2     B3     B4     B5     B6     -      -      -
S     -      -      -      -      -      -      -      -      -
P     -      -      A1xB1  A2xB2  A3xB3  A4xB4  A5xB5  A6xB6  -
C     -      -      -      -      -      -      -      -      -
Y     -      -      Y1     Y2     Y3     Y4     Y5     Y6     -
CLK   1      2      3      4      5      6      7      8      9

Table 65. Program Listing for Pipelined Single-Precision Vector Multiply, N = 6

      REGISTER TRANSFERS      ALU OPERATION    MULTIPLIER OPERATION
1.    LOAD RA, RB; Y ← P      -                MULT(RA,RB)
2.    LOAD RA, RB; Y ← P      -                MULT(RA,RB)
3.    LOAD RA, RB; Y ← P      -                MULT(RA,RB)
...
6.    LOAD RA, RB; Y ← P      -                MULT(RA,RB)

Unpiped Mode
Table 66 shows the contents of the data registers at successive clock cycles during a
vector multiply operation for N = 6 with the FPU operating in unpiped mode. The vector
add operation progresses similarly. Since there is no "single-clocked storage" in the
internal pipeline register, each product or sum is performed in one cycle.

Table 66. Data Flow for Unpiped Single-Precision Vector Multiply, N = 6

RA    A1     A2     A3     A4     A5     A6     -      -      -
RB    B1     B2     B3     B4     B5     B6     -      -      -
S     -      -      -      -      -      -      -      -      -
P     -      A1xB1  A2xB2  A3xB3  A4xB4  A5xB5  A6xB6  -      -
C     -      -      -      -      -      -      -      -      -
Y     -      Y1     Y2     Y3     Y4     Y5     Y6     -      -
CLK   1      2      3      4      5      6      7      8      9

Comparison of Pipelined and Unpiped Modes
For both vector add and vector multiply operations carried out in pipelined mode, results
are output to the Y bus on clocks 3, ... , N + 2. In unpiped mode, results are output to the
Y bus on clocks 2, ... , N + 1, thereby saving a cycle. Unfortunately, it is necessary to
operate at a lower clock rate in unpiped mode than in pipelined mode. The following
equation can be used to determine which of the two modes provides the faster
performance in a particular application. Pipelined operation is faster if:
(N + 2)/Fp < (N + 1)/Fu,

where Fp and Fu are the clock rates in pipelined and unpiped modes, respectively. As of
publication, pipelined mode provides faster performance for input vectors with N > 2.
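
The comparison can be checked numerically. The following Python sketch simply evaluates the inequality above; the function names and the clock rates used in the usage note are illustrative assumptions, not device specifications.

```python
def vector_op_cycles(n, pipelined):
    """Cycles until the last result of an N-element vector add or multiply
    reaches the Y bus: N + 2 in pipelined mode, N + 1 in unpiped mode."""
    return n + 2 if pipelined else n + 1


def pipelined_is_faster(n, f_pipelined_hz, f_unpiped_hz):
    """Evaluate (N + 2)/Fp < (N + 1)/Fu for the given clock rates in Hz."""
    t_pipelined = vector_op_cycles(n, True) / f_pipelined_hz
    t_unpiped = vector_op_cycles(n, False) / f_unpiped_hz
    return t_pipelined < t_unpiped
```

For example, with hypothetical clock rates of 15 MHz pipelined and 10 MHz unpiped, `pipelined_is_faster(6, 15e6, 10e6)` holds, while with equal clock rates the unpiped program wins because it needs one cycle fewer.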


Product of Sums
The product of sums operation adds corresponding elements of data vectors and
multiplies the resulting sums. For input vectors A and B, each with N components, the
product of sums operation yields a single output Y defined as follows:

Y = Π (Ai + Bi), where the product is taken over i = 1, ..., N.

The product of differences can be computed by simply making the ALU operation
(A - B) or (B - A). The 'ACT8837 and 'ACT8847 perform this calculation in chained
mode so that concurrent operation of the ALU and multiplier is possible. The data flow
and program listing for the product of sums are identical to those for the sum of products,
except that the roles of add and multiply are reversed. The criteria used to decide
between pipelined and unpiped modes are also identical to those previously given.
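
As a plain software reference for the result the chained program produces, a minimal Python sketch follows. It models only the arithmetic, not the register-level data flow, and the function name is an assumption made for illustration.

```python
def product_of_sums(a, b):
    """Return the product over i of (a[i] + b[i]) for equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("input vectors must have the same length")
    y = 1.0
    for ai, bi in zip(a, b):
        s = ai + bi   # ALU operation: sum of corresponding elements
        y = y * s     # multiplier operation: accumulate the running product
    return y
```

Replacing `ai + bi` with `ai - bi` gives the product of differences mentioned above.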

Vector Divide
The vector divide operation divides corresponding elements of data vectors to obtain the
components of the output vector. Hence, for vectors A and B and output vector Y, each
with N components,
Yi = Ai / Bi,    1 ≤ i ≤ N.

The 'ACT8837 and 'ACT8847 perform this calculation using the Newton-Raphson
iterative method. This algorithm, which is described in detail in the SN74ACT8800 Family
Data Manual, calculates the value of a quotient Y by approximating the reciprocal of the
divisor B and then multiplying the dividend A by that approximation.

The following sections review the vector divide programs for the 'ACT8837 and the
'ACT8847. In the 'ACT8847, the divide algorithm is built-in.

SN74ACT8837 Vector Divide
For division using single-element inputs A and B, the value of the reciprocal of B,
denoted by X, is determined iteratively using the following equation:
Xi+1 = Xi x (2 - B x Xi)

The seed approximation, X0, is assumed to be given. The iteration stops when X is
determined to the desired level of precision. Assuming the presence of a seed ROM
providing 4 bits of accuracy, three iterations are necessary to correctly determine a
single-precision result X. Given the seed X0 for 1/B, Xi+1 = Xi x (2 - B x Xi). A is
eventually multiplied by the value X3.
An 8-bit seed ROM is commonly employed and gives single-precision accuracy in only
two iterations and double-precision accuracy in three iterations. Instructions for
implementing an 8-bit seed ROM are included in the SN74ACT8800 Family Data Manual.
This example assumes that a 4-bit seed is used to develop the program.
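
The iteration can be illustrated with a short Python sketch. It uses ordinary double-precision arithmetic rather than the chip's registers, and the function names and the seed value in the usage note are assumptions chosen for illustration. Each pass roughly squares the relative error of the reciprocal, which is why a seed good to about 4 bits needs three passes for single precision.

```python
def refine_reciprocal(b, x0, iterations):
    """Newton-Raphson refinement of x ~ 1/b: X(i+1) = Xi * (2 - B * Xi)."""
    x = x0
    for _ in range(iterations):
        t = 2.0 - b * x   # Ti = (2 - B x Xi)
        x = x * t         # X(i+1) = Xi x Ti
    return x


def divide(a, b, x0, iterations=3):
    """Quotient a/b formed as a times the refined reciprocal of b."""
    return a * refine_reciprocal(b, x0, iterations)
```

For example, with b = 3.0 and a hypothetical seed x0 = 0.3125 (relative error 1/16), `divide(6.0, 3.0, 0.3125)` agrees with 2.0 to better than one part in 10^8 after three iterations.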

Pipelined Mode
The 'ACT8837 performs the vector divide in chained mode. Table 70 shows the data flow
for pipelined operation. The value of (2 - B x Xi) is denoted as Ti. Note that the value X3
does not appear, per se, in the table, but is expressed in terms of X2 to save
unnecessary calculations. The output Y is determined from the calculation of (A x X2)
x T2 in cycle 17, which is equivalent to A x X3, since X3 = X2 x T2.
In order to keep Xi available for the final calculation of Xi+1, a few programming "tricks"
are employed to keep the original value of each Xi within the chip while it is being altered
in the calculation of (2 - B x Xi). First, Xi is stored in the S register by adding 0 to it. Then,
when the S register is needed, Xi is moved to the P register by multiplying it by 1.

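Table 70. Data Flow for 'ACT8837 Pipelined Single-Precision
Vector Divide, N = 1

RA    X0     -      -      -      -      -      B      -      -      -
RB    B      -      -      -      -      -      -      -      -      -
S     -      -      X0     -      T0     -      -      -      X1     -
P     -      -      BxX0   -      X0     -      X1     -      BxX1   -
C     -      -      -      -      -      -      -      -      -      -
Y     -      -      -      -      -      -      -      -      -      -
CLK   1      2      3      4      5      6      7      8      9      10

RA    -      -      B      -      -      -      -      -      -      -
RB    -      -      -      -      A      -      -      -      -      -
S     T1     -      -      -      X2     -      T2     -      -      -
P     X1     -      X2     -      BxX2   -      AxX2   -      Y      -
C     -      -      -      -      -      -      -      -      -      -
Y     -      -      -      -      -      -      -      -      Y      -
CLK   11     12     13     14     15     16     17     18     19     20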

Data transfers and operations are summarized in the program listing in Table 71.
Because no operations begin on even-numbered cycles, only the odd-numbered clock
cycles are shown.

Table 71. Program Listing for 'ACT8837 Pipelined Single-Precision
Vector Divide, N = 1

      REGISTER TRANSFERS   ALU OPERATION   MULTIPLIER OPERATION
1.    LOAD RA, RB          ADD(RA,0)       MULT(RA,RB)
3.    -                    ADD(2,-P)       MULT(S,1)
5.    -                    -               MULT(S,P)
7.    LOAD RA              ADD(P,0)        MULT(RA,P)
9.    -                    ADD(2,-P)       MULT(S,1)
11.   -                    -               MULT(S,P)
13.   LOAD RA              ADD(P,0)        MULT(RA,P)
15.   LOAD RB              ADD(2,-P)       MULT(S,RB)
17.   Y ← P                -               MULT(S,P)

In steps 1, 7, and 13, 0 is added to Xi so that Xi appears two cycles later in the S register.
In steps 3 and 9, the Xi value in the S register is multiplied by 1 so that it appears in the P
register two cycles later. In step 15, Xi (from the S register) is multiplied by the dividend A
just input to RB.
Because no operations begin on even cycles, two vector divide operations may
be interleaved, calculating two quotients in 20 cycles. Table 72 shows the data flow
for computing two quotients, Y1 and Y2, where Y1 = A/B and Y2 = C/D. The
approximation for 1/B is denoted by Wi, and the approximation for 1/D is denoted by Xi.
Ti = (2 - B x Wi), and Qi = (2 - D x Xi).

Table 72. Data Flow for 'ACT8837 Pipelined Single-Precision Interleaved
Vector Divide, N = 2

RA    W0     X0     -      -      -      -      B      D      -      -
RB    B      D      -      -      -      -      -      -      -      -
S     -      -      W0     X0     T0     Q0     -      -      W1     X1
P     -      -      BxW0   DxX0   W0     X0     W1     X1     BxW1   DxX1
C     -      -      -      -      -      -      -      -      -      -
Y     -      -      -      -      -      -      -      -      -      -
CLK   1      2      3      4      5      6      7      8      9      10

RA    -      -      B      D      -      -      -      -      -      -
RB    -      -      -      -      A      C      -      -      -      -
S     T1     Q1     -      -      W2     X2     T2     Q2     -      -
P     W1     X1     W2     X2     BxW2   DxX2   AxW2   CxX2   Y1     Y2
C     -      -      -      -      -      -      -      -      -      -
Y     -      -      -      -      -      -      -      -      Y1     Y2
CLK   11     12     13     14     15     16     17     18     19     20


The program listing for an interleaved vector divide is similar to that for a single divide
operation, with functions listed in each odd line and duplicated in the next even line for
the second operation.
As previously stated, the time needed to compute two single-precision divide operations
starting with a 4-bit seed ROM is 20 clock cycles. Since a new pair of divides can start at
CLK = 19, the time required to perform the vector divide operation on two N-dimensional
vectors is given by the following equation:
TIME = [18 x CEILING(N/2) + 2] cycles,
where the ceiling function rounds to the next highest integer for fractional values. With an
8-bit seed ROM, the time reduces to [12 x CEILING(N/2) + 2] cycles, which equals
2.5 million divides per second at 15 MHz.
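
The cycle counts above can be tabulated with a few lines of Python. This is only a sketch of the two formulas as stated; the function names are illustrative, and the throughput figure ignores the fixed two-cycle pipeline drain.

```python
import math

# Cycles per interleaved pair of divides, keyed by seed ROM width in bits.
CYCLES_PER_PAIR = {4: 18, 8: 12}


def pipelined_divide_cycles(n, seed_bits=4):
    """TIME = [cycles_per_pair x CEILING(N/2) + 2] for an N-element divide."""
    return CYCLES_PER_PAIR[seed_bits] * math.ceil(n / 2) + 2


def divides_per_second(clock_hz, seed_bits=4):
    """Sustained rate for long vectors: two divides per interleaved pair."""
    return clock_hz / (CYCLES_PER_PAIR[seed_bits] / 2)
```

For N = 2 with a 4-bit seed the function returns the 20 cycles shown in Table 72, and at 15 MHz with an 8-bit seed the sustained rate is 2.5 million divides per second.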

Unpiped Mode
Table 73 shows the data flow for a vector divide in unpiped, chained mode.

Table 73. Data Flow for 'ACT8837 Unpiped Single-Precision
Vector Divide, N = 1

RA    X0     -      -      B      -      -      B      -      -      -
RB    B      -      -      -      -      -      -      A      -      -
S     -      X0     T0     -      X1     T1     -      X2     T2     -
P     -      BxX0   X0     X1     BxX1   X1     X2     BxX2   AxX2   Y
C     -      -      -      -      -      -      -      -      -      -
Y     -      -      -      -      -      -      -      -      -      Y
CLK   1      2      3      4      5      6      7      8      9      10

This program uses the same methods as the pipelined version to keep Xi within the chip.
The time needed to compute a vector divide of two N-element vectors is (9N + 1) cycles
with a 4-bit seed ROM and (6N + 1) cycles with an 8-bit seed ROM.

Comparison of Pipelined and Unpiped Modes

Using a 4-bit seed ROM, pipelined mode is faster if:

[18 x CEILING(N/2) + 2]/Fp < (9N + 1)/Fu,

where Fp and Fu are the clock rates in pipelined and unpiped modes. As of publication,
pipelined mode provides faster performance for input vectors with N > 1.

A General Principle
The vector divide example illustrates a general programming principle that should be
considered whenever a program begins a new instruction every other cycle. In cases
where the C register is not used, it is simple to interleave another program, even one not
performing the same function.
Interleaving programs is not as easy if the C register is used because the C register is the
only nonpiped register. However, even using the C register, programs may often be
interleaved by staggering one against the other so that their use of the C register does
not overlap in time. Many of the programs so far discussed can be thought of as two such
interleaved programs, with the C register being used to delay the first result until it can be
combined with the second. (See, for example, the sum of products operation.)

SN74ACT8847 Vector Divide
Since the 'ACT8847 has a built-in algorithm for divide, the microprogram is simpler
than that for the 'ACT8837. Table 74 shows the data flow for pipelined operation. Data
transfers and operations are summarized in the program listing in Table 75.

Table 74. Data Flow for 'ACT8847 Pipelined Single-Precision
Vector Divide
RA
RB

A1
B1

A2
B2

S
P
C
V
ClK

A1/B1
V1
1

2

4

3

5

6

7

8

9

10

Table 75. Program Listing for 'ACT8847 Pipelined Single-Precision
Vector Divide

      REGISTER TRANSFERS      ALU OPERATION    MULTIPLIER OPERATION
1.    LOAD RA, RB; Y ← P      -                DIVIDE
7.    LOAD RA, RB; Y ← P      -                DIVIDE
13.   LOAD RA, RB; Y ← P      -                DIVIDE

Note that the microinstructions are presented on the steps indicated (1, 7, 13, ...), with a
six-cycle lapse before the next operands can be input to RA and RB. Performing a vector
divide of two N-element single-precision vectors takes (6N + 2) cycles in pipelined
mode. M such pairs of vectors would require [6(N x M) + 2] cycles in pipelined mode. In
unpiped mode, the count is 7(N x M) cycles.

Compare Operations on Data Vectors
In independent ALU mode (unchained), two operands may be compared for equality
(A = B) and order (A > B). Additionally, the absolute values of either or both operands
may be compared. The compare function uses two status bits, the AGTB and AEQB
output signals. (When any operation other than a compare is performed, either
by the ALU or the multiplier, the AEQB signal is used as a zero detect. Hence, numerical
results cannot be output in the same cycle in which comparison status is output.)
For greatest efficiency, programs for compare operations should be written without
requiring conditional branches in the sequencer. If branches can be avoided, the
microcoding is simplified and the programs are immediately scalable to SIMD systems
employing many 'ACT8837 or 'ACT8847 chips.
This section covers vector max/min and list max/min operations.

Vector MAX/MIN
The vector max/min operations compare corresponding elements of data vectors and
select the maximum or minimum value to obtain the components of the output vector.
Hence, for input vectors A and B and output vector Y, each with N components,
Yi = MAX/MIN (Ai, Bi),    1 ≤ i ≤ N.

Pipelined Mode
Table 76 shows the suggested data flow for a pipelined vector MAX operation, where Yi
is set to the max of (Ai, Bi) for all i. Included are rows to indicate the setting of the chain
mode instruction bit (I9 for the 'ACT8837, I10 for the 'ACT8847) and the status bit being
sensed.

Table 76. Data Flow for Pipelined Single-Precision Vector MAX

CHAIN   N    Y    Y    Y    N    Y    Y    Y    N    Y
RA      A1   A1   -    B1   A2   A2   -    B2   A3   A3
RB      B1   -    -    -    B2   -    -    -    B3   -
S       -    -    -    A1   -    B1   -    A2   -    B2
P       -    -    -    -    -    A1   -    -    -    A2
C       -    -    -    -    -    -    -    -    -    -
Y       -    -    -    -    -    Y1   -    -    -    Y2
STATUS  -    -    A>B  -    -    -    A>B  -    -    -
CLK     1    2    3    4    5    6    7    8    9    10

A comparison starts at CLK = 1, 5, etc., when the chain-mode instruction bit is low. The
result appears at CLK = 3, 7, etc., indicated by the AGTB and AEQB signals. AGTB is
saved off-chip for use as instruction bit I6 (output source) at CLK 4, 8, etc. This value for
I6 selects the output source, either the multiplier or the ALU result, at CLK 6, 10, etc. For
example, if a comparison result is A > B, the AGTB signal goes high and is used to set I6
high. I6 then selects the multiplier result (Ai) to output. Similarly, if A ≤ B, AGTB and I6
are low, and the ALU result (Bi) is output. The circuitous route taken by Ai on the way to
the P register is necessary because it is not possible to pass RA or RB through the
multiplier in parallel with passing the other through the ALU.
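
The selection mechanism can be modeled in a few lines of Python. This is a behavioral sketch only: the variable names mirror the signals described above, the two "paths" are modeled as ordinary arithmetic, and no timing is represented.

```python
def vector_max(a, b):
    """Element-wise MAX modeled on the status-driven output select.

    For each pair, the compare status (AGTB) is captured and reused as the
    output-source bit (I6), which picks between the multiplier path holding
    Ai and the ALU path holding Bi.
    """
    result = []
    for ai, bi in zip(a, b):
        agtb = ai > bi          # status from COMPARE(RA, RB)
        i6 = agtb               # saved off-chip, fed back as the select bit
        p_path = ai * 1.0       # Ai passed through the multiplier (x 1)
        s_path = bi + 0.0       # Bi passed through the ALU (+ 0)
        result.append(p_path if i6 else s_path)
    return result
```

Because the select bit is data rather than a sequencer branch, the same instruction stream applies to every element, which is the property the text relies on for SIMD scaling.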
The program is not particularly well-packed and produces the vector max of a pair of
vectors of length N in (4N + 2) cycles. For M pairs of vectors of length N, the total time is
(4MN + 2) cycles. The program can be improved by applying the interleaving principle
previously discussed. The steps are rearranged so that a new operation begins every
other cycle, thus allowing two compare programs to be interleaved. Table 77 shows the
suggested data flow for a pipelined vector min/max operation, where Yi = MAX/MIN(Ai,
Bi) and Zi = MAX/MIN(Ci, Di).

Table 77. Data Flow for Pipelined Single-Precision Interleaved
Vector MAX/MIN

CHAIN   N    N    Y    Y    Y    Y    N    N    Y    Y    Y    Y    N    N
RA      A1   C1   A1   C1   B1   D1   A2   C2   A2   C2   B2   D2   -    -
RB      B1   D1   -    -    -    -    B2   D2   -    -    -    -    -    -
S       -    -    -    -    A1   C1   B1   D1   -    -    A2   C2   B2   D2
P       -    -    -    -    -    -    A1   C1   -    -    -    -    A2   C2
C       -    -    -    -    -    -    -    -    -    -    -    -    -    -
Y       -    -    -    -    -    -    Y1   Z1   -    -    -    -    Y2   Z2
STATUS  -    -    A>B  A>B  -    -    -    -    A>B  A>B  -    -    -    -
CLK     1    2    3    4    5    6    7    8    9    10   11   12   13   14

Again, Ai (and Ci) reaches the P register by an indirect route. However, this tighter
program performs M vector comparisons, two vector comparisons at a time, in
[6 x N x CEILING(M/2) + 2] cycles. (As previously defined, the ceiling function rounds
to the next highest integer for fractional values.) In this example, two separate vector
comparisons on two-dimensional vectors are performed, giving 6 x 2 x 1 + 2 = 14
cycles. For M = 2 pairs of vectors, all of length N, the second program is as good as the
first. For M > 2, the interleaved program performs increasingly better as M gets larger.
This second program requires more off-chip logic, since the status outputs at CLK 3 and
4 must be saved separately off-chip for use at CLK 5 and 6, respectively. This problem
can easily be avoided by starting the calculations on the second pair of vectors two
cycles later than shown (i.e., at CLK 4). The time necessary to perform the vector MAX
operation on M pairs of N-dimensional vectors, two pairs concurrently, then increases to
[6 x N x CEILING(M/2) + 4] cycles.
Data transfers and operations for the odd lines only are summarized in the program
listing in Table 78. The complete program is obtained by repeating the equivalent of
each odd-numbered line in the next even line for the second pair of vectors.

Table 78. Program Listing for Pipelined Single-Precision Interleaved
Vector MAX/MIN

      REGISTER TRANSFERS      ALU OPERATION     MULTIPLIER OPERATION
1.    LOAD RA, RB             COMPARE(RA,RB)    -
3.    LOAD RA                 ADD(RA,0)         -
5.    LOAD RA; Y ← P/S        ADD(RA,0)         MULT(S,1)
Unpiped Mode
Table 79 shows the data flow for an unpiped vector MAX operation.

Table 79. Data Flow for Unpiped Single-Precision Vector MAX

CHAIN   N    Y    Y    N    Y    Y    N    Y    Y
RA      A1   A1   B1   A2   A2   B2   A3   A3   B3
RB      B1   -    -    B2   -    -    B3   -    -
S       -    -    A1   B1   -    A2   B2   -    A3
P       -    -    -    A1   -    -    A2   -    -
C       -    -    -    -    -    -    -    -    -
Y       -    -    -    Y1   -    -    Y2   -    -
STATUS  -    A>B  -    -    A>B  -    -    A>B  -
CLK     1    2    3    4    5    6    7    8    9

The status bit is saved off-chip at CLK = 2, 5, etc., and used at CLK = 3, 6, etc., as
the I6 bit of the instruction. I6 selects either the multiplier or ALU result to output to the
Y bus at CLK = 4, 7, etc.

The program computes the vector comparison of M pairs of vectors of length N in
(3 x M x N + 1) cycles.

Comparison of Pipelined and Unpiped Operation

Pipelined operation is faster if:
[6 x N x CEILING(M/2) + 2]/Fp < (3 x M x N + 1)/Fu,

where Fp and Fu are the clock rates in pipelined and unpiped modes, respectively. As of
publication, pipelined mode provides faster performance for M > 1.

List MAX/MIN

The list max/min operations select the maximum or minimum value, Z, of a list of N
elements. Hence, for input vector A with N components and output Z,
Z = MAX/MIN (Ai),    1 ≤ i ≤ N.

List min/max is an essential operation in computer graphics because it is used to find the
"extents" of a polygon or polyhedron. The extents are the maximum values of X, Y, and Z
among the list of vertices for the object in question. Many forms of comparison are
possible since the absolute value of either or both ALU operands may be employed.
However, the example in this section assumes that the largest element of a list of
N elements is desired.

Pipelined Mode
Table 80 shows the data flow for a pipelined list MAX operation,
where M1 = MAX(A1, A2) and Mi = MAX[M(i-1), A(i+1)], 2 ≤ i ≤ N - 1.

Table 80. Data Flow for Pipelined Single-Precision List MAX

CHAIN
RA

Y N Y Y
A1 A1 A2..

RB
S

'A2

Y
'

A1

Y

Y

M1
A3

A1

C

M1 M1

y

;

3

Y

Y

Y

Y
'

.

Y
•

,t,,'

M3

M2
A4

M2 M2

M3

"

M3

,

A>B
2

N Y
A4 A4

, ...

A2

p

STATUS
CLK
1

N Y
A3 A3

4

A>B
5

6

7

8

9

A>B
10 11 12 13 14 15 16

;

As with vector comparison, the max/min of the absolute values is available, since the
chip operates in independent ALU mode on the comparison steps. The comparison is
between the RA register and the RB register in step 2 and between RA and C in steps 6,
10, etc. In these steps, the chip is switched into unchained, independent ALU mode. The
status is saved off-chip and used to set the SRCC signal, which selects whether the P or
S data goes into the C register in steps 5, 9, etc.

When the list max is in the C register, at CLK = 4N - 2, the C register contents must
then be passed through one of the functional units to the output. The MAX/MIN of an
N-element list therefore takes 4N cycles. M such vectors can be processed in
[M(4N - 1) + 1] cycles.
Data transfers and operations for the list max operation are summarized in the program
listing in Table 81. The program is carried out in pipelined mode, alternating between
unchained and chained modes. The list max reaches the output in cycle 4N.
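
The running-maximum recurrence and the cycle counts can be sketched in Python as follows. The code is a behavioral model under the assumption that the list has at least two elements; the function names are illustrative, and the variable c stands in for the C register.

```python
def list_max(values):
    """Running maximum: M1 = MAX(A1, A2), then Mi = MAX(M(i-1), A(i+1))."""
    if len(values) < 2:
        raise ValueError("this sketch assumes at least two elements")
    a1, a2 = values[0], values[1]
    c = a1 if a1 > a2 else a2      # first compare selects what enters C
    for a_next in values[2:]:
        agtb = a_next > c          # COMPARE(RA, C) status
        c = a_next if agtb else c  # status selects P (new value) or S (old max)
    return c


def list_max_cycles(n, m=1):
    """Pipelined cycle count for M lists of N elements: M(4N - 1) + 1."""
    return m * (4 * n - 1) + 1
```

For a single list of four elements, `list_max_cycles(4)` gives 16, matching the 4N cycles stated above.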

Table 81. Program Listing for Pipelined Single-Precision List MAX

          REGISTER TRANSFERS   ALU OPERATION     MULTIPLIER OPERATION
1.        LOAD RA              ADD(RA,0)         -
2.        LOAD RA, RB          COMPARE(RA,RB)    -
3.        LOAD RA              ADD(RA,0)         MULT(S,1)
4.        -                    -                 -
5.        C ← P/S              -                 -
6.        LOAD RA              COMPARE(RA,C)     -
7.        LOAD RA              ADD(C,0)          MULT(RA,1)
8.        -                    -                 -
9.        C ← P/S              -                 -
REPEAT STEPS 6 THROUGH 9 UNTIL STEP 4N - 2 IS REACHED, THEN:
4N - 2.   Y ← S                ADD(C,0)          -

Comparison of Pipelined and Unpiped Modes
The equivalent unpiped program takes [M(3N - 1) + 1] cycles. Pipelined mode is faster
if:
[M(4N - 1) + 1]/Fp < [M(3N - 1) + 1]/Fu,

where Fp and Fu are the clock rates in pipelined and unpiped modes, respectively. As of
publication, pipelined mode provides faster performance for all M and N.

Graphics Applications
This section summarizes the concepts related to creating a three-dimensional image
and examines a few of the matrix operations used in three-dimensional graphics
processing. These operations include coordinate transformations and clipping
operations. Additionally, this section illustrates some of the programming techniques
used to perform these operations.

Creating a 3-D Image

Conceptually, translating 3-D images to 2-D display screens involves defining a view
volume that limits the scope of the vista the viewer can see at one time. For simplicity, a
standardized frame of reference, in which the viewer's eye is located at the origin of the
coordinate system, is adopted in this example.
As illustrated in Figures 76a and 76b, the arbitrary world coordinates of the objects under
scrutiny are transformed into normalized "viewing" or "eye" coordinates that reflect this
frame of reference. Once the normalizing transformation is complete, the images within
the view volume are projected onto a 2-D view plane, which is assumed to be located,
like a projection screen, at a suitable relative distance from the viewer (see Figures 76c
and 77).
A basic model for creating a 3-D view, illustrated in Figure 78a, transforms arbitrary world
coordinates to normalized viewing coordinates and then "clips" the image to remove
lines that do not fall within the normalized view volume. Clipping is followed by projecting
the image to the 2-D projection plane (or "window"). The image is then mapped onto a
canonical 2-D viewport display and from there onto the physical device.
To incorporate image transformations, another model must be adapted (see Figure 78b).
After clipping, instead of projecting to the view plane, a perspective transformation is
performed on the clipped viewing coordinates, transforming the view volume into a 3-D
viewport, the "screen system" in which image transforms are performed. Then the image
is projected to the 2-D viewport display and onto the physical device.
In both models, the clipping operation is performed on coordinates in the viewing
system. This approach is referred to as "clipping in the eye system." In practice, clipping
is often performed after transformation to the screen system. A trivial accept/reject test is
performed on viewing coordinates, the image is transformed to the screen system, and
then clipping is performed.

Figure 76a. In a sequence of transformations, the world coordinate positions for the house are
transformed into the normalized viewing coordinate system (also called the eye system). For clarity,
the house is pictured outside the view volume. Also shown are the direction vectors VUP (view up),
VPN (view normal), and VUP' (the projection of VUP parallel to VPN onto the view plane).

Figure 76b. After a series of translations, rotations, and shearing and scaling
operations, the view volume becomes the canonical perspective projection view volume,
which is a truncated pyramid with apex at the origin, and the house has been transformed
from the world to the viewing coordinate system.

Figure 76c. This figure illustrates the projection of the house from the perspective
of the viewer, with eye located at the origin of the coordinate system.

Figure 76. Creating a 3-D Image
J. D. Foley and A. Van Dam, Fundamentals of Interactive Computer Graphics, Addison-Wesley Publishing
Company, Reading, MA, 1982, 291-293. Reprinted with permission.

The following sections illustrate programming techniques used in both of these
approaches to normalizing, clipping, and transforming a 3-D image. The operations are
grouped as "3-D Coordinate Transforms," "Clipping in the Eye System," and "Clipping in
the Screen System."

[Figure labels: VIEW VOLUME, WORLD COORDINATE SYSTEM, PROJECTION PLANE, VIEWING (EYE) COORDINATE SYSTEM]

Figure 77. View Volume
Adapted with permission from a paper by Stephen R. Black entitled "Digital Processing of 3-D Data to Generate
Interactive Real-Time Dynamic Pictures" from Volume 120 of the 1977 SPIE journal "Three Dimensional
Imaging."

3-D WORLD COORD -> TRANSFORM TO EYE SYSTEM -> VIEWING COORD -> CLIP ->
PROJECT TO WINDOW -> 2-D VIEWING COORD -> TRANSFORM TO 2-D VIEWPORT ->
2-D NORMALIZED DEVICE COORD -> TRANSFORM TO PHYSICAL DEVICE

Figure 78a. Model of Procedure for Creating a 3-D Graphic

3-D WORLD COORD -> TRANSFORM TO EYE SYSTEM -> VIEWING COORD -> CLIP ->
TRANSFORM TO SCREEN SYSTEM -> 3-D NORMALIZED DEVICE COORD -> 3-D IMAGE TRANSFORM ->
PROJECTION TO 2-D -> 2-D NORMALIZED DEVICE COORD -> TRANSFORM TO PHYSICAL DEVICE

Figure 78b. Model for Creating and Transforming a 3-D Image

Three-Dimensional Coordinate Transforms
One of the computationally intensive functions of a 3-D computer graphics system is that
of transforming points within the object space, such as translating an object or rotating
an object about an arbitrary axis. Equally complex is the transformation of points within
the object space (or "world coordinate system") into points defined by a particular
perspective and located within the viewing space (or "eye coordinate system"). This
latter process, known as the viewing transformation, generates points in a left-handed
cartesian system with the eye at the origin and the z-axis pointing in the direction of view.
The arbitrary world-system view volume and the objects therein are translated, rotated,
sheared, and scaled to match the predefined, canonical view volume of the eye system.

For a "realistic" image, the canonical view volume will be a truncated pyramid that mimics
the cone of vision available to the human eye. Alternatively, the volume can be a unit
cube. The series of operations that make up each transformation differ, but if
homogeneous coordinates are used, either transformation can be expressed as a
simple matrix multiply.

For each point (X, Y, Z) in the world system, a projection in homogeneous coordinates is
denoted by (Xh, Yh, Zh, Wh) where
(Xh, Yh, Zh, Wh) = (X x Wh, Y x Wh, Z x Wh, Wh),
and Wh is simply a scale factor, typically unity when floating point numbers are used.
(With fixed point values, nonunity values of Wh are used to maximize use of the numeric
range.) To transform a point in homogeneous coordinates, it is post-multiplied by a 4 x 4
transform matrix:

[Xh', Yh', Zh', Wh'] = [Xh, Yh, Zh, Wh] x | A11 A12 A13 A14 |
                                          | A21 A22 A23 A24 |
                                          | A31 A32 A33 A34 |
                                          | A41 A42 A43 A44 |

The transformed point can later be converted back to 3-space by dividing by Wh.

The transform matrix is constructed by multiplying together a sequence of matrices,
each of which performs a simple task. The product of 4 or 5 elementary matrices may be
used to perform some complex overall operation on a set of points representing an
object or an entire scene. Once constructed, the transform matrix is used on each point
of the object to be transformed.
This section describes two approaches to the viewing transformation: the general case
and the specific yet typical case in which a reduced version of the transform matrix may
be used. Performance times are given for 15-MHz and 30-MHz frequencies, which
roughly correspond to the operating speeds of the '8837 and '8847, respectively.
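
A minimal Python sketch of the post-multiplication is given below. It follows the row-vector convention used in this section, so a translation sits in the fourth row of the matrix. The function name, the `TRANSLATE` example matrix, and the final divide by the transformed scale factor are illustrative assumptions rather than part of the device programs.

```python
def transform_point(point, matrix, wh=1.0):
    """Post-multiply the homogeneous row vector of `point` by a 4 x 4 matrix
    and convert the result back to 3-space."""
    x, y, z = point
    row = [x * wh, y * wh, z * wh, wh]            # (Xh, Yh, Zh, Wh)
    out = [sum(row[k] * matrix[k][j] for k in range(4)) for j in range(4)]
    w_out = out[3]                                # transformed scale factor
    return (out[0] / w_out, out[1] / w_out, out[2] / w_out)


# Hypothetical example matrix: translate by (10, 20, 30).
TRANSLATE = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [10.0, 20.0, 30.0, 1.0],
]
```

With this matrix, `transform_point((1.0, 2.0, 3.0), TRANSLATE)` yields (11.0, 22.0, 33.0), and the result is the same for any nonzero value of `wh`.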

Operation with General Transform Matrix
Table 82 shows part of the data flow for the pipelined and chained program for the
product of the homogeneous point [X, Y, Z, W] and the 4 x 4 transform matrix A.

Table 82. Partial Data Flow for Product of [X, Y, Z, W] and
General Transform Matrix

RA    X      Y      Z      W      X      Y      Z      W      X      Y
RB    A11    A21    A31    A41    A12    A22    A32    A42    A13    A23
S     -      -      -      -      S1(1)  -      S3(1)  S4(1)  S1(2)  T1
P     -      -      P1(1)  P2(1)  P3(1)  P4(1)  P1(2)  P2(2)  P3(2)  P4(2)
C     -      -      -      -      P2(1)  P2(1)  -      S3(1)  P2(2)  P2(2)
Y     -      -      -      -      -      -      -      -      -      X'
CLK   1      2      3      4      5      6      7      8      9      10

The technique is that already illustrated for the sum of products operation. The numbers
in parentheses indicate which column of the transform matrix is involved in the operation.
Here, P1(i) = X x A1i, P2(i) = Y x A2i, etc. S1(i) = P1(i) + 0, S3(i) = S1(i) + P3(i), S4(i)
= P2(i) + P4(i), and Ti = S3(i) + S4(i). T1 = X', T2 = Y', T3 = Z', T4 = W'. As in the sum
of products illustration, in order to make the most efficient use of the S register, P2 is
used directly instead of summing by 0 to form S2.
The time to transform N points in a system is 16N + 6 cycles. The system can transform
approximately 0.94 million points per second at a clock rate of 15 MHz and 1.875 million
points per second at a clock rate of 30 MHz.
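
The grouping of partial products described above can be mirrored in Python to check the arithmetic. The sketch below reproduces only the order of operations (P, S, and T terms), not the register timing; the function names are illustrative assumptions, and column indices are zero-based in the code.

```python
def transform_column(point_h, matrix, col):
    """Compute one output component using the P/S/T grouping of Table 82."""
    x, y, z, w = point_h
    p1 = x * matrix[0][col]
    p2 = y * matrix[1][col]
    p3 = z * matrix[2][col]
    p4 = w * matrix[3][col]
    s1 = p1 + 0.0      # S1 = P1 + 0 (moves P1 into the S register)
    s3 = s1 + p3       # S3 = S1 + P3
    s4 = p2 + p4       # S4 = P2 + P4 (P2 used directly, no S2 formed)
    return s3 + s4     # T = S3 + S4


def transform_general(point_h, matrix):
    """Return [X', Y', Z', W'] for a homogeneous point and a 4 x 4 matrix."""
    return [transform_column(point_h, matrix, j) for j in range(4)]


def general_transform_cycles(n):
    """Cycle count stated in the text for N points: 16N + 6."""
    return 16 * n + 6
```

At 16 cycles per point, a 15-MHz clock sustains 15,000,000 / 16 = 937,500 points per second, which is the approximately 0.94 million figure quoted above.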

Operation with the Reduced Transform Matrix and Wh = 1
Because viewing transformations are frequently carried out using a single-vanishing-point
perspective, the 3 x 1 column that performs perspective transformations with
multiple vanishing points is often not used. Additionally, with Wh = 1, the 1 x 1 scale
factor is often equal to one. In these cases, the transform matrix takes the following form:

| .  .  .  0 |
| .  .  .  0 |
| .  .  .  0 |
| .  .  .  1 |

With multiple vanishing points, and in other graphics operations such as clipping, 4 x 4
matrices are used with nonzero values in the fourth column. The transform matrix is
termed "reduced" when its fourth column is the same as that previously shown. In such
cases, the transform of each point requires only 9 multiplications and 9 additions.
Table 83 shows part of the data flow for the reduced matrix program.

Table 83. Partial Data Flow for Product of [X, Y, Z, W] and Reduced
Transform Matrix

RA    X      Y      Z      x      X      Y      Z      x      X
RB    A11    A21    A31    A41    A12    A22    A32    A42    A13
S     -      -      -      -      -      S1(1)  S2(1)  -      T1
P     -      -      P1(1)  P2(1)  P3(1)  -      P1(2)  P2(2)  P3(2)
C     -      -      -      P1(1)  P2(1)  -      S1(1)  P1(2)  P2(2)
Y     -      -      -      -      -      -      -      -      X'
CLK   1      2      3      4      5      6      7      8      9

Again, the numbers in parentheses refer to the column of the transform matrix involved in
the operation. In this case, however, only the first three columns are used. Hence, for
1 ≤ i ≤ 3, P1(i) = X x A1i, P2(i) = Y x A2i, etc. S1(i) = P1(i) + A4i, S2(i) = P2(i) + P3(i),
and Ti = S1(i) + S2(i). T1 = X', T2 = Y', T3 = Z'. Note that W values are not calculated
since they are all 1.

The time to transform N points in a system is (12N + 5) cycles. The system can transform
1.25 million points per second at 15 MHz and 2.5 million points per second at 30 MHz.

Three-Dimensional Clipping
Once an image is transformed into viewing coordinates, it must be clipped so that lines
extending outside the view volume are removed. There are several approaches to
clipping, some more efficient than others. This section surveys the most commonly used
techniques and estimates the throughput of several single- and multi-processor
arrangements.
First considered is the technique of fully clipping the line segments to fit within the
viewing pyramid in the eye coordinate system. This technique is commonly referred to
as "clipping before division."
Clipping in the screen system is considered second. This method eliminates lines that
are obviously invisible in the eye system; the rest are clipped after projection to the
screen.

Clipping in the Eye System
If an object is composed of straight line segments and a perspective view is to be taken,
the viewing volume is a pyramid defined by the following plane equations:

X = K x Z, X = -K x Z, Y = K x Z, Y = -K x Z,

where K is a constant to be defined below. Thus, -KZ < (X, Y) < KZ. Two other clipping
planes are usually employed at Z = N and Z = F, where N and F are the near and far
limits, respectively, of the view. This gives:

N < Z < F.

Looking in the direction of the z-axis (see Figure 79), the eye can imagine a screen
located at a distance N from the eye. K is formed from the half-screen height divided
by N. A specific line segment might intersect any or all of the six clipping planes. One
common approach to this problem is to use six processors in a pipeline, each clipping
the line to one plane.


[Drawing not reproduced; legible labels: SCREEN, FAR VIEWING LIMIT]

Figure 79. Viewing Pyramid Showing Six Clipping Planes

Consider the case of clipping the line defined by the points P1 = (X1, Y1, Z1) and
P2 = (X2, Y2, Z2) against the Z = N plane. First computed are (Z1 - N) and (Z2 - N). If
both are negative, the line is invisible, and a notation meaning an empty line is passed
on. If both are positive, both ends of the line are on the visible side of the Z = N plane,
and the line is passed on unclipped.
When one of these computed values is negative and the other positive, the line must be
clipped and the new values for its endpoints passed down the rest of the pipeline. To do
so, a parameter t that indicates what fraction of a segment Z1Z2, and therefore of P1 P2
as a whole, lies on the P1 side of the Z = N plane, is computed as follows:
t = (Z1 - N)/(Z1 - Z2).

In general, the value of the parameter is derived as described in Newman and Sproull,1
using the following equations of the line: X = X1 + (X2 - X1)u; Y = Y1 + (Y2 - Y1)u;
Z = Z1 + (Z2 - Z1)u. These equations are each inserted into the corresponding plane
equation. In the current example, N = Z1 + (Z2 - Z1)t.
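
The near-plane case can be sketched as follows. This is an illustrative software version of the test and the parameter t described above, not the processor's instruction sequence; the function name and the use of `None` for an empty line are assumptions.

```python
def clip_to_near_plane(p1, p2, n):
    """Clip the segment p1-p2 against the plane Z = N; visible side is Z >= N.

    Returns None for an invisible (empty) line, otherwise the pair of
    endpoints after clipping.
    """
    x1, y1, z1 = p1
    x2, y2, z2 = p2
    d1 = z1 - n
    d2 = z2 - n
    if d1 < 0 and d2 < 0:
        return None                  # both ends behind the plane: empty line
    if d1 >= 0 and d2 >= 0:
        return p1, p2                # both ends visible: pass on unclipped
    # One end on each side: t is the fraction of the segment, measured
    # from P1, at which the line meets the plane.
    t = (z1 - n) / (z1 - z2)
    hit = (x1 + (x2 - x1) * t, y1 + (y2 - y1) * t, n)
    if d1 < 0:
        return hit, p2               # P1 end is clipped
    return p1, hit                   # P2 end is clipped


print(clip_to_near_plane((0, 0, 0), (4, 8, 4), 1))
```

In the example, t = 0.25, so the P1 end moves to the point where the segment crosses Z = 1.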

Since N is between Z1 and Z2, t is always positive, and the signs of Z1 - N and Z2 - N
are used to determine which end to clip. If Z1 - N is negative, the P1 end is clipped,

LOAD
LOAD
LOAD
LOAD
LOAD

RA,RB
RA, RB
RA, RB
RA, RB;
RA, RB

LOAD RA" RB

y...s
y ....s

MULTIPLIER
OPERATION

AlU OPERATION

REGISTER TRANSFERS

C-S·

C-s

V-S
V-S
v-S

LOAD RA

ADD
ADD
ADD
ADD
'ADD
ADD
ADD
ADD
ADD
ADD

(RA,-RB)
(RA, - Ra)
(RA,-RB)
(RA,O)
(O,'-ICI)
(2,-P)
(C, -ICI)
(RA,-RB)
(RA, -RB)
(P,O)

ADD (2,-P)

LOADRB

tn" Z=, N PI~ne

MULT(RA,RB),

,
k'

,
\

, , MULT(S.I)'
..

MULT(S,P)

.

',"

. MULT(RA,P)
MULT(S,RB)
MULT(S,P)

LOAD RA
LOAD RA
LOAD RA
LOAD RA
LOAD RA,
LOAD RA,
LOAD RA,
LOADRA,
LOAD RB
LOAD RB

C-P
C-p
RB
RB
RB
RB

V-S
V-S
v-S
V-S
V-S
V-S

C ...... S

ADD
,ADD
ADD
ADD

(P,O)
(P,RB)
(P,RB)
(P,RB)

ADD (P,RB)
ADD (P,RB)
ADD (P,R6)

MULT(IRAI,IPI)
MULT(IRAI,ICI)
. MULT(RA,P)
. MULT(RA,C)
MULT(RA,C)
MULT(RA,C)
MULT(RA,C)
MULT(RA,C)

In pipelined mode, computing (Z1 - Z2) takes 2 cycles. This value is passed off-chip and
used to get the first approximation to 0.5/(Z1 - Z2) from an 8-bit seed ROM. Iteration to
correctly determine the value begins in the 4th cycle, with subsequent operations
starting on even-numbered cycles. The computations of H1' and H2' are interleaved with
the divide algorithm and are completed before it.
(X2 - X1), (Y2 - Y1), and (Z2 - Z1) are also computed during the divide. The values of
t1 and t2 are ready in steps 18 and 19. New values of X1, X2, Y1, Y2, Z1, and Z2 are all
computed and output by step 28. Each chip, therefore, clips against one clipping plane
in 28 cycles. With a two-cycle overlap, the next line segment can be presented in cycle
26.
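
The seed-and-iterate division can be illustrated with a short sketch. It assumes a Newton-Raphson refinement, x' = x(2 - dx), which is consistent with the ADD (2, -P) and MULT steps in the listing but is not spelled out in this section. The rounding function below is only a stand-in for the seed ROM, and the operand is assumed to be normalized to the range [1, 2). It computes 1/d; the factor 0.5 used in the text is omitted.

```python
def seed_reciprocal(d, bits=8):
    """Stand-in for the seed ROM: 1/d rounded to `bits` fractional bits."""
    scale = 1 << bits
    return round(scale / d) / scale


def reciprocal(d, iterations=3):
    """Refine a seed approximation of 1/d by Newton-Raphson iteration."""
    if not 1.0 <= d < 2.0:
        raise ValueError("d is assumed to be normalized to [1, 2)")
    x = seed_reciprocal(d)
    for _ in range(iterations):
        x = x * (2.0 - d * x)        # relative error is squared on each pass
    return x


print(reciprocal(1.7), 1 / 1.7)
```

Because the relative error is squared on every pass, an 8-bit seed gives roughly 16 bits after one iteration and roughly 32 bits after two, before rounding effects are considered.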


For the two X and two Y clipping planes, the calculations are slightly more complicated.
For the X = KZ plane, the two parameters ti are defined in terms of the values W1 = KZ1,
W2 = KZ2 and H1 = W1 - X1, H2 = W2 - X2 as follows:

t1 = |H1'/[2(H1 - H2)]| and t2 = |H2'/[2(H1 - H2)]|,

where, as before, Hi' = Hi - |Hi|. The equations for the new endpoints, (X1', Y1', Z1')
and (X2', Y2', Z2'), are the same as before. It is still possible to compute the new
endpoints in under 30 cycles. At 15 MHz, a six-chip '8837 system would clip 577,000 line
segments per second.
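
A sketch of the X = KZ case follows. The parameters t1 and t2 are taken directly from the definitions above. The endpoint update, P1' = P1 + t1(P2 - P1) and P2' = P2 + t2(P1 - P2), is an assumption made for the example, since the manual's endpoint equations are not reproduced in this section. The function names are hypothetical, and the segment is assumed to have already survived the both-ends-invisible test.

```python
def x_plane_parameters(p1, p2, k):
    """Return (t1, t2) for clipping against X = K x Z; visible side is X <= KZ."""
    h1 = k * p1[2] - p1[0]           # H1 = W1 - X1, with W1 = K x Z1
    h2 = k * p2[2] - p2[0]           # H2 = W2 - X2, with W2 = K x Z2
    h1p = h1 - abs(h1)               # H1' = 2 x H1 if H1 < 0, else 0
    h2p = h2 - abs(h2)               # H2' = 2 x H2 if H2 < 0, else 0
    denom = 2.0 * (h1 - h2)
    if denom == 0:
        return 0.0, 0.0              # segment parallel to the plane: nothing to clip
    return abs(h1p / denom), abs(h2p / denom)


def clip_to_x_plane(p1, p2, k):
    """Apply t1 and t2 to the endpoints (assumed form of the update)."""
    t1, t2 = x_plane_parameters(p1, p2, k)
    new_p1 = tuple(a + t1 * (b - a) for a, b in zip(p1, p2))
    new_p2 = tuple(b + t2 * (a - b) for a, b in zip(p1, p2))
    return new_p1, new_p2


print(clip_to_x_plane((3.0, 0.0, 1.0), (0.0, 0.0, 1.0), 1.0))
```

Forming Hi' = Hi - |Hi| makes the parameter zero for an endpoint that is already visible, so both endpoints can be processed by the same branch-free sequence.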
In the '8847 a similar process is employed, but the built-in divide instruction is used
beginning in step 7 and ending in step 15. t1 and t2 are calculated by step 18, and the
entire operation completes in step 27, one cycle shorter than for the '8837. The data flow
is shown in Table 86. A six-processor '8847 system operating at 30 MHz would clip
1.2 million line segments per second with a new operation beginning every 25 cycles.

Table 86. Data Flow for Clipping a Line Segment Against the Z = N Plane
Using the SN74ACT8847
RA
RB

Z1
Z2

Z1
N

Z2
N

d

S

X2
X1
H1

P
C

0.5
d
H2

X2X1

H1'

Y2
Y1

H1'

H2'
Y2Y1

H2'
1/D

H1

·Y

d

H2
X2X1

H1'

H2'

7

8

= N Plane
SAME AS FOR
'8837
t1

1/D
Y2Y1

t2
t1

STATUS

ClK

1

2

~

4

5

6

14

15

16

17

STEPS
20
THRU
28

18

Since the performance levels obtained from the six-chip systems described above are
slower than the rate of endpoint transformation by a single-chip system, some further
speed improvement is desirable. Hence, rather than going through the code for clipping
to the X and Y planes, another approach is proposed.

Clipping to All Six Planes at a Time

The "window edge clipping method" derived in Newman and Sproull can be used to clip
to all six planes at once. Recall that the viewing volume for a perspective view is a
pyramid defined by the following plane equations:

X = K x Z, X = -K x Z, Y = K x Z, Y = -K x Z, Z

To take advantage of this speedup, the only change in the sequence given above is that
while computing Q and R, the logical AND and OR are formed for the signs of the
corresponding pairs of values, Qi and Ri. This is best performed off-chip if the '8837 is
being used but may be done using independent ALU (unchained) mode in the '8837 or a
logical operation in the '8847. For the '8837, with two operands Qi and Ri, Table 89
shows the A > B status bit for an A > B comparison on A = -Qi x |Ri| and B = |Qi| x Ri
for all signs of Qi and Ri.


Table 89. A > B Comparison Function Table

Sign Qi   Sign Ri   Sign A = -Qi x |Ri|   Sign B = |Qi| x Ri   A > B   A = B
   -         -               +                    -              T       F
   -         +               +                    +              F       T
   +         -               -                    -              F       T
   +         +               -                    +              F       F

The A > B status provides the needed AND function of the sign bits of Qi and Ri. In
computing these A > B values, if A > B is TRUE, the sequencer branches to code that
rejects the line as invisible. A comparison A > B of A = (Qi x |Ri|) and B = -(|Qi| x Ri)
gives the logical AND of the complement of the sign bits. It is TRUE when both Qi and Ri
are positive. If all six values are TRUE, the sequencer can branch to code that passes the
line segment unclipped.
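
The sign logic can be checked by enumerating the four sign combinations. In this sketch the function names are hypothetical, and the second comparison takes B with a negative sign, B = -(|Qi| x Ri), which is the form assumed here so that the result is TRUE only when both operands are positive.

```python
from itertools import product


def both_negative(q, r):
    """A > B with A = -Q x |R| and B = |Q| x R: the AND of the two sign bits."""
    return -q * abs(r) > abs(q) * r


def both_positive(q, r):
    """A > B with A = Q x |R| and B = -(|Q| x R): the AND of the complemented sign bits."""
    return q * abs(r) > -(abs(q) * r)


for q, r in product((-2.0, 2.0), (-3.0, 3.0)):
    print(q, r, both_negative(q, r), both_positive(q, r))
```

Since |A| = |B| in every case, the comparison is decided by the signs alone: A > B holds only when A is positive and B is negative.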
For a three-processor parallel system, lockstep operation with a single sequencer is still
possible since all three processors are working on the same line segment, and the
branch options apply equally to them all. The estimated time for a three-processor
system is 56 cycles; not much interleaving is possible.
Now that the operations have been reduced to a minimum, the remaining steps are
necessarily sequential. Rejecting invisible or passing totally visible line segments without
division, however, is still beneficial.

Clipping in the Screen System
In most graphics systems, full line clipping is not performed in the eye system. Instead, a
trivial accept/reject test is performed, in which the line segments are simply tested
against the six clipping planes. If a line has both ends on the invisible side of any one of
the clipping planes, it is rejected. Lines surviving this test may still be outside the viewing
pyramid. In any case, the lines are transformed to the screen coordinate system and
then clipped against a cube defined by the simple plane equations -1 < (X, Y, Z) < 1.
The next three sections describe this process.

Trivial Accept/Reject Test
In the eye system, the clipping planes are:
X = W, X = -W, Y = W, Y = -W, Z = N, and Z = F,

where W = K x Z. After -W1 and -W2 are computed, a sequence of comparison
operations is performed, summarized as follows:

with X1 in RB and -W1 in P,    P > RB (i.e., -W1 > X1)
with X1 in RA and -W1 in C,    RA > |C| (i.e., X1 > W1)
with Y1 in RB and -W1 in C,    C > RB
with Y1 in RA,                 RA > |C| comparison
with Z1 in RB and N in RA,     RA > RB (i.e., N > Z1)
with Z1 in RA and F in RB,     RA > RB (i.e., Z1 > F).

These six operations are carried out in successive cycles and then repeated for (X2, Y2,
Z2). The two six-tuples are saved off-chip and a bit-wise AND is carried out. If any one of
the resulting six boolean values is TRUE, the line is rejected. This entire operation takes
only 16 cycles, thereby providing a speed of 1,071,000 line segments per second at
15 MHz and 2,143,000 line segments per second at 30 MHz. The data flow for an accept/
reject test is given in Table 90. Accept/reject testing of individual points takes only
8 cycles.
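
The six comparisons and the bit-wise AND can be sketched as follows. This is a software illustration only; the function names are hypothetical, and the comparison against |C| used on the processor is written here as a direct comparison against W, which is equivalent when W is positive.

```python
def outcode(point, k, n, f):
    """Six booleans, one per clipping plane; True means 'on the invisible side'."""
    x, y, z = point
    w = k * z
    return (
        -w > x,     # beyond the X = -W plane
        x > w,      # beyond the X = W plane
        -w > y,     # beyond the Y = -W plane
        y > w,      # beyond the Y = W plane
        n > z,      # nearer than the Z = N plane
        z > f,      # farther than the Z = F plane
    )


def trivially_rejected(p1, p2, k, n, f):
    """Reject if both ends lie on the invisible side of the same plane."""
    code1 = outcode(p1, k, n, f)
    code2 = outcode(p2, k, n, f)
    return any(a and b for a, b in zip(code1, code2))


print(trivially_rejected((5, 0, 2), (7, 0, 3), k=1, n=1, f=10))   # True
print(trivially_rejected((5, 0, 2), (-5, 0, 2), k=1, n=1, f=10))  # False
```

The second example shows why surviving lines may still lie outside the pyramid: the endpoints are outside different planes, so the test cannot reject the segment.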
Table 90. Data Flow for Accept/Reject Testing
CHAIN

N

N

RA

K

K

RB

Zl

Z2

Y

Y

Y

Xl
Xl

Y

Y

Y

Y

Y

Y

Y

Y

Y

Yl

N

Zl

-W2

X2

-W2

Yl

N

Z2

Zl

F

X2

-W2

Yl

-W2

Z2

F

Yl

N

N

S

p

-W~

C

-Wl -Wl

Y

-W2

STATUS
ClK

1

2

3

-W2
-Wl

-Wl

-Wl
-W2
-W2
-Wl
>Xl Xl>Wl >Yl Yl>Wl N>Zl Zl>F >X2 X2>W2 >Y2 Y2>W2 N>Z2 Z2>F
12
13
14
15
16
4
7
9
10
11
5
6
8

Transformation to the Screen System


After the line segments have passed the trivial accept/reject test, they are transformed to
the screen coordinate system. The following transformation is first applied to the Z
coordinate in order to scale its clipping planes to Z' = -W and Z' = W:

Z' = [-W x (F + N)]/(F - N) + (2 x W x Z)/(F - N).

The value of 1/(F - N) is constant for all line segments and is therefore computed only
once. In fact, two constants, a = 2K/(F - N) and b = -(F + N)/2, can be available so that
Z' = Z x a x (b + Z). (Note that other transformations on Z can also be used.)
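
The equivalence of the two forms, with W = K x Z, can be checked numerically. The sketch below uses hypothetical function names and arbitrary sample values for K, N, and F.

```python
import math


def z_prime_direct(z, k, n, f):
    """Z' from the full expression, with W = K x Z."""
    w = k * z
    return (-w * (f + n)) / (f - n) + (2 * w * z) / (f - n)


def z_prime_constants(z, a, b):
    """Z' from the precomputed constants a = 2K/(F - N) and b = -(F + N)/2."""
    return z * a * (b + z)


k, n, f = 0.5, 1.0, 9.0
a = 2 * k / (f - n)
b = -(f + n) / 2
for z in (n, 3.0, f):
    print(z, z_prime_direct(z, k, n, f), z_prime_constants(z, a, b))
```

At Z = N the result is -W and at Z = F it is W, as required for the scaled clipping planes. The constant form needs two multiplications and one addition per endpoint.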

After the trivial accept/reject test, the following transformation to the screen system
occurs:

Xs = X/W, Ys = Y/W, Zs = Z'/W.

The clipping planes then have these equations:

Xs = -1, Xs = 1, Ys = -1, Ys = 1, Zs = -1, Zs = 1.

Z1' and Z2' can be formed in 8 cycles. Only two reciprocals, 1/W1 and 1/W2, need to be
computed, and they can be interleaved and completed in 13 cycles in an '8837 if an 8-bit
seed ROM is employed and in 12 cycles in an '8847. The line segment is transformed to
the screen system in a further 6 cycles. The total is 26 cycles for the 'ACT8847 and
27 cycles for the 'ACT8837. A single-processor system would transform 600,000 line
segments per second with a 15 MHz clock and 1.2 million line segments per second at
30 MHz.
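
The per-endpoint work can be sketched as one reciprocal followed by three multiplications. The function name and the sample values of K, N, and F are assumptions for illustration; Z' is formed as Z x a x (b + Z) with a = 2K/(F - N) and b = -(F + N)/2.

```python
def to_screen(point, k, n, f):
    """Map an eye-system point to screen coordinates (Xs, Ys, Zs)."""
    x, y, z = point
    w = k * z
    a = 2 * k / (f - n)              # constant for all line segments
    b = -(f + n) / 2                 # constant for all line segments
    z_prime = z * a * (b + z)
    inv_w = 1.0 / w                  # the only reciprocal needed for this endpoint
    return x * inv_w, y * inv_w, z_prime * inv_w


# A point on the X = W and Y = -W planes at the near limit Z = N.
print(to_screen((0.5, -0.5, 1.0), k=0.5, n=1.0, f=9.0))
```

Points on the eye-system clipping planes land on the faces of the cube, which is why the later clipping step compares only against -1 and 1.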
Note that the above projection does not preserve planarity. See Newman and Sproull for
perspective projections that do preserve planes.

The Clipping Operation
The final operation on line segments is to clip them to the cube:
Xs = 1, Xs = -1, Ys = 1, Ys = -1, Zs = 1, and Zs = -1.

It is important to realize that the required resolution of Xs, Ys, and Zs may only be 10 or
11 bits. Any divisions needed in an '8837 implementation at this stage could feasibly be
done entirely by table look-up. It would certainly not be necessary to perform more than
one iteration if an 8-bit seed ROM is employed. Two divisions can therefore be
interleaved and completed in 7 cycles. However, three iterations are assumed in this
example to give full single-precision accuracy.
Consider a three-processor pipeline, with each processor clipping against two parallel
planes. The first will clip against the X planes -1 < X < 1. For clipping the P1 end of the
line segment, Q = (1 + X1, 1 - X1) is computed and Q' is formed, where Qi' = Qi - |Qi|.
I.e.,
Q1' = 2(1 + X1), if (1 + X1) < 0; Q1' = 0 otherwise.
Q2' = 2(1 - X1), if (1 - X1) < 0; Q2' = 0 otherwise.

At least one of Qi' will be zero; the other will be negative. Hence, MIN(Q1', Q2') = Q1'
+ Q2' = [(1 + X1) - |1 + X1|] + [(1 - X1) - |1 - X1|]. Therefore, MIN(Q1', Q2') = (1
- |X1|) - |1 - |X1||. So, t = |(m1 - |m1|)/2d| and s = |(m2 - |m2|)/2d|, where
mi = 1 - |Xi|, and d = X1 - X2. Note that only one reciprocal is required per processor.
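
The parameters t and s can be sketched as follows, reading 2d as the full denominator. The function names and the endpoint update used in the example are assumptions, and the segment is assumed not to lie wholly outside one of the two planes, since such lines are rejected earlier.

```python
def x_clip_parameters(x1, x2):
    """Return (t, s) for clipping a segment against the planes X = -1 and X = 1."""
    d = x1 - x2
    if d == 0:
        return 0.0, 0.0              # constant X: nothing to clip in this dimension
    m1 = 1 - abs(x1)                 # negative only when the P1 end is outside
    m2 = 1 - abs(x2)                 # negative only when the P2 end is outside
    t = abs((m1 - abs(m1)) / (2 * d))
    s = abs((m2 - abs(m2)) / (2 * d))
    return t, s


def min_q_prime(x):
    """MIN(Q1', Q2') computed directly from Q = (1 + x, 1 - x)."""
    q1, q2 = 1 + x, 1 - x
    return min(q1 - abs(q1), q2 - abs(q2))


t, s = x_clip_parameters(2.0, 0.0)
print(t, s, 2.0 + t * (0.0 - 2.0))   # the clipped P1 end lands on X = 1
```

A single value of 1/(2d) serves both t and s, which matches the remark that only one reciprocal is required per processor.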

A three-processor parallel system would have each processor work on one dimension,

[Package outline drawing not reproduced. Legible annotations: 35.1 (1.380) / 32.5 (1.280); 30.5 (1.200) REF; 2.54 (0.100) T.P.; 0.406 (0.016) DIA TYP; DIA (4 PLACES); pin rows lettered from A; pin columns numbered 1 through 13; see Note A.]

ALL POSSIBLE PIN LOCATIONS ARE SHOWN. SEE APPLICABLE PRODUCT
DATA SHEETS FOR ACTUAL PIN LOCATIONS USED.

NOTE A: Pins are located within 0,13 (0.005) radius of true position relative to each other at
maximum material condition and within 0,381 (0.051) radius relative to the center of
the ceramic.
ALL LINEAR DIMENSIONS ARE IN MILLIMETERS AND PARENTHETICALLY IN INCHES


13 x 13 GC pin grid array ceramic package

[Package outline drawing not reproduced. Legible annotations: 35.1 (1.380) / 32.5 (1.280) on both sides; INDEX CORNER MARK OR CHAMFER; 30.5 (1.200) REF; 2.54 (0.100) T.P.; 0.406 (0.016) DIA TYP; DIA (4 PLACES); pin rows lettered A through N; pin columns numbered 1 through 13; see Note A.]

ALL POSSIBLE PIN LOCATIONS ARE
SHOWN. SEE APPLICABLE PRODUCT
DATA SHEETS FOR ACTUAL PIN
LOCATIONS USED.

NOTE A: Pins are located within 0,13 (0.005) radius of true position relative to each other at
maximum material condition and within 0,381 (0.051) radius relative to the center of
the ceramic.
ALL LINEAR DIMENSIONS ARE IN MILLIMETERS AND PARENTHETICALLY IN INCHES


15 x 15 GB pin grid array ceramic package

[Package outline drawing not reproduced; the source is cut off here. Legible annotations: INDEX CORNER; 40.1 (1.580) / 37.6 (1.480) on both sides; 4.95 (0.195).]