1990 AMD 29K Family Data Book

Page Count: 447

29K Family
1990 Data Book

Advanced
Micro
Devices


© 1989 Advanced Micro Devices
Advanced Micro Devices reserves the right to make changes in its products without notice in
order to improve design or performance characteristics. The performance characteristics
listed in this document are guaranteed by specific tests, correlated testing, guard banding,
design and other practices common to the industry.
For specific testing details, contact your local AMD sales representative.
The company assumes no responsibility for the use of any circuits described herein.
901 Thompson Place, P.O. Box 3453, Sunnyvale, California 94088-3000
(408)732-2400 TWX: 910-339-9280 TELEX: 34-6306

Am29000, Am29027, Am29041, 29K, ADAPT29K, ASM29K, BTC, Branch Target Cache, Fusion29K, HighC29K,
MON29K, PCEB29K, and XRAY29K are trademarks of Advanced Micro Devices, Inc.
CROSSTALK is a registered trademark of Digital Communications Associates, Inc.
DEC is a registered trademark of Digital Equipment Corporation.
Hewlett-Packard is a registered trademark of Hewlett-Packard, Inc.
IBM and PC-AT are registered trademarks of International Business Machines Corporation.
MetaWare is a trademark of MetaWare, Inc.
Motorola and MC68000 are registered trademarks of Motorola, Inc.
PAL is a registered trademark of Advanced Micro Devices, Inc.
Sun Workstation is a registered trademark of Sun Microsystems, Inc.
Sun and Sun-3 are trademarks of Sun Microsystems, Inc.
Tektronix is a registered trademark of Tektronix, Inc.
UniSite is a trademark of Data I/O Corporation.
UNIX is a registered trademark of American Telephone and Telegraph Company.
VAX is a registered trademark of Digital Equipment Corporation.

Introduction

INTRODUCTION

The RISC-based Am29000 Streamlined Instruction Processor from Advanced Micro Devices is the high-performance solution for your general-purpose embedded systems needs. As the heart of the 29K Family, this 32-bit CMOS microprocessor delivers outstanding performance, yet offers flexible cost-effective solutions that can
quickly move your product to market.
This data book is your comprehensive guide to AMD's 29K Family of microprocessors and development tools.
These products have helped current developers create applications that fully exploit the power of the Am29000
microprocessor: laser printers of all types, real-time graphics systems, networks and bridges, and a host of other
peripheral and communication devices.
To provide a total system solution for you, AMD has taken the 29K Family's advantages of 17-MIPS performance,
flexible memory-configuration requirements, and outstanding development tools and coupled them with our
Fusion29K™ program. This program provides you with AMD and industry-standard third-party solutions, including
the application-specific solutions you need for successful system integration that can substantially shorten the
time-to-market factor of your design.
AMD is committed to the 29K Family, and will continue to apply substantial resources to ensure that the present
levels of high performance, cost and design flexibility, and rapid design cycles are maintained and further
enhanced. Qualified support is readily available for our customers: our highly trained field applications engineers
are backed by experts in the factory. For further details on how the 29K Family can be the solution to your design
needs, call your local AMD sales office or the authorized representative listed in the back of this publication.

Geoff Tate
Senior Vice President
Microprocessors & Peripherals Group


PREFACE

Advanced Micro Devices' 29J SRCB THEN DEST <-TRUE
ELSE DEST <-FALSE

CPLT

IF SRCA < SRCB THEN DEST <-TRUE
ELSE DEST <-FALSE

CPLTU

IF SRCA < SRCB (unsigned) THEN DEST <-TRUE
ELSE DEST <-FALSE

CPLE

IF SRCA <= SRCB THEN DEST <-TRUE
ELSE DEST <- FALSE

CPLEU

IF SRCA <= SRCB (unsigned) THEN DEST <-TRUE
ELSE DEST <-FALSE

CPGT

IF SRCA > SRCB THEN DEST <-TRUE
ELSE DEST <-FALSE

CPGTU

IF SRCA > SRCB (unsigned) THEN DEST <-TRUE
ELSE DEST <-FALSE

CPGE

IF SRCA >= SRCB THEN DEST <-TRUE
ELSE DEST <-FALSE

CPGEU

IF SRCA >= SRCB (unsigned) THEN DEST <-TRUE
ELSE DEST <-FALSE

CPBYTE

IF (SRCA.BYTE0 = SRCB.BYTE0) OR
(SRCA.BYTE1 = SRCB.BYTE1) OR
(SRCA.BYTE2 = SRCB.BYTE2) OR
(SRCA.BYTE3 = SRCB.BYTE3) THEN DEST <-TRUE
ELSE DEST <-FALSE

ASEQ

IF SRCA = SRCB THEN Continue
ELSE Trap (VN)

ASNEQ

IF SRCA <> SRCB THEN Continue
ELSE Trap (VN)

ASLT

IF SRCA < SRCB THEN Continue
ELSE Trap (VN)

ASLTU

IF SRCA < SRCB (unsigned) THEN Continue
ELSE Trap (VN)

ASLE

IF SRCA <= SRCB THEN Continue
ELSE Trap (VN)

ASLEU

IF SRCA <= SRCB (unsigned) THEN Continue
ELSE Trap (VN)

ASGT

IF SRCA > SRCB THEN Continue
ELSE Trap (VN)

ASGTU

IF SRCA > SRCB (unsigned) THEN Continue
ELSE Trap (VN)

ASGE

IF SRCA >= SRCB THEN Continue
ELSE Trap (VN)

ASGEU

IF SRCA >= SRCB (unsigned) THEN Continue
ELSE Trap (VN)

Figure 37. Compare Instructions


29K Family CMOS Devices

Mnemonic    Operation Description

AND         DEST <- SRCA & SRCB
ANDN        DEST <- SRCA & ~SRCB
NAND        DEST <- ~(SRCA & SRCB)
OR          DEST <- SRCA | SRCB
NOR         DEST <- ~(SRCA | SRCB)
XOR         DEST <- SRCA ^ SRCB
XNOR        DEST <- ~(SRCA ^ SRCB)

Figure 38. Logical Instructions

Mnemonic    Operation Description

SLL         DEST <- SRCA << SRCB (zero fill)
SRL         DEST <- SRCA >> SRCB (zero fill)
SRA         DEST <- SRCA >> SRCB (sign fill)
EXTRACT     DEST <- high-order word of (SRCA//SRCB << FC)

Figure 39. Shift Instructions
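
The shift operations in Figure 39 can be sketched in Python as follows. This is an illustrative model, not processor documentation: the function names are hypothetical, and it assumes that only the low five bits of the shift count (and of the FC value) are significant. EXTRACT is modeled as a funnel shift over the 64-bit concatenation of SRCA and SRCB.

```python
MASK32 = 0xFFFFFFFF

def sll(srca, count):
    """Shift left logical, zero fill (count taken modulo 32: an assumption)."""
    return (srca << (count & 31)) & MASK32

def srl(srca, count):
    """Shift right logical, zero fill."""
    return (srca & MASK32) >> (count & 31)

def sra(srca, count):
    """Shift right arithmetic, sign fill."""
    srca &= MASK32
    if srca & 0x80000000:
        srca -= 1 << 32          # reinterpret the word as a negative integer
    return (srca >> (count & 31)) & MASK32

def extract(srca, srcb, fc):
    """High-order word of the 64-bit value (SRCA//SRCB) shifted left by fc."""
    combined = ((srca & MASK32) << 32) | (srcb & MASK32)
    return ((combined << (fc & 31)) >> 32) & MASK32
```

Under this model, passing the same word as both operands turns the extract operation into a rotate left by fc bits.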

Reserved Instructions
Sixteen Am29000 operation codes are reserved for
instruction emulation. These instructions cause traps,
much like the floating-point instructions, but currently
have no specified interpretation. The relevant operation
codes and the corresponding trap vectors are:
Operation Codes      Trap Vector
(hexadecimal)        Numbers (decimal)

D8-DD                24-29
E7-E9                39-41
F8                   56
FA-FF                58-63

These instructions are intended for future processor
enhancements, and users desiring compatibility with future processor versions should not use them for any
purpose.
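
As a small illustration of the mapping listed above, the following Python sketch looks up the trap vector for a reserved operation code. The table contents come from the list above; the function and variable names are hypothetical. Every listed pair is consistent with the vector number being the operation code minus hexadecimal C0, which the sketch relies on only within the listed ranges.

```python
# (first opcode, last opcode, first trap vector), taken from the list above
RESERVED_RANGES = [
    (0xD8, 0xDD, 24),
    (0xE7, 0xE9, 39),
    (0xF8, 0xF8, 56),
    (0xFA, 0xFF, 58),
]

def reserved_trap_vector(opcode):
    """Return the trap vector for a reserved opcode, or None if not reserved."""
    for first, last, first_vector in RESERVED_RANGES:
        if first <= opcode <= last:
            return first_vector + (opcode - first)
    return None

def reserved_opcodes():
    """List all opcodes covered by the reserved ranges."""
    return [op for op in range(0x100) if reserved_trap_vector(op) is not None]
```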

Am29000
Mnemonic

Operation Description

LOAD

DEST <-EXTERNAL WORD [SRCB]

LOADL

DEST <-EXTERNAL WORD [SRCB]
assert LOCK output during access

LOADSET

DEST <-EXTERNAL WORD [SRCB]
EXTERNAL WORD [SRCB] <-h'FFFFFFFF',
assert LOCK output during access

LOADM

DEST.. DEST + COUNT = SRCB (single-precision)
THEN DEST <-TRUE
ELSE DEST <-FALSE

DGE

IF SRCA (double-precision) >= SRCB (double-precision)
THEN DEST <-TRUE
ELSE DEST <-FALSE

FGT

IF SRCA (single-precision) > SRCB (single-precision)
THEN DEST <-TRUE
ELSE DEST <-FALSE

DGT

IF SRCA (double-precision) > SRCB (double-precision)
THEN DEST <-TRUE
ELSE DEST <-FALSE

SQRT

DEST (single-precision, double-precision, extended-precision)
<-SQRT[SRCA (single-precision, double-precision, extended-precision)]

CONVERT

DEST (integer, single-precision, double-precision)
<-SRCA (integer, single-precision, double-precision)

CLASS

DEST (single-precision, double-precision, extended-precision)
<-CLASS[SRCA (single-precision, double-precision, extended-precision)]

Figure 42. Floating-Point Instructions

Mnemonic

Operation Description

CALL

DEST <-PC//00 + 8
PC <-TARGET
Execute delay instruction

CALLI

DEST <-PC//00 + 8
PC <-SRCB
Execute delay instruction

JMP

PC <-TARGET
Execute delay instruction

JMPI

PC <-SRCB
Execute delay instruction

JMPT

IF SRCA = TRUE THEN PC <-TARGET
Execute delay instruction

JMPTI

IF SRCA = TRUE THEN PC <-SRCB
Execute delay instruction

JMPF

IF SRCA = FALSE THEN PC <-TARGET
Execute delay instruction

JMPFI

IF SRCA = FALSE THEN PC <-SRCB
Execute delay instruction

JMPFDEC

IF SRCA = FALSE THEN
SRCA <-SRCA -1
PC <-TARGET
ELSE
SRCA <-SRCA -1
Execute delay instruction

Figure 43. Branch Instructions
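
The JMPFDEC entry in Figure 43 tests the counter before decrementing it, which affects how a loop counter should be initialized. The Python sketch below models a loop whose last instruction is JMPFDEC. It is a simplified model with hypothetical names: the delay instruction is ignored, and the test uses the Boolean encoding described in the Boolean Data section (True when the most-significant bit of the word is 1).

```python
MASK32 = 0xFFFFFFFF

def is_true(word):
    """Boolean True is a 1 in the most-significant bit of the word."""
    return bool(word & 0x80000000)

def jmpfdec_loop_iterations(initial_count):
    """Count how many times a loop body runs when JMPFDEC closes the loop."""
    counter = initial_count & MASK32
    iterations = 0
    while True:
        iterations += 1                   # the loop body executes here
        taken = not is_true(counter)      # branch if the counter is False
        counter = (counter - 1) & MASK32  # decrement in either case
        if not taken:
            return iterations
```

In this model a non-negative starting count of n gives n + 2 executions of the body, so a loop meant to run k times (k >= 2) would start the counter at k - 2.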

Mnemonic

Operation Description

CLZ

Determine number of leading zeros in a word

SETIP

Set IPA, IPB, and IPC with operand register numbers

EMULATE

Load IPA and IPB with operand register numbers, and Trap (VN)

INV

Reset all Valid bits in Branch Target Cache to zeros

IRET

Perform an interrupt return sequence

IRETINV

Perform an interrupt return sequence, and reset all Valid bits
in Branch Target Cache to zeros

HALT

Enter Halt mode on next cycle

Figure 44. Miscellaneous Instructions


DATA FORMATS AND HANDLING
This section describes the various data types supported
by the Am29000, and the mechanisms for accessing
data in external devices and memories. The Am29000
includes provisions for the external access of bytes,
half-words, unaligned words, and unaligned half-words,
as described in this section.

Integer Data Types
Most Am29000 instructions deal directly with word-length integer data; integers may be either signed or unsigned, depending on the instruction. Some instructions
(e.g., AND) treat word-length operands as strings of
bits. In addition, there is support for character, half-word, and Boolean data types.
Byte Operations
The processor supports character data through load,
store, extraction, and insertion operations on word-length operands, and by a compare operation on byte-length fields within words. The format for unsigned and
signed characters is shown in Figure 45; for signed
characters, the sign bit is the most-significant bit of the
character. For sequences of packed characters within
words, bytes are ordered either left-to-right or right-to-left, depending on the Byte Order (BO) bit of the Configuration Register (see Addressing and Alignment section).

If the Data Width Enable (DW) bit of the Configuration
Register is 1, the Am29000 is enabled to load and store
byte data. On a load, an external packed byte is converted to one of the character formats shown in
Figure 45. On a store, the low-order byte of a word is
packed into every byte of an external word. The External
Data Accesses section describes external byte accesses in more detail.
The Extract Byte (EXBYTE) instruction replaces the
low-order character of a destination word with an arbitrary byte-aligned character from a source word. For the
EXBYTE instruction, the destination word can be a zero
word, which effectively zero-extends the character from
the source operand.
The Insert Byte (INBYTE) instruction replaces an arbitrary byte-aligned character in a destination word with
the low-order character of a source word. For the INBYTE instruction, the source operand can be a character constant specified by the instruction.
The Compare Bytes (CPBYTE) instruction compares
two word-length operands and gives a result of True if
any corresponding bytes within the operands have
equivalent values. This allows programs to detect characters within words without first having to extract individual characters, one at a time, from the word of interest.
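
The three byte instructions described above can be modeled with a few lines of Python. This is a sketch under stated assumptions, not the processor's definition: the function names are hypothetical, and the byte position is passed as an explicit `byte_index` counted from the low-order end of the word, whereas the processor selects the byte position itself (taking the Byte Order bit into account).

```python
MASK32 = 0xFFFFFFFF
TRUE = 0x80000000   # Boolean True: most-significant bit set
FALSE = 0x00000000

def exbyte(src, dest, byte_index):
    """Replace the low-order byte of dest with byte `byte_index` of src."""
    char = (src >> (8 * byte_index)) & 0xFF
    return (dest & 0xFFFFFF00) | char

def inbyte(dest, src, byte_index):
    """Replace byte `byte_index` of dest with the low-order byte of src."""
    shift = 8 * byte_index
    cleared = dest & ~(0xFF << shift) & MASK32
    return cleared | ((src & 0xFF) << shift)

def cpbyte(srca, srcb):
    """True if any pair of corresponding bytes in the two words is equal."""
    for i in range(4):
        if ((srca >> (8 * i)) & 0xFF) == ((srcb >> (8 * i)) & 0xFF):
            return TRUE
    return FALSE
```

For example, comparing a word of packed characters against a zero word with `cpbyte` reports whether the word contains a null character, without extracting the characters one at a time.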
Half-Word Operations
The processor supports half-word data through load,
store, insertion, and extraction operations on word-length operands. The format for unsigned and signed
half-words is shown in Figure 46; for signed half-words,
the sign bit is the most-significant bit of the half-word.
For sequences of packed half-words within words, half-words are ordered either left-to-right or right-to-left, depending on the Byte Order (BO) bit of the Configuration
Register (see Addressing and Alignment section).
If the Data Width Enable (DW) bit of the Configuration
Register is 1, the Am29000 is enabled to load and store
half-word data. On a load, an external packed half-word
is converted to one of the formats shown in Figure 46.
On a store, the low-order half-word of a word is packed
into every half-word of an external word.
The Extract Half-Word (EXHW) instruction replaces the
low-order half-word of a destination word with either the
low-order or high-order half-word of a source word. For
the EXHW instruction, the destination word can be a
zero word, which effectively zero-extends the half-word
from the source operand.
The Extract Half-Word, Sign-Extended (EXHWS) instruction is similar to the EXHW instruction, except that
it sign-extends the half-word in the destination word
(i.e., it replaces the most-significant 16 bits of the destination word with the most-significant bit of the source
half-word).
The Insert Half-Word (INHW) instruction replaces either
the low-order or high-order half-word in a destination
word with the low-order half-word of a source word.
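
A minimal Python sketch of the three half-word instructions follows. The names are hypothetical, and the choice between the high-order and low-order half-word is passed as an explicit `high` flag, whereas the processor makes that selection itself.

```python
def _half(src, high):
    """Select the high-order or low-order half-word of a 32-bit word."""
    return ((src >> 16) & 0xFFFF) if high else (src & 0xFFFF)

def exhw(src, dest, high):
    """Replace the low-order half-word of dest with a half-word of src."""
    return (dest & 0xFFFF0000) | _half(src, high)

def exhws(src, high):
    """Extract a half-word of src and sign-extend it to 32 bits."""
    hw = _half(src, high)
    return (hw | 0xFFFF0000) if (hw & 0x8000) else hw

def inhw(dest, src, high):
    """Replace a half-word of dest with the low-order half-word of src."""
    hw = src & 0xFFFF
    if high:
        return (dest & 0x0000FFFF) | (hw << 16)
    return (dest & 0xFFFF0000) | hw
```

Calling `exhw` with a zero destination word zero-extends the selected half-word, as the text describes.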

Unsigned: bits 31-8 are 0; bits 7-0 contain the data.
Signed: bits 31-8 are copies of the sign bit (s); bits 7-0 contain the data.

Figure 45. Character Format
Unsigned: bits 31-16 are 0; bits 15-0 contain the data.
Signed: bits 31-16 are copies of the sign bit (s); bits 15-0 contain the data.

Figure 46. Half-Word Format
Boolean Data
Some instructions in the Compare class generate word-length Boolean results. Also, conditional branches are
conditional upon Boolean operands. The Boolean format used by the processor is such that the Boolean
values True and False are represented by a 1 or 0,
respectively, in the most-significant bit of a word. The
remaining bits are unimportant; for the compare instructions, they are reset. Note that twos-complement
negative integers are indicated by the Boolean value
True in this encoding scheme.
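
The Boolean encoding can be illustrated with a short Python sketch. The helper names are hypothetical; `cplt` and `cpltu` model the signed and unsigned less-than compares from Figure 37 and return the reset-to-zero form of True and False that the compare instructions produce.

```python
MASK32 = 0xFFFFFFFF
TRUE = 0x80000000    # 1 in the most-significant bit, remaining bits reset
FALSE = 0x00000000

def is_true(word):
    """Only the most-significant bit decides the Boolean value."""
    return bool(word & 0x80000000)

def to_signed(word):
    """Interpret a 32-bit word as a twos-complement integer."""
    word &= MASK32
    return word - (1 << 32) if word & 0x80000000 else word

def cplt(srca, srcb):
    """Signed less-than compare."""
    return TRUE if to_signed(srca) < to_signed(srcb) else FALSE

def cpltu(srca, srcb):
    """Unsigned less-than compare."""
    return TRUE if (srca & MASK32) < (srcb & MASK32) else FALSE
```

Because any word with the top bit set reads as True, a conditional branch on a signed integer branches exactly when the integer is negative.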

Floating-Point Data Types
The Am29000 defines single- and double-precision
floating-point formats that comply with the IEEE Standard for Binary Floating-Point Arithmetic (ANSI/IEEE
Std. 754-1985). These data types are not supported directly in processor hardware, but can be implemented
by a virtual floating-point interface provided in the
Am29000.
In this section, the following nomenclature is used to denote fields in a floating-point value:
• s: sign bit
• bexp: biased exponent
• frac: fraction
• sig: significand

Single-Precision Floating-Point
The format for a single-precision floating-point value is
shown in Figure 47.
Typically, the value of a single-precision operand is expressed by:
(-1)**s * 1.frac * 2**(bexp-127).
The encoding of special floating-point values is given in
the Special Floating-Point Values section.

Figure 47. Single-Precision Floating-Point Format

Double-Precision Floating-Point
The format for a double-precision floating-point value is
shown in Figure 48.
Typically, the value of a double-precision operand is expressed by:
(-1)**s * 1.frac * 2**(bexp-1023).
The encoding of special floating-point values is given in
the Special Floating-Point Values section.
In order to be properly referenced by a floating-point
instruction, a double-precision floating-point value must
be double-word aligned. The absolute register number
of the register containing the first word (labeled "0" in
Figure 48) must be even. The absolute register number
of the register containing the second word (labeled "1" in
Figure 48) must be odd. If these conditions are not met,
the results of the instruction are unpredictable. Note that
the appropriate registers for a double-precision value
in the local registers depend on the value of the Stack
Pointer.
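
The two value expressions given in this section can be evaluated directly from a bit pattern, as in the Python sketch below. The function names are hypothetical, and the field widths (8-bit bexp and 23-bit frac for single precision, 11-bit bexp and 52-bit frac for double precision) are taken from IEEE Std. 754-1985 rather than from the figures. The sketch covers normalized values only; special encodings are handled separately.

```python
def decode_single(bits):
    """Value of a normalized single-precision number from its 32-bit pattern."""
    s = (bits >> 31) & 1
    bexp = (bits >> 23) & 0xFF
    frac = bits & 0x7FFFFF
    return (-1) ** s * (1 + frac / 2 ** 23) * 2.0 ** (bexp - 127)

def decode_double(bits):
    """Value of a normalized double-precision number from its 64-bit pattern."""
    s = (bits >> 63) & 1
    bexp = (bits >> 52) & 0x7FF
    frac = bits & ((1 << 52) - 1)
    return (-1) ** s * (1 + frac / 2 ** 52) * 2.0 ** (bexp - 1023)
```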

Word 0: s, bexp, and the high-order bits of frac.
Word 1: the low-order bits of frac.

Figure 48. Double-Precision Floating-Point Format

Special Floating-Point Values
The Am29000 defines floating-point values that are encoded for special interpretation. The values are described in this section.
Not-a-Number
A Not-a-Number (NaN) is a symbolic value used to report certain floating-point exceptions. It also can be
used to implement user-defined extensions to floating-point operations. A NaN comprises a floating-point number with maximum biased exponent and non-zero fraction. The sign bit can be either 0 or 1 and has no significance. There are two types of NaN: signaling NaNs and
quiet NaNs. A signaling NaN causes an Invalid Operation exception if used as an input operand to a floating-point operation; a quiet NaN does not cause an exception. The Am29000 distinguishes signaling and quiet
NaNs by the most-significant bit of the fraction: a 1 indicates a quiet NaN, and a 0 indicates a signaling NaN.

An operation never generates a signaling NaN as a result. A quiet NaN result can be generated in one of two
ways:
• as the result of an invalid operation that cannot generate a reasonable result, or
• as the result of an operation for which one or
more input operands are either signaling or
quiet NaNs.
In either case, the Am29000 produces a quiet NaN having a fraction of 11000 ... 0; that is, the two most-significant bits of the fraction are 11, and the remaining bits are
0. If desired, the Reserved Operand exception can be
enabled to cause a Floating-Point Exception trap. The
trap handler in this case can implement a scheme
whereby user-defined NaN values appear to pass
through operations as results, providing overall status
for a series of operations.
Infinity
Infinity is an encoded value used to represent a value
that is too large to be represented as a finite number in
a given floating-point format. Infinity comprises a floating-point number with maximum biased exponent and
zero fraction. The sign bit of an infinity distinguishes +∞
from -∞.

Denormalized Numbers
The IEEE Standard specifies that, wherever possible, a
result that is too small to be represented as a normalized
number be represented as a denormalized number. A
denormalized number may be used as an input operand
to any operation. For single- and double-precision formats, a denormalized number comprises a floating-point number with a biased exponent of 0 and a nonzero fraction field; the sign bit can be either 1 or 0. The
value of a denormalized number is expressed by:

(-1)**s * 0.frac * 2**(-bias+1),

where "bias" is the exponent bias for the format in
question.
Zero
A zero comprises a floating-point number with a biased
exponent of 0 and a zero fraction field. The sign bit of a
zero can be either 0 or 1; however, positive and negative
zero are both exactly zero, and are considered equal by
comparison operations.
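
The special encodings described in this section can be told apart by inspecting the biased exponent and fraction fields. The Python sketch below does this for single precision; the function names are hypothetical, and the 8-bit exponent and 23-bit fraction widths are taken from IEEE Std. 754-1985.

```python
def classify_single(bits):
    """Classify a 32-bit single-precision pattern by its bexp and frac fields."""
    bexp = (bits >> 23) & 0xFF
    frac = bits & 0x7FFFFF
    if bexp == 0xFF:                       # maximum biased exponent
        if frac == 0:
            return "infinity"
        # most-significant fraction bit: 1 = quiet NaN, 0 = signaling NaN
        return "quiet NaN" if frac & 0x400000 else "signaling NaN"
    if bexp == 0:
        return "zero" if frac == 0 else "denormalized"
    return "normalized"

def denormalized_single_value(bits):
    """(-1)**s * 0.frac * 2**(-bias+1), with bias = 127 for single precision."""
    s = (bits >> 31) & 1
    frac = bits & 0x7FFFFF
    return (-1) ** s * (frac / 2 ** 23) * 2.0 ** (-127 + 1)
```

The pattern 0x7FE00000 corresponds to the quiet NaN with fraction 11000...0 that the text says the Am29000 produces.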

External Data Accesses
All processor external accesses occur between
general-purpose registers and external devices and
memories. Accesses occur as the result of the execution of load and store instructions. The load and store instructions specify which general-purpose register receives the data (for a load) or supplies the data (for a
store). The format of the load and store instructions is
shown in Figure 49.
Addresses for accesses are given either by the content
of a general-purpose register or by a constant value
specified by the load or store instruction. The load and
store instructions do not perform address computation
directly. Any required address computations are performed explicitly by other instructions.
In the load or store instruction, the Coprocessor Enable
(CE) bit (bit 23) determines whether or not the access is
directed to the coprocessor. If the CE bit is 0, the access
is directed to an external device or memory. If the CE bit
is 1, data is transferred to or from the coprocessor. The
CE bit affects the interpretation of the Control (CNTL)
field as well as the channel protocol. This section deals
with all external accesses other than coprocessor
accesses.

Bits 31-24: operation code (bit 24 is the M bit)
Bit 23: CE
Bits 22-16: CNTL
Bits 15-8: RA
Bits 7-0: RB or I

Figure 49. Load/Store Instruction Format

The format of the instructions that do not perform
coprocessor data transfers (i.e., in which the CE bit is 0)
is shown in Figure 50.

In load and store instructions, the "RB or I" field specifies
the address for the access. The address is either the content of a general-purpose register, with register number
RB, or a constant with a value I (zero-extended to 32
bits). The M bit determines whether the register or the
constant is used.

The data for the access is written into the general-purpose register RA for a load, and is supplied by register RA for a store.
The definitions for other fields in the load or store instruction are given below:
Bit 23: Coprocessor Enable (CE)-The CE bit is 0 for
a non-coprocessor load or store.
Bit 22: Address Space (AS)-If the AS bit is 0 for an
untranslated load or store, the access is directed to instruction/data memory. If the AS bit is 1 for an untranslated load or store, the access is directed to input/output.
The AS bit must be 0 for a translated load or store; if the
AS bit is 1 for a translated load or store, a Protection Violation trap occurs. The address space for a translated
load or store is determined by the Input/Output (IO) bit of
the associated TLB entry.
Bit 21: Physical Address (PA)-The PA bit may be
used by a Supervisor-mode program to disable address
translation for an access. If the PA bit is 1, then address
translation is not performed for the access, regardless of
the value of the Physical Addressing/Data (PD) bit in the
Current Processor Status Register. If the PA bit is 0, address translation depends on the PD bit.

The PA bit may be 1 only for Supervisor-mode instructions. If it is 1 for a User-mode instruction, a Protection
Violation trap occurs.

Bit 20: Set Byte Pointer/Sign Bit (SB)-If the Data
Width Enable (DW) bit of the Configuration Register is 0
and the SB bit is 1, the Byte Pointer Register is written
with the two least-significant bits of the address for the
access. These address bits can control subsequent
character and half-word operations. If the SB bit is 0, the
Byte Pointer Register is not affected.

If the Data Width Enable (DW) bit of the Configuration
Register is 1 and the SB bit is 1 for a load, the loaded
byte or half-word is sign-extended in the destination register; if the SB bit is 0, the byte or half-word is zero-extended. If the DW bit is 1 and the SB bit is 1 for either a
load or store, then each bit of the Byte Pointer Register
is written with the complement of the Byte Order bit of
the Configuration Register. The Byte Pointer Register is
set in this case to provide software compatibility across
different types of memory systems. If the SB bit is 0, the
Byte Pointer Register is not affected.
Bit 19: User Access (UA)-The UA bit allows programs executing in the Supervisor mode to emulate
User-mode accesses. This allows checking of the
authorization of an access requested by a User-mode
program. It also causes address translation (if applicable) to be performed using the PID field of the MMU
Configuration Register, rather than the fixed Supervisor-mode process identifier zero.
If the UA bit is 1 for a Supervisor-mode load or store, the
access associated with the instruction is performed in
the User mode. In this case, the User mode affects only
TLB protection checking, the SUP/US output, and the
use of the PID field in translation; it has no effect on the
registers that can be accessed by the instruction. If the
UA bit is 0, the program mode for the access is controlled by the SM bit.

If the UA bit is 1 for a User-mode load or store, a Protection Violation trap occurs.

Bits 31-24: operation code (bit 24 is the M bit)
Bit 23: CE
Bit 22: AS
Bit 21: PA
Bit 20: SB
Bit 19: UA
Bits 18-16: OPT
Bits 15-8: RA
Bits 7-0: RB or I

Figure 50. Non-Coprocessor Load/Store Format

Bits 18-16: Option (OPT)-This field is placed on the
OPT2-OPT0 outputs during the address cycle of the access. There is a one-to-one correspondence between
the OPT field and the OPT2-OPT0 outputs; that is, the
most-significant OPT bit is placed on OPT2, and so on.
The OPT field controls system functions as described
below.
Bits 15-8: (RA)-The data for the access is written into
the general-purpose register RA for a load, and is supplied by register RA for a store.
Bits 7-0: (RB or I)-In load and store instructions, the
"RB or I" field specifies the address for the access. The
address is either the content of a general-purpose register with register number RB, or a constant value I
(zero-extended to 32 bits). The M bit of the operation
code (bit 24) determines whether the register or the constant is used.
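
The bit positions given in the field definitions above are enough to decode a non-coprocessor load or store instruction word. The following Python sketch does so and applies the Protection Violation conditions stated in the text. The function names, the dictionary keys, and the `supervisor_mode` and `translated` parameters are illustrative assumptions, not part of the processor definition.

```python
def decode_load_store(word):
    """Split a 32-bit load/store instruction word into its fields."""
    return {
        "opcode": (word >> 24) & 0xFF,   # bits 31-24
        "M": (word >> 24) & 1,           # bit 24: 1 selects the constant I
        "CE": (word >> 23) & 1,
        "AS": (word >> 22) & 1,
        "PA": (word >> 21) & 1,
        "SB": (word >> 20) & 1,
        "UA": (word >> 19) & 1,
        "OPT": (word >> 16) & 0x7,       # bits 18-16
        "RA": (word >> 8) & 0xFF,        # bits 15-8
        "RB_or_I": word & 0xFF,          # bits 7-0
    }

def protection_violation(fields, supervisor_mode, translated):
    """Apply the Protection Violation rules stated for PA, UA, and AS."""
    if not supervisor_mode and (fields["PA"] or fields["UA"]):
        return True                      # PA or UA set in User mode
    if translated and fields["AS"]:
        return True                      # AS must be 0 for a translated access
    return False
```

The operation code is returned without interpretation; the sketch makes no claim about which operation code values correspond to which instructions.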
Load and store operations are overlapped with the execution of instructions that follow the load or store instruction. Only one load or store may be in progress on any
given cycle. If a load or store instruction is encountered
while another load or store operation is in progress, the
processor enters the Pipeline Hold mode until the first
operation is completed. However, the address for the
second operation may appear on the address bus if the
first operation is to a device or memory that supports
pipelined operations (see Pipelined Accesses section).
Load Operations
The processor provides the following instructions for
performing load operations: Load (LOAD), Load and
Lock (LOADL), Load and Set (LOADSET), and Load
Multiple (LOADM). All of these instructions transfer data
from an external device or memory into one or more
general-purpose registers.
The LOADL instruction supports the implementation of
device and memory interlocks in a multiprocessor configuration. It activates the LOCK output during the address cycle of the access.
The LOADSET instruction implements a binary semaphore. It loads a general-purpose register and atomically writes the accessed location with a word that has 1
in every bit position (that is, the write is indivisible from
the read). The LOCK output is asserted during both the
read and write accesses. Note that, if address translation is enabled for the LOADSET instruction, the TLB
memory-protection bits must allow both the read and
write accesses. If either the read or write access is not
allowed, neither access is performed.
The LOADM instruction loads a specified number of registers from
sequential addresses, as explained below.
Load operations are overlapped with the execution of instructions that follow the load instruction. The processor
detects any dependencies on the loaded data that subsequent instructions may have, and, if such a dependency is detected, enters the Pipeline Hold mode until
the data are returned by the external device or memory.
If a register that is the target of an incomplete load is
written with the result of a subsequent instruction, the
processor does not write the returning data into the register when the load is completed; the Not Needed (NN)
bit in the Channel Control Register is set in this case.
Store Operations
The processor provides the following instructions for
performing store operations: Store (STORE), Store and
Lock (STOREL), and Store Multiple (STOREM). All of
these instructions transfer data from one or more
general-purpose registers to an external device or
memory.
The STOREL instruction supports the implementation
of device and memory interlocks in a multiprocessor
configuration. It activates the LOCK output during the
address cycle of the access.
The STOREM instruction stores a specified number of
registers to sequential addresses, as explained below.
Store operations are overlapped with the execution of
instructions that follow the store instruction. However,
no data dependencies can exist since the store prevents
any subsequent accesses until it is completed.
Multiple Accesses
Load Multiple (LOADM) and Store Multiple (STOREM)
instructions move contiguous words of data between
general-purpose registers and external devices and
memories. The number of transfers is determined by the
Load/Store Count Remaining Register.
The Load/Store Count Remaining (CR) field in the Load/
Store Count Remaining Register specifies the number
of transfers to be performed by the next LOADM or
STOREM executed in the instruction sequence. The CR
field is in the range of 0 to 255 and is zero-based; a count
value of 0 represents one transfer, and a count value of
255 represents 256 transfers. The CR field also appears
in the Channel Control Register.
Before a LOADM or STOREM is executed, the CR field
is set by a Move To Special Register. A LOADM or
STOREM uses the most recently written value of the CR
field. If an attempt is made to alter the CR field and the
Channel Control Register contains information for an
external access that has not yet been completed, the
processor enters the Pipeline Hold mode until the
access is completed. Note that since the CR is set independently of the LOADM and STOREM, the CR field
may represent a valid state of an interrupted program
even if the Contents Valid (CV) bit of the Channel
Control Register is O.
Because of the pipelined implementation of LOADM
and STOREM, at least one instruction (e.g., the instruction that sets the CR field) must separate two successive LOADM and/or STOREM instructions.
After the CR field is set, the execution of a LOADM or
STOREM begins the data transfer. As with any other
load or store operation, the LOADM or STOREM waits
until any pending load or store operation is complete
before starting. The LOADM instruction specifies
the starting address and starting destination general-purpose register. The STOREM instruction specifies the
starting address and the starting source general-purpose register.
During the execution of the LOADM or STOREM
instruction, the processor updates the address and register number after every access, incrementing the
address by 4 and the register number by 1. This continues until either all accesses are completed or an interrupt or trap is taken.
For a Load Multiple or Store Multiple address sequence,
addresses wrap from the largest possible value (hexadecimal FFFFFFFC) to the smallest possible value
(hexadecimal 00000000).
The processor increments absolute register numbers
during the Load Multiple or Store Multiple sequence. Absolute register numbers wrap from 127 to 128, and from
255 to 128. Thus, a sequence that begins in the global
registers may make a transition to the local registers, but
a sequence that begins in the local registers remains in
the local registers. Also, note that the local registers are
addressed circularly.
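
The address and register sequence for a Load Multiple or Store Multiple can be sketched in Python from the rules above: the zero-based CR count, the address increment of 4 with wraparound, and the register wrap from 255 back to 128. The function name and parameters are hypothetical, and the sketch models only the sequence itself, not interrupts, protection checks, or burst-mode behavior.

```python
MASK32 = 0xFFFFFFFF

def multiple_access_sequence(start_address, start_register, cr):
    """Return the (address, absolute register number) pairs for one LOADM/STOREM."""
    if not 0 <= cr <= 255:
        raise ValueError("CR field must be in the range 0 to 255")
    address = start_address & MASK32
    register = start_register
    sequence = []
    for _ in range(cr + 1):              # CR is zero-based: 0 means one transfer
        sequence.append((address, register))
        address = (address + 4) & MASK32  # FFFFFFFC wraps to 00000000
        # 127 steps into the local registers at 128; 255 wraps back to 128
        register = 128 if register == 255 else register + 1
    return sequence
```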
The normal restrictions on register accesses apply for
the Load Multiple and Store Multiple sequences. For example, if a protected general-purpose register is encountered in the sequence for a User-mode program, a
Protection Violation trap occurs.
Intermediate addresses are stored in the Channel Address Register, and register numbers are stored in the
Target Register (TR) field of the Channel Control Register. For the STOREM instruction, the data for every
access is stored in the Channel Data Register (this
register also is set during the execution of the LOADM
instruction, but has no interpretation in this case). The
CR field is updated on the completion of every access so
that it indicates the number of accesses remaining in the
sequence.
Load Multiple and Store Multiple operations are indicated by the Multiple Operation (ML) bit in the Channel
Control Register. This bit may be 1 even though the CR
field has a value of 0 (indicating that one transfer
remains to be performed). The ML bit is used to restart a
multiple operation on an interrupt return; if it is set
independently by a Move To Special Register before a
load or store instruction is executed, the results are
unpredictable.
While a multiple load or store is executing, the processor
is in the Pipeline Hold mode, suspending any subsequent instruction execution until the multiple access is
completed. If an interrupt or trap is taken, the Channel
Address, Channel Data, and Channel Control registers
contain the state of the multiple access at the point of interruption. The multiple access may be resumed at this
point, at a later time, by an interrupt return.
The processor attempts to complete multiple accesses
using the burst-mode capability of the channel (see
Burst-Mode Accesses section). For this reason, multiple
accesses of individual bytes and half-words are not supported. If the burst-mode access is preempted, the processor retransmits the address at the point of preemption. If the external device or memory cannot support
burst-mode accesses, the processor transmits an address for every access. If the address sequence causes
a virtual page-boundary crossing, the processor
preempts the burst-mode access, translates the address for the new page, and reestablishes the burst-mode access using the new physical address.
The last load or store is executed as a simple access.
The processor will preempt burst-mode transfer immediately prior to the last word of the transfer.
Option Bits
The Option field in the load and store instructions supports system functions, such as byte and half-word accesses. The definition of this field for a load or store, depending on the AS bit of the instruction, is as follows:
AS   OPT2  OPT1  OPT0   Meaning
x    0     0     0      Word-length access
x    0     0     1      Byte access
x    0     1     0      Half-word access
0    1     0     0      Instruction ROM access (as data)
0    1     0     1      Cache control
0    1     1     0      ADAPT29K accesses
     -all others-       Reserved

Note that some of these encodings do not affect processor operation, and could have other interpretations in a
particular system. For example, the OPT values 000,
001, and 010 affect processor operation only if the DW
bit of the Configuration Register is 1. However, nonstandard uses of the OPT field have an implication on
the portability of software between different systems.
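
As an illustration of the Option field encodings, the sketch below decodes an (AS, OPT) pair into the meaning given in the table. It assumes the standard interpretation of the encodings; the function name `decode_option` and the returned strings are hypothetical.

```python
def decode_option(as_bit: int, opt: int) -> str:
    """Meaning of the 3-bit OPT field for a given AS bit (illustrative only)."""
    if not 0 <= opt <= 7 or as_bit not in (0, 1):
        raise ValueError("OPT is 3 bits, AS is 1 bit")
    # The data-width encodings apply regardless of the AS bit.
    widths = {0b000: "word-length access",
              0b001: "byte access",
              0b010: "half-word access"}
    if opt in widths:
        return widths[opt]
    # The remaining defined encodings require AS = 0.
    if as_bit == 0:
        system = {0b100: "instruction ROM access (as data)",
                  0b101: "cache control",
                  0b110: "ADAPT29K access"}
        if opt in system:
            return system[opt]
    return "reserved"
```

A system that gives these encodings a nonstandard meaning would need its own table, which is the portability concern noted above.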


Addressing and Alignment
Address Spaces
External instructions and data are contained in one of
four 32-bit address spaces:

1. Instruction/Data Memory
2. Input/Output
3. Coprocessor
4. Instruction Read-Only Memory (Instruction
ROM).
An address in the instruction/data memory address
space may be treated as virtual or physical, as determined by the Current Processor Status Register. Address translation for data accesses is enabled separately from address translation for instruction accesses.
A program in the Supervisor mode may temporarily disable address translation for individual loads and stores;
this permits load-real and store-real operations.
It is possible to partition physical instruction and data addresses into two separate physical address spaces.
However, virtual instruction and data addresses appear
in the same virtual address space (i.e., instruction/data
memory).
The coprocessor address space is not an address
space in the strictest sense. The coprocessor address
space is defined so that transfers of operands and operation codes to the coprocessor do not interfere with
other external devices and memories.
The processor does not directly support the access of
the instruction ROM address space using loads and
stores; this capability is defined as a system option requiring external hardware.
For untranslated data accesses, bits contained in load
and store instructions distinguish between the instruction/data memory, input/output, and coprocessor address spaces. For translated data accesses, the Input/Output
bit of the associated TLB entry distinguishes
between the instruction/data memory and input/output
address spaces.
For instruction fetches, the ROM Enable (RE) bit of the
Current Processor Status Register distinguishes between the instruction/data and instruction ROM address
spaces.
Byte and Half-Word Addressing
The Am29000 generates word-oriented byte addresses
for accesses to external devices and memories. Addresses are word-oriented because loads, stores, and
instruction fetches access words. However, addresses
are byte addresses because they are sufficient to select
bytes packed within accessed words. For load and store
operations, the processor provides means for using the
least-significant address bits to access bytes and half-words within external words.

The selection of a byte within a word is determined by
the two least-significant bits of an address and the Byte
Order (BO) bit of the Configuration Register. The selection of a half-word within a word is determined by the
next-to-least-significant bit of an address and the BO bit.
Figure 51 illustrates the addressing of bytes and half-words when the BO bit is 0, and Figure 52 illustrates the
addressing of bytes and half-words when the BO bit is 1.
In Figure 51 and Figure 52, addresses are represented
in hexadecimal notation.
In the processor, the two least-significant bits of an external address can be reflected in the Byte Pointer (BP)
field of the ALU Status Register when the DW bit of the
Configuration Register is 0. Alternatively, the two least-significant bits of the address can be used to control byte
and half-word accesses when the DW bit is 1. The BO bit
affects only the interpretation of the BP field and the two
least-significant address bits.
If the BO bit is 0, bytes are ordered within words such
that a 00 in the BP field or in the two least-significant address bits selects the high-order byte of a word, and a 11
selects the low-order byte. If the BO bit is 1, a 00 in the
BP field or in the two least-significant address bits selects the low-order byte of a word, and a 11 selects the
high-order byte.
If the BO bit is 0, half-words are ordered within words
such that a 0 in the most-significant bit of the BP field or
the next-to-least-significant address bit selects the high-order half-word, and a 1 selects the low-order half-word.
If the BO bit is 1, a 0 in the most-significant bit of the BP
field or the next-to-least-significant address bit selects
the low-order half-word of a word, and a 1 selects the
high-order half-word. Note that since the least-significant bit of the BP field or an address does not participate
in the selection of half-words, the alignment of half-words is forced to half-word boundaries in this case.
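
The byte and half-word selection rules above can be sketched as follows. This is an illustrative model only; the helper names `extract_byte` and `extract_halfword` are hypothetical, and a 32-bit word is represented as a Python integer.

```python
def extract_byte(word: int, addr: int, bo: int) -> int:
    """Byte of `word` selected by the two least-significant bits of `addr`."""
    sel = addr & 0b11
    # BO = 0: selector 00 is the high-order byte (bits 31-24).
    # BO = 1: selector 00 is the low-order byte (bits 7-0).
    lane = 3 - sel if bo == 0 else sel      # lane 0 = bits 7-0
    return (word >> (8 * lane)) & 0xFF


def extract_halfword(word: int, addr: int, bo: int) -> int:
    """Half-word selected by the next-to-least-significant bit of `addr`.

    The least-significant address bit does not participate, so the
    access is forced to a half-word boundary.
    """
    sel = (addr >> 1) & 1
    half = 1 - sel if bo == 0 else sel      # half 0 = bits 15-0
    return (word >> (16 * half)) & 0xFFFF
```

With the word 0x11223344, address 0 selects byte 0x11 when BO is 0 and byte 0x44 when BO is 1.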
Alignment of Words and Half-Words
Since only byte addressing is supported, it is possible
that an address for the access of a word or half-word is
not aligned to the desired word or half-word. The
Am29000 either ignores or forces alignment in most
cases. However, some systems may require that unaligned accesses be supported for compatibility reasons. Because of this, the Am29000 provides an option
that creates a trap when a nonaligned access is attempted. This trap allows software emulation of the nonaligned accesses in a manner that is appropriate for the
particular system.

The detection of unaligned accesses is activated by a 1
in the Trap Unaligned Access (TU) bit of the Current
Processor Status Register. Unaligned access detection
is based on the data length as indicated by the OPT field
of a load or store instruction, and on the two least-significant bits of the specified address. Only addresses for
instruction/data memory accesses are checked; alignment is ignored for input/output accesses and coprocessor transfers.

                 Bits 31-24      Bits 23-16      Bits 15-8       Bits 7-0
Word 00000000
  Half-words     Half-Word 00000000              Half-Word 00000002
  Bytes          Byte 00000000   Byte 00000001   Byte 00000002   Byte 00000003
Word 00000004
  Half-words     Half-Word 00000004              Half-Word 00000006
  Bytes          Byte 00000004   Byte 00000005   Byte 00000006   Byte 00000007
Word FFFFFFFC
  Half-words     Half-Word FFFFFFFC              Half-Word FFFFFFFE
  Bytes          Byte FFFFFFFC   Byte FFFFFFFD   Byte FFFFFFFE   Byte FFFFFFFF

Figure 51. Byte and Half-Word Addressing with BO = 0

                 Bits 31-24      Bits 23-16      Bits 15-8       Bits 7-0
Word 00000000
  Half-words     Half-Word 00000002              Half-Word 00000000
  Bytes          Byte 00000003   Byte 00000002   Byte 00000001   Byte 00000000
Word 00000004
  Half-words     Half-Word 00000006              Half-Word 00000004
  Bytes          Byte 00000007   Byte 00000006   Byte 00000005   Byte 00000004
Word FFFFFFFC
  Half-words     Half-Word FFFFFFFE              Half-Word FFFFFFFC
  Bytes          Byte FFFFFFFF   Byte FFFFFFFE   Byte FFFFFFFD   Byte FFFFFFFC

Figure 52. Byte and Half-Word Addressing with BO = 1
An Unaligned Access trap occurs only if the TU bit is 1
and any of the following combinations of OPT field and
address bits is detected for a load or store to instruction/data memory:

OPT2  OPT1  OPT0    A1  A0
0     0     0       0   1     Unaligned word access
0     0     0       1   0
0     0     0       1   1
0     1     0       0   1     Unaligned half-word access
0     1     0       1   1
The trap handler for the Unaligned Access trap is
responsible for generating the correct sequence of
aligned accesses and performing any necessary shifting, masking and/or merging. Note that a virtual page-boundary crossing also may have to be considered.
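
The detection rule can be sketched as a small predicate. It assumes the check described above: a word access (OPT = 000) is unaligned when either of the two least-significant address bits is 1, and a half-word access (OPT = 010) is unaligned when the least-significant address bit is 1. The name `is_unaligned` is hypothetical, and the sketch covers only instruction/data memory accesses.

```python
def is_unaligned(opt: int, addr: int, tu: int = 1) -> bool:
    """True if a load or store would raise the Unaligned Access trap."""
    if tu != 1:
        return False                 # detection is enabled only when TU = 1
    low = addr & 0b11
    if opt == 0b000:                 # word-length access
        return low != 0
    if opt == 0b010:                 # half-word access
        return (low & 0b01) != 0
    return False                     # byte accesses are always aligned
```

A trap handler would typically use the same two address bits to decide which aligned words to access and how to shift and merge them.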
Alignment of Instructions
In the Am29000, all instructions are 32 bits in length, and
are aligned on word-address boundaries. The processor's Program Counter is 30 bits in length, and the least-significant 2 bits of processor-generated instruction addresses are always 00. An unaligned address can be
generated by indirect jumps and calls. However, alignment is ignored by the processor in this case, and it expects the system to force alignment (i.e., by interpreting
the two least-significant address bits as 00, regardless
of their values).

Accessing Instructions as Data
To aid the external access of instructions and data on separate buses, the processor distinguishes between instruction and data accesses. However, it does not support a logical distinction between instruction and data address spaces (except in the case of instruction read-only memory). In particular, address translation in the Memory Management Unit is in no way affected by this distinction (although memory protection is).

In systems where it is necessary to access instructions as data, this function should be performed via the shared address space. The OPT field provides a means for loads to access instructions in the instruction read-only memory (ROM) address space. The Am29000 does not take any action to prevent a store to the instruction ROM address space.

Byte and Half-Word Accesses
The Am29000 can perform byte and half-word accesses in either software or hardware under control of the Data Width Enable (DW) bit of the Configuration Register. Software byte and half-word accesses are selected by a DW bit of 0, and hardware byte and half-word accesses are selected by a DW bit of 1. Software byte and half-word accesses are less efficient than hardware byte and half-word accesses, but hardware accesses require that the system be able to selectively write individual byte and half-word positions within external devices and memories. The software-only technique is compatible with systems designed to provide hardware support for byte and half-word accesses.

This section describes the operation of both software and hardware byte and half-word accesses. Byte and half-word accesses operate as described here for memory and input/output accesses, but not for coprocessor transfers. Coprocessor transfers are unaffected by the DW bit.

The DW bit is cleared by a processor reset. It must explicitly be set to 1 by software before hardware byte and half-word accesses can be performed.

Software Byte and Half-Word Accesses
If the DW bit is 0, the Am29000 allows the Byte Pointer Register to be set with the least-significant bits of an address specified by any load or store instruction, except those that transfer information to and from the coprocessor. Insert and extract instructions can then be used to access the byte or half-word of interest, after the external word has been accessed. This provides a general-purpose mechanism for manipulating external byte and half-word data, without the need for external hardware support.

To load a byte or half-word, a word load is first performed. This load sets the BP field with the two least-significant bits of the address. A subsequent EXBYTE, EXHW, or EXHWS instruction extracts the byte or half-word of interest from the accessed word.

To store a byte or half-word, a load is first performed, setting the BP field with the two least-significant bits of the address. A subsequent INBYTE or INHW instruction inserts the byte or half-word of interest into the accessed word, and the resulting word is then stored.

Software that relies on loads and stores setting the BP field cannot operate correctly when the Freeze (FZ) bit of the Current Processor Status Register is 1, because the ALU Status Register is frozen.

Hardware Byte and Half-Word Accesses
If the DW bit is 1 on a load, the Am29000 selects a byte or half-word from the loaded word depending on the Option (OPT) bits of the load instruction, the Byte Order (BO) bit of the Configuration Register, and the two least-significant bits of the address (for bytes) or the next-to-least-significant bit of the address (for half-words). The selected byte or half-word is right-justified within the destination register. If the SB bit of the load instruction is 0, the remainder of the destination register is zero-extended. If the SB bit is 1, the remainder of the destination register is sign-extended with the sign bit of the selected byte or half-word.

If the DW bit is 1 on a store, the Am29000 replicates the low-order byte or half-word in the source register into every byte and half-word position of the stored word. The system is responsible for generating the appropriate byte and/or half-word strobes, based on the OPT2-OPT0 signals and the two least-significant bits of the address, to write the appropriate byte or half-word in the selected device or memory (the system byte order must also be considered). The SB bit does not affect the operation of a store, except for setting the BP field as described below.

If the SB bit is 1 for either a load or store and the DW bit is also 1, both bits of the BP field are set to the complement of the BO bit when the load or store is executed. This does not directly affect the load or store access, but supports compatibility for software developed for word-write-only systems. Hardware byte and half-word accesses, in contrast to software byte and half-word accesses, can be performed when the FZ bit is 1, because these accesses do not rely on the BP field.
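
The hardware load and store behavior described in this section can be modeled with the sketch below. It is an illustrative model under stated assumptions, not a description of the processor's internal logic: the names `hw_load` and `hw_store_data` are hypothetical, OPT values other than 001 and 010 are treated as word accesses, and words are represented as unsigned 32-bit Python integers.

```python
MASK32 = 0xFFFFFFFF


def hw_load(word: int, addr: int, opt: int, bo: int, sb: int) -> int:
    """Destination-register value for a load with DW = 1."""
    if opt == 0b001:                              # byte access
        sel = addr & 0b11
        lane = 3 - sel if bo == 0 else sel        # lane 0 = bits 7-0
        value, bits = (word >> (8 * lane)) & 0xFF, 8
    elif opt == 0b010:                            # half-word access
        sel = (addr >> 1) & 1
        half = 1 - sel if bo == 0 else sel        # half 0 = bits 15-0
        value, bits = (word >> (16 * half)) & 0xFFFF, 16
    else:                                         # word-length access
        return word & MASK32
    if sb == 1 and value & (1 << (bits - 1)):     # sign-extend when SB = 1
        value |= MASK32 & ~((1 << bits) - 1)
    return value


def hw_store_data(source: int, opt: int) -> int:
    """Word driven on a store with DW = 1: the low-order datum is replicated."""
    if opt == 0b001:
        return (source & 0xFF) * 0x01010101
    if opt == 0b010:
        return (source & 0xFFFF) * 0x00010001
    return source & MASK32
```

Because the stored word carries the datum in every position, the system only has to assert the correct byte or half-word strobe; it does not need to shift the data.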

System Alternatives and Compatibility
The two mechanisms for performing byte and half-word accesses create the possibility of two types of systems. These are named for convenience:
• Type 1: simple, word-only accesses in external devices and memories; software byte and half-word accesses.
• Type 2: byte/half-word strobes in external devices and memories; hardware byte and half-word accesses by the Am29000.

The provision for hardware byte and half-word accesses encourages Type 2 systems. Software for Type 1 systems can execute on Type 2 systems, but the reverse is not true. Software compatibility is possible primarily because of the DW bit and because the Am29000 sets the BP field with an appropriate byte pointer even when it performs byte and half-word accesses with internal hardware. Also, the system must return a full word in either type of system, regardless of the access data width. The DW bit must be 0 in Type 1 systems and must be 1 in Type 2 systems. To illustrate compatibility between systems, consider the following steps of an unsigned byte load compiled for a Type 1 system, but executing on a Type 2 system:
1. Perform a load with OPT = 001 and SB = 1.
• Type 1 system: The addressed word is accessed and placed into the destination register. The BP field is set with the two least-significant bits of the address.
• Type 2 system: The addressed byte is accessed, aligned, padded, and placed into the destination register. The BP field is set to point to the low-order byte, reflecting the alignment that has been performed (the pointer depends on the value of the BO bit).
2. Perform a byte extract on the loaded word.
• Type 1 system: The byte selected by the BP field is aligned to the low-order byte of the destination register and the remainder of the word is zero-extended. The selected byte may be in any byte position.
• Type 2 system: The byte selected by the BP field (set to point to the low-order byte) is aligned to the low-order byte of the destination register and the remainder of the word is zero-extended. (Note that the selected byte was already in the low-order byte position. This operation does not change the program state but merely allows software compatibility.)

The recommended instruction sequences for all types of byte and half-word accesses and for both types of systems are enumerated below. Compatibility between these systems follows the above example, but for brevity, compatibility is not described in detail here.

Byte read, unsigned:
Type 1                      Comments
load 0,17,temp,addr         ; OPT = 001, SB = 1
exbyte temp,temp,0          ; get byte

Type 2                      Comments
load 0,1,temp,addr          ; OPT = 001, SB = 0

Byte read, signed:
Type 1                      Comments
load 0,17,temp,addr         ; OPT = 001, SB = 1
exbyte temp,temp,0          ; get byte
sll temp,temp,24            ; sign extend
sra temp,temp,24

Type 2                      Comments
load 0,17,temp,addr         ; OPT = 001, SB = 1 (sign extended)

Byte write:
Type 1                      Comments
load 0,17,temp,addr         ; OPT = 001, SB = 1
inbyte temp,temp,data       ; insert byte
store 0,1,temp,addr         ; store

Type 2                      Comments
store 0,1,data,addr         ; OPT = 001, SB = 0

Half-word read, unsigned:
Type 1                      Comments
load 0,18,temp,addr         ; OPT = 010, SB = 1
exhw temp,temp,0            ; get half-word unsigned

Type 2                      Comments
load 0,2,temp,addr          ; OPT = 010, SB = 0

Half-word read, signed:
Type 1                      Comments
load 0,18,temp,addr         ; OPT = 010, SB = 1
exhws temp,temp             ; get half-word sign-extend

Type 2                      Comments
load 0,18,temp,addr         ; OPT = 010, SB = 1 (sign-extend)

Half-word write:
Type 1                      Comments
load 0,18,temp,addr         ; OPT = 010, SB = 1
inhw temp,temp,data         ; insert half-word
store 0,2,temp,addr         ; store

Type 2                      Comments
store 0,2,data,addr         ; OPT = 010, SB = 0
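
The compatibility argument for the unsigned byte read (a load with OPT = 001 and SB = 1, followed by a byte extract with a zero operand) can be checked with a small model. This is a sketch under the rules stated in the text, not a cycle-accurate simulation; all function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def select_byte(word: int, sel: int, bo: int) -> int:
    """Byte chosen by a 2-bit selector (BO = 0: 00 is the high-order byte)."""
    lane = 3 - sel if bo == 0 else sel           # lane 0 = bits 7-0
    return (word >> (8 * lane)) & 0xFF


def load_type1(mem_word: int, addr: int, bo: int):
    """DW = 0: the whole word is loaded; BP takes the two address LSBs."""
    return mem_word & MASK32, addr & 0b11


def load_type2(mem_word: int, addr: int, bo: int):
    """DW = 1, SB = 1: byte right-justified and sign-extended;
    both BP bits are set to the complement of BO."""
    byte = select_byte(mem_word, addr & 0b11, bo)
    reg = byte | 0xFFFFFF00 if byte & 0x80 else byte
    bp = 0b11 if bo == 0 else 0b00
    return reg, bp


def byte_read_unsigned(load, mem_word: int, addr: int, bo: int) -> int:
    """Load, then extract the BP-selected byte, zero-extended."""
    reg, bp = load(mem_word, addr, bo)
    return select_byte(reg, bp, bo)
```

For every address offset and either byte order, both models return the same zero-extended byte, which is why a Type 1 sequence still produces the right result on a Type 2 system.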

INTERRUPTS AND TRAPS
Interrupts and traps cause the Am29000 to suspend the execution of an instruction sequence and to begin the execution of a new sequence. The processor may or may not later resume the execution of the original instruction sequence.

The distinction between interrupts and traps is largely one of causation and enabling. Interrupts allow external devices and the Timer Facility to control processor execution, and are always asynchronous to program execution. Traps are intended to be used for certain exceptional events that occur during instruction execution, and are generally synchronous to program execution.

Throughout this manual, a distinction is made between the point at which an interrupt or trap occurs and the point at which it is taken. An interrupt or trap is said to occur when all conditions that define the interrupt or trap are met. However, an interrupt or trap that occurs is not necessarily recognized by the processor, either because of various enables or because of the processor's operational mode (e.g., Halt mode). An interrupt or trap is taken when the processor recognizes the interrupt or trap and alters its behavior accordingly.

Interrupts
Interrupts are caused by signals applied to any of the external inputs INTR3-INTR0, or by the Timer Facility. The processor may be disabled from taking certain interrupts by the masking capability provided by the Disable All Interrupts and Traps (DA) bit, Disable Interrupts (DI) bit, and Interrupt Mask (IM) field in the Current Processor Status Register.
The DA bit disables all interrupts and most traps. The DI bit disables external interrupts without affecting the recognition of traps and Timer interrupts. The 2-bit IM field selectively enables external interrupts as follows:

IM Value    Result
00          INTR0 enabled
01          INTR1-INTR0 enabled
10          INTR2-INTR0 enabled
11          INTR3-INTR0 enabled

Note that the INTR0 interrupt cannot be disabled by the IM field. Also, note that no external interrupt is taken if either the DA or DI bit is 1. The Interrupt Pending bit in the Current Processor Status indicates that one or more of the signals INTR3-INTR0 is active, but that the corresponding interrupt is disabled due to the value of either DA, DI, or IM.

Traps
Traps are caused by signals applied to one of the inputs TRAP1-TRAP0, or by exceptional conditions such as protection violations. Except for the Instruction Access Exception, Data Access Exception, and Coprocessor Exception traps, traps are disabled by the DA bit in the Current Processor Status; a 1 in the DA bit disables traps, and a 0 enables traps. It is not possible to selectively disable individual traps.

Wait Mode
A wait-for-interrupt capability is provided by the Wait mode. The processor is in the Wait mode whenever the Wait Mode (WM) bit of the Current Processor Status is 1. While in Wait mode, the processor neither fetches nor executes instructions and performs no external accesses. The Wait mode is exited when an interrupt or trap is taken.

Note that the processor can take only those interrupts or traps for which it is enabled, even in the Wait mode. For example, if the processor is in the Wait mode with a DA bit of 1, it can leave the Wait mode only via the Reset mode or a WARN trap.
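
The external-interrupt masking rules (DA, DI, and the 2-bit IM field) can be sketched as a predicate. This is an illustrative model that covers only the three controls named in this section; the function name `external_interrupt_enabled` is hypothetical.

```python
def external_interrupt_enabled(n: int, da: int, di: int, im: int) -> bool:
    """True if external interrupt INTRn (n = 0..3) can be taken."""
    if not 0 <= n <= 3 or not 0 <= im <= 3:
        raise ValueError("n and IM are both in the range 0..3")
    if da == 1 or di == 1:
        return False          # either bit blocks every external interrupt
    return n <= im            # IM = k enables INTRk down to INTR0
```

Note that INTR0 is enabled for every IM value, so it can be blocked only through the DA or DI bit.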

Vector Area
Interrupt and trap processing rely on the existence of a
user-managed Vector Area in external instruction/data
memory or instruction read-only memory (instruction
ROM). The Vector Area begins at an address specified
by the Vector Area Base Address Register, and provides for as many as 256 different interrupt and trap handling routines. The processor reserves 24 routines for
system operation and 40 routines for instruction emulation. The number and definition of the remaining 192
possible routines are system-dependent.
The Vector Area has one of two possible structures as
determined by the Vector Fetch (VF) bit in the Configuration Register. The first structure, as described below,
requires less external memory than the second, but
imposes the performance penalty of the vector-table
lookup.

If the VF bit is 1, the structure of the Vector Area is a table of vectors in instruction/data memory. The layout of
a single vector is shown in Figure 53. Each vector gives
the beginning word-address of the associated interrupt
or trap handling routine, and specifies, by the R bit,
whether the routine is contained in instruction/data
memory (R = 0) or instruction ROM (R = 1).
If the VF bit is 0, the structure of the Vector Area is a segment of contiguous blocks of instructions in instruction/data
memory or instruction ROM. The ROM Vector Area
(RV) bit of the Configuration Register determines
whether the Vector Area is in instruction/data memory
(RV = 0) or instruction ROM (RV = 1). A 64-instruction
block contains exactly one interrupt or trap handling routine, and blocks are aligned on 64-instruction address
boundaries.
Vector Numbers
When an interrupt or trap is taken, the processor determines an 8-bit vector number associated with the interrupt or trap. The vector number gives either the number
of a vector table entry or the number of an instruction
block, depending on the value of the VF bit.

Bits 31-2: Handler Starting Address     Bit 1: R     Bit 0: 0

Figure 53. Vector Table Entry
If the VF bit is 1, the physical address of the vector table
entry is generated by replacing bits 9-2 of the value in
the Vector Area Base Address Register with the vector
number.
If the VF bit is 0, the physical address of the first instruction of the handling routine is generated by replacing bits
15-8 of the value in the Vector Area Base Address
Register with the vector number.
Vector numbers are either predefined or specified by an
instruction causing the trap. The assignment of vector
numbers is shown in Figure 54 (vector numbers are in
decimal notation). Vector numbers 64 to 255 are for use
by trapping instructions; the definition of the routines associated with these numbers is system-dependent.
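
The two address computations described above can be sketched as follows. This is a minimal illustration of the bit replacement, assuming 32-bit addresses; the function names are hypothetical, and `decode_vector` assumes the vector layout of Figure 53 (handler word-address in bits 31-2, R in bit 1).

```python
MASK32 = 0xFFFFFFFF


def vector_entry_address(vab: int, vector_number: int) -> int:
    """VF = 1: replace bits 9-2 of the base address with the vector number."""
    return (vab & ~(0xFF << 2) & MASK32) | ((vector_number & 0xFF) << 2)


def handler_block_address(vab: int, vector_number: int) -> int:
    """VF = 0: replace bits 15-8 of the base address with the vector number."""
    return (vab & ~(0xFF << 8) & MASK32) | ((vector_number & 0xFF) << 8)


def decode_vector(vector: int):
    """Split a fetched vector into (handler address, R bit)."""
    return vector & 0xFFFFFFFC, (vector >> 1) & 1
```

Each table entry occupies one word (4 bytes), while each instruction block occupies 64 instructions (256 bytes), which is why the vector number lands in bits 9-2 in one case and bits 15-8 in the other.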

Interrupt and Trap Handling
Interrupt and trap handling consists of two distinct operations: taking the interrupt or trap, and returning from
the interrupt or trap handler. If the interrupt or trap
handler returns directly to the interrupted routine, the
interrupt or trap handler need not save and restore
processor state.
Taking an Interrupt or Trap
The following operations are performed in sequence by
the processor when an interrupt or trap is taken:
1. Instruction execution is suspended.
2. Instruction fetching is suspended.
3. Any in-progress load or store operation is completed. Any additional operations are canceled
in the case of Load Multiple and Store Multiple.
4. The contents of the Current Processor Status
Register are copied into the Old Processor
Status Register.
5. The Current Processor Status register is modified as shown in Figure 55 (the value "u" means
unaffected). Note that setting the Freeze (FZ) bit
freezes the Channel Address, Channel Data,
Channel Control, Program Counter 0, Program
Counter 1, Program Counter 2, and ALU Status
Registers.
6. The address of the first instruction of the interrupt or trap handler is determined. If the VF bit of
the Configuration Register is 1, the address is
obtained by accessing a vector from instruction/data
memory, using the physical address obtained from the Vector Area Base Address Register and the vector number. This access appears on the channel as a data access, and the
OPT2-OPT0 signals indicate a word-length access. If the VF bit is 0, the instruction address is
given directly by the Vector Area Base Address
Register and the vector number.
7. If the VF bit is 1, the R bit in the vector fetched in
Step 6 is copied into the RE bit of the Current
Processor Status Register. If the VF bit is 0, the
RV bit of the Configuration Register is copied
into the RE bit. This step determines whether or
not the first instruction of the interrupt handler is
in instruction ROM.
8. An instruction fetch is initiated using the instruction address determined in Step 6. At this point,
normal instruction execution resumes.
Note that the processor does not explicitly save the contents of any registers when an interrupt is taken. If register saving is required, it is the responsibility of the interrupt or trap-handling routine. For proper operation, registers must be saved before any further interrupts or
traps may be taken. The FZ bit must be reset at least two
instructions before interrupts or traps are reenabled to
allow the program state to be reflected properly in processor registers if an interrupt or trap is taken.
Returning from an Interrupt or Trap
Two instructions are used to resume the execution of an
interrupted program: Interrupt Return (IRET), and Interrupt Return and Invalidate (IRETINV). These instructions are identical except in one respect: the IRETINV
instruction resets all Valid bits in the Branch Target
Cache, whereas the IRET instruction does not affect the
Valid bits.
In some situations, the processor state must be set
properly by software before the interrupt return is executed. The following is a list of operations normally performed in such cases:
1. The Current Processor Status is configured as
shown in Figure 56 (the value "x" is a "don't
care"). Note that setting the FZ bit freezes the
registers listed below so that they may be set for
the interrupt return.

Number   Type of Trap or Interrupt               Cause
0        Illegal Opcode                          executing undefined instruction
1        Unaligned Access                        access on unnatural boundary, TU = 1
2        Out of Range                            overflow or underflow
3        Coprocessor Not Present                 coprocessor access, CP = 0
4        Coprocessor Exception                   coprocessor DERR response
5        Protection Violation                    invalid User-mode operation
6        Instruction Access Exception            IERR response
7        Data Access Exception                   DERR response, not coprocessor
8        User-Mode Instruction TLB Miss          no TLB entry for translation
9        User-Mode Data TLB Miss                 no TLB entry for translation
10       Supervisor-Mode Instruction TLB Miss    no TLB entry for translation
11       Supervisor-Mode Data TLB Miss           no TLB entry for translation
12       Instruction TLB Protection Violation    TLB UE/SE = 0
13       Data TLB Protection Violation           TLB UR/SR = 0, UW/SW = 0 on write
14       Timer                                   Timer Facility
15       Trace                                   Trace Facility
16       INTR0                                   INTR0 input
17       INTR1                                   INTR1 input
18       INTR2                                   INTR2 input
19       INTR3                                   INTR3 input
20       TRAP0                                   TRAP0 input
21       TRAP1                                   TRAP1 input
22       Floating-Point Exception                unmasked floating-point exception
23       reserved
24-29    reserved for instruction emulation
         (op codes D8-DD)
30       MULTM                                   MULTM instruction
31       MULTMU                                  MULTMU instruction
32       MULTIPLY                                MULTIPLY instruction
33       DIVIDE                                  DIVIDE instruction
34       MULTIPLU                                MULTIPLU instruction
35       DIVIDU                                  DIVIDU instruction
36       CONVERT                                 CONVERT instruction
37       SQRT                                    SQRT instruction
38       CLASS                                   CLASS instruction
39-41    reserved for instruction emulation
         (op codes E7-E9)
42       FEQ                                     FEQ instruction
43       DEQ                                     DEQ instruction
44       FGT                                     FGT instruction
45       DGT                                     DGT instruction
46       FGE                                     FGE instruction
47       DGE                                     DGE instruction
48       FADD                                    FADD instruction
49       DADD                                    DADD instruction
50       FSUB                                    FSUB instruction
51       DSUB                                    DSUB instruction
52       FMUL                                    FMUL instruction
53       DMUL                                    DMUL instruction
54       FDIV                                    FDIV instruction
55       DDIV                                    DDIV instruction
56       reserved for instruction emulation
         (op code F8)
57       FDMUL                                   FDMUL instruction
58-63    reserved for instruction emulation
         (op codes FA-FF)
64-255   Assert and EMULATE instruction traps
         (vector number specified by instruction)

Figure 54. Vector Number Assignments
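
The reserved ranges in Figure 54 pair vector numbers with op codes (24-29 with D8-DD, 39-41 with E7-E9, 56 with F8, and 58-63 with FA-FF). Each of these pairs is consistent with a fixed offset of hexadecimal C0 between op code and vector number. The sketch below encodes that observation; it is an inference from the listed ranges rather than a rule stated in the text, and the name `emulation_vector` is hypothetical.

```python
def emulation_vector(opcode: int) -> int:
    """Vector number for an instruction-emulation op code in D8..FF (hex).

    Assumes the constant offset implied by the reserved ranges in the
    vector number table.
    """
    if not 0xD8 <= opcode <= 0xFF:
        raise ValueError("op code outside the instruction-emulation range")
    return opcode - 0xC0
```

Under this assumption the 40 op codes D8 through FF map onto the 40 vector numbers 24 through 63 reserved for instruction emulation.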

[Register diagram of the Current Processor Status Register. Field labels shown: Reserved, CA, IP, TE, TP, TU, FZ, LK, RE, WM, PD, PI, SM, IM, DA.]

Figure 55. Current Processor Status after an Interrupt or Trap

2. The Old Processor Status is set to the value of
the Current Processor Status for the target
routine.

Current Processor Status, for Steps 3 through
10.
3. If the interrupt return instruction is an IRETINV,
all Valid bits in the Branch Target Cache are
reset.

3. The Channel Address, Channel Data, and
Channel Control registers are set to restart or resume uncompleted channel operations of the
target routine.

4. The contents of the Old Processor Status Register are copied into the Current Processor Status
Register. This normally resets the FZ bit allowing the Program Counter 0, 1,2, Channel Address, Data, Control, and ALU Status registers
to update normally. Since certain bits of the Current Processor Status Register always are updated by the processor, this copy operation may
be irrelevant for certain bits (e.g., the Interrupt
Pending bit).

4. The Program Counter 1 and Program Counter 0
registers are set to the addresses of the first and
second instructions, respectively, to be executed in the target routine.

S. Other registers are set as required. These may
include registers such as the ALU Status, 0, and
so forth, depending on the particular situation.
Some of these registers are unaffected by the
FZ bit, so they must be set in such a manner that
they are not modified unintentionally before the
interrupt return.

5. If the Contents Valid (CV). bit of the Channel
Control Register is 1, and the Not Needed (NN)
and Multiple Operation (ML) bits are both 0, an

external access is started. This operation is
based on the contents of the Channel Address,
Channel Data, and Channel Control registers.
The Current Processor Status Register conditions the access-as is normally the case. Note
that Load Multiple and Store Multiple operations
are not restarted at this point.

Once the processor registers are configured properly,
as described above, an interrupt return instruction
(IRET or IRETINV) performs the remaining steps necessary to return to the target routine. The following operations are performed by the interrupt return instruction:
1. Any in-progress load or store operation is completed. If a Load Multiple or Store Multiple sequence is in progress, the interrupt return is not
executed until the sequence is completed.

6. The address in Program Counter 1 is used to
fetch an instruction. The Current Processor
Status Register conditions the fetch. This step is
treated as a branch in the sense that the proces-

2. Interrupts and traps are disabled, regardless of
the settings of the OA, 01, and 1M fields of the
31

23

15

CA

7

TE TU

LK

WM

Figure 56. Current Processor Status Before Interrupt Return

1-78

o

Am29000

sor searches the Branch Target Cache for the
target of the fetch.
7. The instruction fetched in Step 6 enters the decode stage of the pipeline.

8. The address in Program Counter 0 is used to
fetch an instruction. The Current Processor
Status Register conditions the fetch. This step is
treated as a branch in the sense that the processor searches the Branch Target Cache for the
target of the fetch.
9. The instruction fetched in Step 6 enters the execute stage of the pipeline, and the instruction
fetched in Step 8 enters the decode stage.
10. If the CV bit in the Channel Control Register is
1, the NN bit is 0, and the ML bit is 1, a Load Multiple or Store Multiple sequence is started,
based on the contents of the Channel Address,
Channel Data, and Channel Control registers.
11. Interrupts and traps are enabled per the appropriate bits in the Current Processor Status
Register.

12. The processor resumes normal operation.
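The two channel-restart steps of the interrupt return sequence (Steps 5 and 10) depend only on the CV, NN, and ML bits of the Channel Control Register. The following is a minimal sketch of that decision, not a description of the hardware; the function name and the returned strings are hypothetical and chosen only for illustration.

```python
def channel_restart_action(cv: int, nn: int, ml: int) -> str:
    """Classify the restart performed during an interrupt return.

    cv, nn, ml are the Contents Valid, Not Needed, and Multiple Operation
    bits of the Channel Control Register (each 0 or 1). The return values
    are illustrative labels, not processor-defined names.
    """
    if cv == 1 and nn == 0:
        if ml == 0:
            # Step 5: a single external access is restarted.
            return "restart_single_access"
        # Step 10: a Load Multiple or Store Multiple sequence is restarted.
        return "restart_multiple_sequence"
    # Contents not valid, or the access is not needed: nothing to restart.
    return "no_restart"


if __name__ == "__main__":
    for bits in [(1, 0, 0), (1, 0, 1), (1, 1, 0), (0, 0, 0)]:
        print(bits, channel_restart_action(*bits))
```

Keeping the two cases separate mirrors the text: a single access is restarted early in the sequence, while a Load Multiple or Store Multiple sequence is restarted only after both target instructions have been fetched.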
Fast Interrupt Processing
The registers affected by the FZ bit of the Current Processor Status Register are those that are modified by almost any usual sequence of instructions. Since the FZ
bit is set by an interrupt or trap, the interrupt or trap handler is able to execute while not disturbing the state of
the interrupted routine, though its execution is somewhat restricted. Thus, it is not necessary in many cases
for the interrupt or trap handler to save the registers that
are affected by the FZ bit.
The processor provides an additional benefit if the Program Counter 0 and Program Counter 1 registers are
not modified by the interrupt or trap handler. If Program
Counters 0 and 1 contain the addresses of sequential instructions when an interrupt or trap is taken, and if they
are not modified before an interrupt return is executed,
Step 8 of the interrupt return sequence above occurs as
a sequential fetch, instead of a branch, for the interrupt return. The performance impact of a sequential
fetch is normally less than that of a nonsequential fetch.
Because the registers affected by the FZ bit are sometimes required for instruction execution, it is not possible
for the interrupt or trap handler to execute all instructions unless the required registers are first saved elsewhere (e.g., in one or more global registers). Most of the
restrictions due to register dependencies are obvious
(e.g., the Byte Pointer for byte extracts), and will not be
discussed here. Other less obvious restrictions are
listed below:

1. Load Multiple and Store Multiple. The Channel
Address, Channel Data, and Channel Control
registers are used to sequence Load Multiple
and Store Multiple operations, so these instructions cannot be executed while the registers are
frozen. However, note that other external
accesses may occur; the Channel Address,
Channel Data, and Channel Control registers
are required only to restart an access after an
exception, and the interrupt or trap handler is not
expected to encounter any exceptions.
2. Loads and stores that set the Byte Pointer. If the
Set Byte Pointer (SB) bit of a load or store instruction is 1 and the FZ bit is also 1, there is no effect
on the Byte Pointer. Thus, the execution of external byte and half-word accesses using this
mechanism is not possible.

3. Extended arithmetic. The Carry bit of the ALU
Status Register is not updated while the FZ bit
is 1.
4. Divide step instructions. The Divide Flag of the
ALU Status Register is not updated when the FZ
bit is 1.
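Restriction 3 above can be illustrated with a small software model. This is only a sketch under stated assumptions: the `AluState` class and the `add32` and `add64` helpers are hypothetical names, and the model captures nothing beyond the rule that the Carry bit is not updated while the FZ bit is 1.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF


@dataclass
class AluState:
    carry: int = 0  # Carry bit of the ALU Status Register
    fz: int = 0     # FZ bit of the Current Processor Status Register


def add32(state: AluState, a: int, b: int, carry_in: int = 0) -> int:
    """32-bit add; the carry-out is recorded only when FZ is 0."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    if state.fz == 0:
        state.carry = total >> 32
    return total & MASK32


def add64(state: AluState, a: int, b: int) -> int:
    """64-bit add built from two 32-bit adds that communicate via Carry."""
    low = add32(state, a & MASK32, b & MASK32)
    high = add32(state, a >> 32, b >> 32, carry_in=state.carry)
    return (high << 32) | low


if __name__ == "__main__":
    print(hex(add64(AluState(fz=0), 0xFFFFFFFF, 1)))  # carry propagates
    print(hex(add64(AluState(fz=1), 0xFFFFFFFF, 1)))  # carry is lost
```

With FZ at 0 the low-word carry reaches the high word; with FZ at 1 the stale Carry value is used instead, which is why a handler that needs extended arithmetic must first save the frozen registers and clear FZ.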
If the interrupt or trap handler does not save the state of
the interrupted routine, it cannot allow additional interrupts and traps. Also, the operation of the interrupt or
trap handler cannot depend on any trapping instructions (e.g., Floating-Point instructions, illegal operation
codes, arithmetic overflow, etc.) since these are disabled. There are certain cases, however, where traps
are unavoidable; these are discussed in the Arithmetic
Exceptions section.

WARN Trap
The processor recognizes a special trap, caused by the
activation of the WARN input, that cannot be masked.
The WARN trap is intended to be used for severe system-error or deadlock conditions. It allows the processor
to be placed in a known, operable state, while preserving much of its original state for error reporting and possible recovery. Therefore, it shares some features in
common with the Reset mode as well as features common to other traps described in this section.
The major differences between the WARN trap and
other traps are:
1. The processor does not wait for an in-progress
external access to be completed before taking
the trap, since this access might not be completed. However, the information related to any
outstanding access is retained by the Channel
Address, Channel Data, and Channel Control
registers when the trap is taken.

2. The vector-fetch operation is not performed, regardless of the VF bit of the Configuration Register, when the WARN trap is taken. Instead, the
ROM Enable (RE) bit in the Current Processor
Status is set, and instruction fetching begins immediately at Address 16 in the instruction ROM.
The trap handler executes directly from the instruction ROM without the need to access
external (and possibly nonfunctional or invalid)
instruction/data memory.
Note that the WARN trap may disrupt the state of the routine
that is executing when it is taken, prohibiting this routine
from being restarted.

Sequencing of Interrupts and Traps
On every cycle, the processor decides either to execute
instructions or to take an interrupt or trap. Since there
are multiple sources of interrupts and traps, more than
one interrupt or trap may be pending on a given cycle.
To resolve conflicts, interrupts and traps are taken according to the priority shown in Figure 57. In this table,
interrupts and traps are listed in order of decreasing priority. This section discusses the first three columns of
Figure 57. The last two columns are discussed in the
Exception Reporting and Restarting section.
In Figure 57, interrupts and traps fall into one of two
categories depending on the timing of their occurrence
relative to instruction execution. These categories are
indicated in the third column by the labels "inst" and
"async." These labels have the following meanings:
1. Inst: Generated by the execution or attempted
execution of an instruction.
2. Async: Generated asynchronous to and independent of the instruction being executed, although it may be a result of an instruction executed previously.
The principle for interrupt and trap sequencing is that the
highest priority interrupt or trap is taken first. Other
interrupts and traps remain active until they can be
taken, or are regenerated when they can be taken. This
is accomplished, depending on the type of interrupt or
trap, as follows:
1. All traps in Figure 57 with Priority 13 or 14 are regenerated by the re-execution of the causing instruction.
2. Most of the interrupts and traps of Priorities 4
through 12 must be held by external hardware
until they are taken. The exceptions to this are
listed in (3) below.
3. The exceptions to (2) above are the Data Access
Exception trap, the Coprocessor Exception trap,
the Timer interrupt, and the Trace trap. These
are caused by bits in various registers in the
processor and are held by these registers until
taken or cleared. The relevant bits are: the
Transaction Faulted (TF) bit of the Channel Control Register for Data Access Exception and
Coprocessor Exception traps, the Interrupt (IN)
bit of the Timer Reload Register for Timer interrupts, and the Trace Pending (TP) bit of the Current Processor Status Register for Trace traps.
4. All traps of Priorities 2 and 3 in Figure 57, except
for the Unaligned Access trap, are not regenerated. These traps are mutually exclusive and are
given high priority because they cannot be regenerated; they must be taken if they occur. If
one of these traps occurs at the same time as a
reset or WARN trap, it is not taken, and its occurrence is lost.
5. The Unaligned Access trap is regenerated internally when an external access is restarted by the
Channel Address, Channel Data, and Channel
Control registers. Note that this trap is not necessarily exclusive to the traps discussed in (4)
above.
Note that the Channel Address, Channel Data, and
Channel Control registers are set for a WARN trap only if
an external access is in progress when the trap is taken.
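The sequencing principle, that the highest-priority pending interrupt or trap is taken first while the others stay pending, can be sketched as a simple selection over the priorities of Figure 57. This is an illustrative model only: the dictionary holds a subset of the entries in the figure, and the function name is hypothetical.

```python
from typing import Iterable, Optional

# Subset of Figure 57: name -> priority (1 is highest, 14 is lowest).
PRIORITY = {
    "WARN": 1,
    "Data TLB Protection Violation": 2,
    "Out of Range": 3,
    "Data Access Exception": 4,
    "TRAP0": 5,
    "TRAP1": 6,
    "INTR0": 7,
    "INTR1": 8,
    "INTR2": 9,
    "INTR3": 10,
    "Timer": 11,
    "Trace": 12,
    "Instruction TLB Protection Violation": 13,
    "Illegal Opcode": 14,
}


def next_to_take(pending: Iterable[str]) -> Optional[str]:
    """Return the pending interrupt or trap with the highest priority."""
    pending = list(pending)
    if not pending:
        return None
    # A lower priority number means a higher priority.
    return min(pending, key=lambda name: PRIORITY[name])


if __name__ == "__main__":
    print(next_to_take(["Timer", "INTR2", "Trace"]))
```

The model deliberately ignores how the remaining requests are held or regenerated, since that depends on the source as described in items 1 through 5 above.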

Exception Reporting and Restarting
When an instruction encounters an exceptional condition, the Program Counter 0, Program Counter 1, and
Program Counter 2 registers report the relevant instruction address(es), and allow the instruction sequence to
be restarted once the exceptional condition has been
remedied (if possible). Similarly, when an external access or coprocessor transfer encounters an exceptional
condition, the Channel Address, Channel Data, and
Channel Control registers report information on the access or transfer, and allow it to be restarted. This section
describes the interpretation and use of these registers.
The "PC1" column in Figure 57 describes the value held
in the Program Counter 1 Register (PC1) when the interrupt or trap is taken. For traps in the "inst" category, PC1
contains either the address of the instruction causing
the trap, indicated by "curr," or the address of the instruction following the instruction causing the trap, indicated by "next."
For interrupts and traps in the "async" category, PC1
contains the address of the first instruction, which was
not executed due to the taking of the interrupt or trap.
This is the next instruction to be executed upon interrupt
return, as indicated by "next" in the PC1 column.
Instruction Exceptions
For traps caused by the execution of an instruction (e.g.,
the Out of Range trap), the Program Counter 2 Register
contains the address of the instruction causing the trap.
In all of these cases, PC1 is in the "next" category. The
Exception Opcode Register contains the operation code
of the instruction causing the trap.
The traps associated with instruction fetches (i.e., those
of Priority 13) occur only if the processor attempts the
execution of the associated instruction. An exception
may be detected during an instruction prefetch, but the
associated trap does not occur if a nonsequential fetch
occurs before the processor attempts the execution of
the invalid instruction. This prevents the spurious indication of instruction exceptions.

Priority     Type of Interrupt or Trap              Inst/Async  PC1   Channel Regs
1 (highest)  WARN                                   async       next  see Note 1
2            User-Mode Data TLB Miss                inst        next  all
             Supervisor-Mode Data TLB Miss          inst        next  all
             Data TLB Protection Violation          inst        next  all
3            Unaligned Access                       inst        next  all
             Coprocessor not Present                inst        next  all
             Out of Range                           inst        next  N/A
             Floating-Point Exceptions              inst        next  N/A
             Assert Instructions                    inst        next  N/A
             Floating-Point Instructions            inst        next  N/A
             MULTIPLY                               inst        next  N/A
             MULTM                                  inst        next  N/A
             DIVIDE                                 inst        next  N/A
             MULTIPLU                               inst        next  N/A
             MULTMU                                 inst        next  N/A
             DIVIDU                                 inst        next  N/A
             EMULATE                                inst        next  N/A
4            Data Access Exception                  async       next  all
             Coprocessor Exception                  async       next  all
5            TRAP0                                  async       next  multiple
6            TRAP1                                  async       next  multiple
7            INTR0                                  async       next  multiple
8            INTR1                                  async       next  multiple
9            INTR2                                  async       next  multiple
10           INTR3                                  async       next  multiple
11           Timer                                  async       next  multiple
12           Trace                                  async       next  multiple
13           User-Mode Instruction TLB Miss         inst        curr  N/A
             Supervisor-Mode Instruction TLB Miss   inst        curr  N/A
             Instruction TLB Protection Violation   inst        curr  N/A
             Instruction Access Violation           inst        curr  N/A
14 (lowest)  Illegal Opcode                         inst        curr  N/A
             Protection Violation                   inst        curr  N/A

Note 1: The Channel Address, Channel Data, and Channel Control registers are set for a WARN trap
only if an external access is in progress when the trap is taken.

Figure 57. Interrupt and Trap Priority Table

Data Exceptions
The "Channel Regs" column of Figure 57 indicates the
cases for which the Channel Address, Channel Data,
and Channel Control registers contain information related to an external access or coprocessor transfer
(these registers collectively are termed "channel registers" in the following discussion). For the cases indicated, the access or transfer was not completed because of some exceptional condition. Note that the
Channel Data Register contains relevant information
only in the case of a store.
For the WARN trap, the channel registers are valid only if
a load or store was in progress when the trap was
taken. Recall that the WARN trap does not wait for any
in-progress access to be completed.
For the traps with an "all" in the "Channel Regs" column
of Figure 57, the channel registers contain information
relevant to the trap in all cases. These traps are associated with exceptional events during external accesses
or coprocessor transfers.
For the traps with a "multiple" in the "Channel Regs" column, the channel registers might contain information for
restarting an interrupted Load Multiple or Store Multiple
operation. In these cases, the operation did not encounter an exception, but was simply canceled for latency
considerations.
The information contained in the channel registers allows the processor to restart the related operation during an interrupt return sequence, without any special assistance by software. Software must only ensure that
the relevant information is retained in, or restored to, the
channel registers before an interrupt return is executed.

Arithmetic Exceptions
Integer and floating-point instructions can cause Out of
Range or Floating-Point Exception traps, respectively, if
an exception is detected during the arithmetic operation.
This section describes the conditions under which these
traps occur and the additional operations performed beyond those described in the Interrupt and Trap Handling
section.

Integer Exceptions
Some integer add and subtract instructions (ADDS,
ADDU, ADDCS, ADDCU, SUBS, SUBU, SUBCS,
SUBCU, SUBRS, SUBRU, SUBRCS, and SUBRCU) cause an Out of Range trap upon overflow or underflow
of a 32-bit signed or unsigned result, depending on the
instruction.
Two integer multiply instructions, MULTIPLY and
MULTIPLU, cause an Out of Range trap upon overflow
of a 32-bit signed or unsigned result, respectively, if the
MO bit of the Integer Environment Register is 0. If the
MO bit is 1, these multiply instructions cannot cause an
Out of Range trap.
Two integer divide instructions, DIVIDE and DIVIDU, take the Out of Range trap upon overflow of a 32-bit
signed or unsigned result, respectively, if the DO bit of
the Integer Environment Register is 0. If the DO bit is 1,
the divide instructions cannot cause an Out of Range
trap unless the divisor is 0. If the divisor is 0, an Out of
Range trap always occurs, regardless of the DO bit.
In addition to the operations described in the Interrupt
and Trap Handling section, the following operations are
performed when an Out of Range trap is taken:
1. The operation code of the instruction causing the
exception is placed in the IOP field of the Exception Opcode Register.
2. For the MULTIPLY, MULTIPLU, DIVIDE, and
DIVIDU instructions, the absolute register numbers of the excepting instruction's source and
destination registers are placed into the Indirect
Pointer A, Indirect Pointer B, and Indirect Pointer
C registers.
3. For the MULTIPLY, MULTIPLU, DIVIDE, and
DIVIDU instructions, the destination register or
registers are unchanged.

Floating-Point Exceptions
A Floating-Point Exception trap occurs when an exception is detected during a floating-point operation, and the
exception is not masked by the corresponding bit of the
Floating-Point Mask Register. In this context, a floating-point operation is defined as any operation that accepts
a floating-point number as a source operand, that produces a floating-point result, or both. Thus, for example,
the CONVERT instruction may create an exception
while attempting to convert a floating-point value to an
integer value.
In addition to the operations described in the Interrupt
and Trap Handling section, the following operations are
performed when a Floating-Point Exception trap is
taken:
1. The operation code of the instruction causing the
exception is placed in the IOP field of the Exception Opcode Register.
2. The status of the trapping operation is written
into the trap status bits of the Floating-Point
Status Register. The status bits that are written
do not depend on the values of the corresponding mask bits in the Floating-Point Environment
Register.
3. The absolute register numbers of the excepting
instruction's source and destination registers
are placed into the Indirect Pointer A, Indirect
Pointer B, and Indirect Pointer C registers. If the
RB or RC fields specify a function code, that
code is transferred to the corresponding indirect
pointer. Note that if the most-significant bit of
this function code is 1, the value of the Stack
Pointer has been added to the RS field and must
be subtracted to recover the original field.
4. The destination register or registers are left unchanged.
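The Out of Range conditions given under Integer Exceptions can be summarized as three predicates. The sketch below is an assumption-laden software model, not the processor's implementation: the function names are hypothetical, operands are plain Python integers, and the divide model simply checks whether the quotient of an arbitrary-width dividend fits in 32 bits (truncating toward zero is assumed here for the signed case).

```python
def fits_32(value: int, signed: bool) -> bool:
    """True if value is representable as a 32-bit signed or unsigned result."""
    if signed:
        return -(2 ** 31) <= value < 2 ** 31
    return 0 <= value < 2 ** 32


def add_traps(a: int, b: int, signed: bool) -> bool:
    """ADDS (signed) or ADDU (unsigned): trap on overflow or underflow."""
    return not fits_32(a + b, signed)


def multiply_traps(a: int, b: int, signed: bool, mo: int) -> bool:
    """MULTIPLY or MULTIPLU: trap on overflow only when the MO bit is 0."""
    if mo == 1:
        return False
    return not fits_32(a * b, signed)


def divide_traps(dividend: int, divisor: int, signed: bool, do_bit: int) -> bool:
    """DIVIDE or DIVIDU: a zero divisor always traps; overflow traps if DO is 0."""
    if divisor == 0:
        return True
    if do_bit == 1:
        return False
    # Assumed rounding: truncate the quotient toward zero.
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    return not fits_32(quotient, signed)


if __name__ == "__main__":
    print(add_traps(2 ** 31 - 1, 1, signed=True))
    print(multiply_traps(2 ** 16, 2 ** 16, signed=False, mo=1))
    print(divide_traps(1, 0, signed=True, do_bit=1))
```

The divide-by-zero check is placed ahead of the DO test because the text states that this trap occurs regardless of the DO bit.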

Exceptions During Interrupt and Trap Handling
In most cases, interrupt and trap handling routines are
executed with the DA bit in the Current Processor Status
having a value of 1. It is assumed that these routines do
not create many of the exceptions possible in most other
processor routines, so most of these are ignored.
If the assumption of no exceptions is not valid for a particular interrupt or trap handler, it is important that the
handler save the state of the processor and reset the FZ
bit of the Current Processor Status, so that the handler
itself may be restarted properly. This must be accomplished before any interrupts or traps can be taken. In
this case, the state (or the state of some other process)
must be restored before an interrupt return is executed.

It is possible that errors reported via the IERR and DERR
signals are associated with hardware errors, independent of any routine being executed. For this reason, the
Instruction Access Exception, Data Access Exception,
and Coprocessor Exception traps cannot be disabled by
the DA bit, and the processor may take one of these
traps even while handling another interrupt or trap.
If the processor does take an unmaskable trap while
handling another interrupt or trap, and the state of the
interrupt or trap handler is not reflected in processor registers, it is not possible to return to the point at which the
unmaskable trap is taken. When the unmaskable trap is
taken, the processor state saved is that state associated
with the original interrupt or trap, not with the unmaskable trap; however, the Old Processor Status Register is
modified to reflect the Current Processor Status Register of the interrupt or trap handler. This situation, indicated by the DA bit being 1 in the Old Processor Status
Register, may not be recoverable.

MEMORY MANAGEMENT
The Am29000 incorporates a Memory Management
Unit (MMU) for performing virtual-to-physical address
translation and memory access protection. This section
describes the logical operation of the Memory Management Unit.
Address translation can be performed only for instruction/data memory accesses. No address translation is
performed for instruction ROM, input/output, coprocessor, or interrupt/trap vector accesses. However, an instruction/data memory access can be redirected to input/output by the address-translation process.

Translation Look-Aside Buffer
The MMU stores the most recently performed address
translations in a special cache, the Translation Look-Aside Buffer (TLB). All virtual addresses generated by
the processor are translated by the TLB. Given a virtual
address, the TLB determines the corresponding physical address.
The TLB reflects information in the processor system
page tables, except that it specifies the translation for
many fewer pages; this restriction allows the TLB to be
incorporated on the processor chip where the performance of address translation is maximized.
A diagram of the TLB is shown in Figure 58. The TLB is a
table of 64 entries, divided into two equal sets, called Set
0 and Set 1. Within each set, entries are numbered 0 to
31. Entries in different sets that have equivalent entry
numbers are grouped into a unit called a line; there are
thus 32 lines in the TLB, numbered 0 to 31.

Figure 58. Translation Look-Aside Buffer Organization (TLB Set 0 and TLB Set 1 shown side by side; Lines 0 to 31, each holding the same-numbered entry of both sets; each entry is 64 bits wide)

Each TLB entry is 64 bits long and contains mapping
and protection information for a single virtual page. TLB
entries may be inspected and modified by processor instructions executed in the Supervisor mode. The layout
of TLB entries is described in the Register Description
section.
The TLB stores information about the ownership of the
TLB entries in an 8-bit Task Identifier (TID) field in each
entry. This makes it possible for the TLB to be shared by
several independent processes without the need for invalidation of the entire TLB as processes are activated.
It also increases system performance by permitting
processes to warm-start (i.e., to start execution on the
processor with a certain number of TLB entries remaining in the TLB from a previous execution).
Each TLB entry contains a Usage bit to assist management of the TLB entries. The Usage bit indicates which
set of the entry within a given line was least recently
used to perform an address translation. Usage bits for
two entries in the same line are equivalent.
The TLB contains other fields, described in the following
sections.

Address Translation
For the purpose of address translation, the virtual
instruction/data address space of a process is partitioned into regions of fixed size, called pages. Pages are
mapped by the address-translation process into equivalent-sized regions of physical memory, called page
frames. All accesses to instructions or data contained
within a given page use the same virtual-to-physical
address translation.
Virtual addresses are partitioned into three fields for the
address-translation process, as shown in Figure 59.
The partitioning of the virtual address is based on the
page size. Page sizes may be of 1, 2, 4, or 8 kb, as
specified by the MMU Configuration Register. The fields
shown in Figure 59 are described in the following
discussion.

Figure 59. Virtual Address for 1-, 2-, 4-, and 8-kb Pages (one 32-bit layout, bits 31 to 0, for each page size)

Address Translation Controls
The processor attempts to perform address translation
for the following external accesses:
1. Instruction accesses, if the Physical Addressing/
Instructions (PI) and ROM Enable (RE) bits of
the Current Processor Status are both 0.
2. User-mode accesses to instruction/data memory if the Physical Addressing/Data (PD) bit of
the Current Processor Status is 0.
3. Supervisor-mode accesses to instruction/data
memory if the Physical Address (PA) bit of the
load or store instruction performing the access is
0, and the PD bit of the Current Processor Status
is 0.
Address translation also is controlled by the MMU Configuration Register. This register specifies the virtual
page size and contains an 8-bit Process Identifier (PID)
field. The PID field specifies the process number associated with the currently running program, if this is a User-mode program. Supervisor-mode programs are assigned a fixed process number of 0. The process number is compared with Task Identifier (TID) fields of the
TLB entries during address translation. The TID field of
a TLB entry must match the process number for the
translation to be valid.

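The conditions listed under Address Translation Controls can be sketched as a pair of small functions. This is an illustrative model only; the function and parameter names are hypothetical, with `pi`, `re`, and `pd` standing for the PI, RE, and PD bits of the Current Processor Status and `pa` for the PA bit of the load or store instruction.

```python
def translation_attempted(access: str, *, supervisor: bool = False,
                          pi: int = 0, re: int = 0,
                          pd: int = 0, pa: int = 0) -> bool:
    """Return True if the processor attempts address translation.

    access is "instruction" or "data" (a load or store to
    instruction/data memory).
    """
    if access == "instruction":
        # Item 1: PI and RE must both be 0.
        return pi == 0 and re == 0
    if access == "data":
        if supervisor:
            # Item 3: the instruction's PA bit and the PD bit must both be 0.
            return pa == 0 and pd == 0
        # Item 2: User-mode access, PD must be 0.
        return pd == 0
    raise ValueError(f"unknown access type: {access!r}")


def process_number(supervisor: bool, pid: int) -> int:
    """Process number compared with the TID field of each TLB entry."""
    # Supervisor-mode programs use the fixed process number 0.
    return 0 if supervisor else pid & 0xFF


if __name__ == "__main__":
    print(translation_attempted("instruction", re=1))
    print(translation_attempted("data", supervisor=True, pa=1))
    print(process_number(False, 5), process_number(True, 5))
```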
Address Translation Process
The address-translation process is diagrammed in
Figure 60. Address translation is performed by the following fields in the TLB entry: the Virtual Tag (VTAG),
the Task Identifier (TID), the Valid Entry (VE) bit, the
Real Page Number (RPN) field, and the Input/Output
(IO) bit. To perform an address translation, the processor accesses the TLB line whose number is given by
certain bits in the virtual address. The bits used depend
on the page size as follows:

Page Size    Virtual Address Bits (for Line Access)
1 kb         14-10
2 kb         15-11
4 kb         16-12
8 kb         17-13

The accessed line contains two TLB entries, which in
turn contain two VTAG fields. The VTAG fields are both
compared to bits in the virtual address. This comparison
depends on the page size as follows (note that VTAG
bit-numbers are relative to the VTAG field, not the TLB
entry):

Page Size    Virtual Address Bits    VTAG Bits
1 kb         31-15                   16-0
2 kb         31-16                   16-1
4 kb         31-17                   16-2
8 kb         31-18                   16-3

Certain bits of the VTAG field do not participate in the
comparison for page sizes larger than 1 kb. These bits of
the VTAG field are required to be 0.

Figure 60. Address Translation Process (the virtual address selects a line; the Virtual Tag and Task ID of the entries in TLB Set 0 and TLB Set 1 are compared; the selected entry supplies the Real Page Number, PGM, U, and IO fields, producing the Physical Address, the MPGM0-MPGM1 outputs, and the Protection Violation indication)

For an address translation to be valid, the following conditions must be met:
1. The virtual address bits match corresponding
bits of the VTAG field as specified above.
2. For a User-mode access, the TID field in the TLB
entry matches the PID field in the MMU Configuration Register. For a Supervisor-mode access,
the TID field is 0.
3. The VE bit in the TLB entry is 1.
4. Only one entry in the line meets conditions 1, 2,
and 3 above. If this condition is not met, the results of the translation may be treated as valid by
the processor, but the results are unpredictable.
If the address translation is valid for one TLB entry in the
selected line, the RPN field in this entry is used to form
the physical address of the access. The RPN field gives
the portion of the physical address that depends on
the translation; the remaining portion of the virtual address, called the Page Offset, is invariant with address
translation.
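The line selection, tag comparison, and validity conditions described above can be sketched in software as follows. This is a minimal model under stated assumptions: the `TlbEntry` class and the function names are hypothetical, the VTAG is treated as a 17-bit value holding virtual address bits 31-15, and the case of more than one matching entry (which the text calls unpredictable) is reported as an error purely as a modeling choice.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Page size in kb -> number of Page Offset bits (from the tables above).
OFFSET_BITS = {1: 10, 2: 11, 4: 12, 8: 13}


@dataclass
class TlbEntry:
    vtag: int  # Virtual Tag, VTAG bits 16-0
    tid: int   # Task Identifier
    ve: int    # Valid Entry bit


def line_number(va: int, page_kb: int) -> int:
    """Select one of the 32 lines using the five bits above the Page Offset."""
    return (va >> OFFSET_BITS[page_kb]) & 0x1F


def tag_matches(va: int, vtag: int, page_kb: int) -> bool:
    """Compare the virtual address with the VTAG bits used for this page size."""
    unused = OFFSET_BITS[page_kb] - 10  # low VTAG bits that do not participate
    return ((va >> 15) >> unused) == (vtag >> unused)


def lookup(line: Sequence[TlbEntry], va: int, page_kb: int,
           process_number: int) -> Optional[TlbEntry]:
    """Return the single valid matching entry of a line, or None on a miss."""
    hits = [e for e in line
            if e.ve == 1
            and e.tid == process_number
            and tag_matches(va, e.vtag, page_kb)]
    if len(hits) > 1:
        # Condition 4 violated; the hardware result is unpredictable.
        raise ValueError("more than one entry in the line matches")
    return hits[0] if hits else None


if __name__ == "__main__":
    va = 0x00012C34
    entries = [TlbEntry(vtag=va >> 15, tid=5, ve=1), TlbEntry(vtag=0, tid=0, ve=0)]
    print(line_number(va, 1), lookup(entries, va, 1, process_number=5))
```

For a User-mode access the `process_number` argument would be the PID field of the MMU Configuration Register, and for a Supervisor-mode access it would be 0.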

The Page Offset comprises the low-order bits of the virtual address, and gives the location of a byte (because
of byte addressing) within the virtual page. This byte is
located at the same position in the physical page frame,
so the Page Offset also comprises the low-order bits of
the physical address.
The 32-bit physical address is the concatenation of certain bits of the RPN field and Page Offset, where the bits
from each depend on the page size as follows (note that
RPN bit numbers are relative to the RPN field, not the
TLB entry):

Page Size    RPN Bits    Virtual Address Bits for Page Offset
1 kb         21-0        9-0
2 kb         21-1        10-0
4 kb         21-2        11-0
8 kb         21-3        12-0

Note that certain bits of the RPN field are not used in
forming the physical address for page sizes greater than
1 kb. These bits of the RPN are required to be 0. In addition, for certain instruction accesses, the Page Offset is
incremented by 16.
The address space of the physical address is determined by the Input/Output (IO) bit of the TLB entry. If the
IO bit is 0, the address is in the instruction/data memory
address space. If the IO bit is 1, the address is in the input/output address space.
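The concatenation of RPN bits and Page Offset can be written out directly from the table above. The sketch below is illustrative only; the function names are hypothetical, and the RPN is assumed to be supplied as the 22-bit field value (RPN bits 21-0).

```python
# Page size in kb -> number of Page Offset bits (9-0, 10-0, 11-0, 12-0).
OFFSET_BITS = {1: 10, 2: 11, 4: 12, 8: 13}


def physical_address(rpn: int, va: int, page_kb: int) -> int:
    """Form the 32-bit physical address from the RPN field and Page Offset."""
    offset_bits = OFFSET_BITS[page_kb]
    unused = offset_bits - 10            # low RPN bits not used for this size
    offset = va & ((1 << offset_bits) - 1)
    frame = (rpn >> unused) << offset_bits
    return (frame | offset) & 0xFFFFFFFF


def address_space(io_bit: int) -> str:
    """Address space selected by the IO bit of the TLB entry."""
    return "input/output" if io_bit == 1 else "instruction/data memory"


if __name__ == "__main__":
    print(hex(physical_address(0x3, 0x12C34, 1)))
    print(hex(physical_address(0x4, 0x12C34, 4)))
    print(address_space(0))
```

Shifting the unused low RPN bits out before concatenation keeps the result correct even if software has not cleared them, although the text requires those bits to be 0.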
Successful and Unsuccessful Translations
If an address translation is successful, the TLB entry is
further used to perform protection checking for the access. Bits in the TLB make it possible to restrict accesses, independently for Supervisor-mode and User-mode accesses, to any combination of load, store, and
instruction accesses, or to no access.
If the address translation is valid and no protection violation is detected, the physical address from the translation is placed on the processor's address bus and the
access is initiated. If the translation is not valid or a protection violation is detected, a trap occurs. Depending
on the state of the channel interface, the access request
may be placed on the address bus with the signal BINV
asserted, even though the trap occurs.
Also, if the address translation is successful and there is
no protection violation, the PGM bits from the TLB entry
used for translation are placed on the MPGM1-MPGM0
outputs during the address cycle for the access. If address translation is not performed, these pins are both
Low for the address cycle.
If the TLB cannot translate an address, a TLB miss occurs. The MMU causes a trap if either a TLB miss occurs, or the translation is successful and a protection
violation is detected. The processor distinguishes between traps caused by instruction and data accesses,
and between traps caused by User-mode and Supervisor-mode accesses, as follows:

Trap Vector Number    Type of Trap
8                     User-Mode Instruction TLB Miss
9                     User-Mode Data TLB Miss
10                    Supervisor-Mode Instruction TLB Miss
11                    Supervisor-Mode Data TLB Miss
12                    Instruction TLB Protection Violation
13                    Data TLB Protection Violation

The distinction between the above traps is made to
assist trap handling, particularly the routines that load
TLB entries.
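The mapping from the kind of failure to the trap vector number in the table above can be expressed compactly. This is a sketch for illustration; the function name and the string arguments are hypothetical.

```python
def mmu_trap_vector(access: str, supervisor: bool, cause: str) -> int:
    """Return the trap vector number for an MMU trap.

    access: "instruction" or "data"
    cause:  "miss" (TLB miss) or "protection" (protection violation)
    """
    if access not in ("instruction", "data"):
        raise ValueError(f"unknown access type: {access!r}")
    is_data = access == "data"
    if cause == "miss":
        # Vectors 8 and 9 are User-mode misses, 10 and 11 Supervisor-mode.
        return (10 if supervisor else 8) + (1 if is_data else 0)
    if cause == "protection":
        # Protection violations are not split by program mode.
        return 13 if is_data else 12
    raise ValueError(f"unknown cause: {cause!r}")


if __name__ == "__main__":
    print(mmu_trap_vector("data", supervisor=False, cause="miss"))
    print(mmu_trap_vector("instruction", supervisor=True, cause="protection"))
```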

Reload
So that the MMU may support a large variety of memory-management architectures, it does not directly load TLB
entries that are required for address translation. It simply causes a TLB miss trap when address translation is
unsuccessful. The trap causes a program, called the
TLB reload routine, to execute. The TLB reload routine
is defined according to the structure and access method
of the page table contained in an external device or
memory.
When a TLB miss trap occurs, the LRU Recommendation Register is written with the TLB register number for
Word 0 of the TLB entry to be used by the TLB reload
routine. For instruction accesses, the Program Counter
1 Register contains the instruction address that was not
successfully translated. For data accesses, the Channel
Address Register contains the data address that was
not successfully translated.
The TLB reload routine determines the translation for
the address given by the Program Counter 1 Register or
Channel Address Register, as appropriate. The TLB
reload routine uses an external page table to determine
the required translation, and loads the TLB entry indicated by the LRU Recommendation Register so that the
entry may perform this translation. In a demand-paged
environment, the TLB reload routine may additionally invoke a page-fault handler when the translation cannot
be performed.
TLB entries are written by the Move To TLB (MTTLB)
instruction, which copies the contents of a general-purpose register into a TLB register. The TLB register
number is specified by bits 6-0 of a general-purpose
register. TLB entries are read by the Move From
TLB (MFTLB) instruction, which copies the contents of
a TLB register into a general-purpose register. Again,
the TLB register number is specified by a general-purpose register.

Entry Invalidation
There are two methods for invalidating TLB entries that
are no longer required at a given point in program execution. The first involves resetting the Valid Entry bit of a
single entry (this is done by a Move To TLB instruction).
The second involves changing the value of the Process
Identifier (PID) field of the MMU Configuration Register;
this invalidates all entries whose Task Identifier (TID)
fields do not match the new value.
If an entry is invalidated by changing the PID field, the
TLB entry still remains valid in some sense. If the PID
field is changed again to match the TID field, the entry
may once again participate in address translation. This
ability can be used to reduce the number of TLB misses
in a system during process switching. However, it is important to manage TLB entries so that an invalid match
cannot occur between the PID field and the TID field of
an old TLB entry.

Protection
If an address translation is performed successfully, the
TLB entry used in address translation is used to perform
protection checking for the access. There are 6 bits in
the TLB entry for this purpose: Supervisor Read (SR),
Supervisor Write (SW), Supervisor Execute (SE), User
Read (UR), User Write (UW), and User Execute (UE).
These bits restrict accesses, depending on the program
mode of the access, as shown in Figure 61 (the value "x"
is a "don't care").
Note that for the Load and Set (LOADSET) instruction,
the protection bits must be set to allow both the load and
store access. If this condition does not hold, neither access is performed.
If protection checking indicates that a given access is
not allowed, a Data TLB Protection Violation or Instruction TLB Protection Violation trap occurs. The cause of
the trap is determined by inspection of the Program
Counter 1 Register for an Instruction TLB Protection
Violation, or by inspection of the contents of the Channel
Address and Channel Control registers for a Data TLB
Protection Violation.

SR  SW  SE  UR  UW  UE   Type of Access Allowed
x   x   x   0   0   0    No user access
x   x   x   0   0   1    User instruction
x   x   x   0   1   0    User store
x   x   x   0   1   1    User store or instruction
x   x   x   1   0   0    User load
x   x   x   1   0   1    User load or instruction
x   x   x   1   1   0    User load or store
x   x   x   1   1   1    Any user access
0   0   0   x   x   x    No supervisor access
0   0   1   x   x   x    Supervisor instruction
0   1   0   x   x   x    Supervisor store
0   1   1   x   x   x    Supervisor store or instruction
1   0   0   x   x   x    Supervisor load
1   0   1   x   x   x    Supervisor load or instruction
1   1   0   x   x   x    Supervisor load or store
1   1   1   x   x   x    Any supervisor access

Figure 61. TLB Access Protection
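
The protection rules of Figure 61 can be sketched in a few lines of Python. This is only an illustrative model, not processor logic: the class name TlbProtection, the function access_allowed, and the string labels for access kinds are assumptions made for the example. The bit names and the LOADSET rule (both load and store must be permitted) come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TlbProtection:
    """The six protection bits of a TLB entry (names as in Figure 61)."""
    sr: bool = False  # Supervisor Read
    sw: bool = False  # Supervisor Write
    se: bool = False  # Supervisor Execute
    ur: bool = False  # User Read
    uw: bool = False  # User Write
    ue: bool = False  # User Execute


def access_allowed(bits: TlbProtection, supervisor: bool, kind: str) -> bool:
    """Return True if the access passes protection checking.

    kind is a hypothetical label: "load", "store", "instruction" or "loadset".
    Only the three bits for the current program mode are consulted; the
    other three are "don't care".
    """
    if supervisor:
        read, write, execute = bits.sr, bits.sw, bits.se
    else:
        read, write, execute = bits.ur, bits.uw, bits.ue
    if kind == "load":
        return read
    if kind == "store":
        return write
    if kind == "instruction":
        return execute
    if kind == "loadset":
        # LOADSET needs both the load and the store to be permitted.
        return read and write
    raise ValueError(f"unknown access kind: {kind}")
```

For example, an entry with SR, SW, SE and UR set permits a user load but rejects a user store and a user LOADSET, while a supervisor LOADSET is permitted.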


CHANNEL DESCRIPTION
The processor channel provides the bandwidth required
for performance, while permitting the connection of
many different types of devices. This section describes
the channel and methods of connecting devices and
memories to the processor.
The channel consists of three 32-bit synchronous buses
with associated control and status signals: the Address
Bus, Data Bus, and Instruction Bus. The Address Bus
transfers addresses and control information to devices
and memories. The Data Bus transfers data to and from
devices and memories. The Instruction Bus transfers instructions to the processor from instruction memories.
In addition, a set of signals allows control of the channel
to be relinquished to an external master.
There are five logical groups of signals performing five
distinct functions, as follows (since some signals perform more than one function, a signal may appear in
more than one group):

1. Instruction Address Transfer and Instruction Access Requests: A31-A0, SUP/US, MPGM1-MPGM0, PEN, IREQ, IREQT, PIA, BINV
2. Instruction Transfer: I31-I0, IBREQ, IRDY, IERR, IBACK
3. Data Address Transfer and Data Access Requests: A31-A0, R/W, SUP/US, LOCK, MPGM1-MPGM0, PEN, DREQ, DREQT1-DREQT0, OPT2-OPT0, PDA, BINV
4. Data Transfer: D31-D0, DBREQ, DRDY, DERR, DBACK, CDA
5. Arbitration: BREQ, BGRT, BINV

User-Defined Signals
There are two types of user-defined outputs on the processor to control devices and memories directly in a system-dependent manner. Each of these outputs is valid
simultaneously with, and for the same duration as, the address for an access.
The first set of user-defined signals, MPGM1-MPGM0,
is determined by the PGM bits in the Translation Look-Aside Buffer entry used in address translation. If address translation is not performed, these outputs are
both Low.
The second set of signals, OPT2-OPT0, is determined
by bits 18-16 of the load or store instruction that initiates
an access. These signals are valid only for data accesses, and have a predefined interpretation for
coprocessor data transfers.
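
As a small illustration of the bit positions involved, the following Python sketch extracts bits 18-16 of a 32-bit instruction word and maps them onto the three OPT outputs. The function names are hypothetical, and the example word below is not a real Am29000 encoding; it only has bits 18 and 16 set. Bit 0 is taken to be the least-significant bit.

```python
def opt_field(instruction_word: int) -> int:
    """Return bits 18-16 of a 32-bit load or store instruction word."""
    return (instruction_word >> 16) & 0b111


def opt_outputs(instruction_word: int) -> dict:
    """Map the 3-bit field onto the OPT2, OPT1 and OPT0 outputs."""
    opt = opt_field(instruction_word)
    return {
        "OPT2": (opt >> 2) & 1,
        "OPT1": (opt >> 1) & 1,
        "OPT0": opt & 1,
    }
```

With the illustrative word 0x00050000, opt_field returns 5, so OPT2 and OPT0 are High and OPT1 is Low.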
Standard interpretations of OPT2-OPT0 are given in the
Pin Description section. Since the OPT2-OPT0 signals
are determined by instructions, they have an impact on
application-software compatibility, and system hardware should use the given definitions of OPT2-OPT0.

The OPT2-OPT0 signals are used to encode byte and
half-word accesses. However, for a load, the system
should return an entire aligned word, regardless of the
indicated data width.
Note that the standard interpretations of OPT2-OPT0
apply only to accesses to instruction/data memory and
input/output. Other interpretations may be used for
coprocessor transfers.
For interrupt and trap vector fetches, the MPGM1-MPGM0 and OPT2-OPT0 outputs are all Low.

Instruction Accesses
Instruction accesses occur to one of two address
spaces: instruction/data memory and instruction read-only memory (instruction ROM). The distinction between these address spaces is made by the IREQT signal, which is in turn derived from the ROM Enable (RE)
bit of the Current Processor Status Register. These are
truly distinct address spaces; each may be populated independently based on the needs of a particular system.
Instruction/data memory contains both instructions
and data. Although the channel supports separate
instruction and data memories, the Memory Management Unit does not. In certain systems, it may be required to access instructions via loads and stores, even
though instructions may be contained in physically
separate memories. For example, this requirement
might be imposed because of the need to load instructions into memory. Note also that the OPT2-OPT0 signals may be used to allow the access of instructions in
instruction ROM, using loads; the Am29000 does not
prevent a store to the instruction ROM, and protection
against stores to the instruction ROM must be provided
externally, if required.
All processor instruction fetches are read accesses, and
the R/W signal is High for all instruction fetches.

Data Accesses
Data accesses occur to one of three address spaces:
instruction/data memory, input/output (I/O), and the
coprocessor. The distinction between these spaces is
made by the DREQT1-DREQT0 signals, which are in
turn determined by the load or store instruction that initiates a data access. Each of these address spaces is distinct from the others.
The protocol for data transfers to and from the coprocessor is slightly different from the protocol for instruction/data memory and I/O accesses.
Data accesses may occur either from a slave device or
memory to the processor (for a load), or from the processor to a slave device or memory (for a store). The direction of transfer is determined by the R/W signal. In
the case of a load, the processor requires that data on
the data bus be held valid only for a short time before the
end of a cycle. In the case of a store, the processor
drives the data bus as soon as the bus is available and
holds the data valid until the slave device or memory signals that the access is complete.

Reporting Errors
The successful completion of an instruction access is indicated by an active level on the IRDY input, and the successful completion of a data access is indicated by an
active level on the DRDY input. If there are exceptional
conditions for which an instruction or data access cannot be completed successfully, the unsuccessful completion is indicated by an active level on the IERR or
DERR input, as appropriate.
If the processor receives an IERR or DERR in response
to an instruction or data access, it ignores the content of
the instruction or data bus and the value of IRDY or
DRDY. An IERR response causes an Instruction Access
Exception trap, unless it is associated with an instruction
that the processor does not ultimately execute (because
of a nonsequential instruction fetch). A DERR response
always causes either a Data Access Exception trap or a
Coprocessor Exception trap.
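
The way the ready and error inputs combine can be summarized with a short Python sketch. The function names and the returned strings are illustrative assumptions; the rules themselves (an error response overrides the ready input, and IERR is harmless for an instruction that is never executed) are taken from the paragraphs above.

```python
def instruction_access_outcome(irdy: bool, ierr: bool, executed: bool = True) -> str:
    """Classify the response to an instruction access for one cycle."""
    if ierr:
        # IRDY and the instruction bus contents are ignored on IERR.
        # No trap is taken if the instruction is never executed
        # (for example, because of a nonsequential fetch).
        return "Instruction Access Exception" if executed else "discarded"
    return "complete" if irdy else "pending"


def data_access_outcome(drdy: bool, derr: bool) -> str:
    """Classify the response to a data access for one cycle."""
    if derr:
        # DRDY and the data bus contents are ignored on DERR; the trap is
        # a Data Access Exception or a Coprocessor Exception.
        return "trap"
    return "complete" if drdy else "pending"
```

Note that an error response takes priority even when the ready input is active in the same cycle.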

The processor supports the restarting of unsuccessful
accesses upon an interrupt return. In the case of an unsuccessful instruction access, the restart is performed
by the Program Counter 0 and Program Counter 1 registers. In the case of an unsuccessful data access, the restart is performed by the Channel Address, Channel
Data, and Channel Control registers. In any event, the
control program must determine whether or not an access can and/or should be restarted.
The Instruction Access Exception and Data Access Exception traps cannot be masked. If one of these traps
occurs within an interrupt or trap handler, the processor
state may not be recoverable.

Access Protocols
Figure 62 shows a control flowchart for accesses performed by the Am29000. This control flow applies independently to both instruction and data accesses. Since
the processor performs concurrent instruction and data
accesses, these accesses may be at different points in
the control flow at any given point in time.
Note that the items on the flowchart of Figure 62 do not
represent actual states and have no particular relationship to processor cycles. The flowchart provides only a
high-level understanding of the control flow. Also, exceptions and error conditions are not shown.
The channel supports three protocols for accesses: simple, pipelined, and burst-mode. These are described in
the following sections. The various protocols are defined to accommodate minimum-latency accesses as
well as maximum-transfer-rate accesses. The protocols
allow an access to complete in a single cycle, although
they support accesses requiring arbitrary numbers of
cycles. Address transfers for accesses may be independent of instruction or data transfers.

Simple Accesses
For a simple access, the processor holds the address
valid throughout the entire access. This protocol is used
for single-cycle accesses, and for accesses to simple
devices and memories.
On any cycle before the completion of the access, a simple access may be converted to a pipelined access (by
the assertion of PEN) or to a burst-mode access (by the
assertion of IBACK or DBACK, if the processor is asserting IBREQ or DBREQ). Thus, the protocol for simple accesses also may be used during the initial cycles of
pipelined and/or burst-mode accesses. This is advantageous, for example, in cases where the slave device or
memory either requires the address to be held for multiple cycles at the beginning of the pipelined or burst-mode access, or cannot respond to the pipelined or
burst-mode request within one cycle.

Pipelined Accesses
A pipelined access is one that starts before an earlier in-progress access has completed. The in-progress access
is called a primary access and the second access is
called a pipelined access. A pipelined access is of the
same type as the primary access. For example, an instruction access that begins before the completion of a
data access is not considered to be a pipelined access,
whereas a second data access is.
The Am29000 allows only one pipelined access at any
given time.
Tradeoffs
For accesses that require more than one cycle to complete, pipelined accesses perform better than simple accesses because they allow the overlap of portions of two
accesses. In addition, the ability to latch addresses in
support of pipelined accesses reduces utilization of the
address bus, thereby reducing contention between instruction and data accesses. However, devices and
memories that support pipelined accesses are somewhat more complex than devices and memories that
support only simple accesses.
Support for pipelined operations is required for both the
primary access and the pipelined access. The slave performing the primary access must contain some means
for storing the address and other information about the
access. The slave performing the pipelined access must
be able to restrict its use of the instruction bus or data
bus, and must be prepared to cancel the access (as explained below).
Pipelined Operation
Pipelined accesses are controlled by the signals PEN,
PIA, and PDA. Because of internal data-flow constraints, the Am29000 does not perform a pipelined
store operation while a load is in progress. However, the
channel protocol itself does not impose this restriction: other
channel masters may perform a pipelined store during
a load.

Figure 62. Channel Flowchart (processor and slave-device control flow through the no-access, primary-access, and pipelined-access conditions)

Except as noted above, the processor attempts to perform pipelining for every access; the input PEN indicates
whether or not pipelining is supported for a given access. The PEN input can be driven by individual devices,
or can be tied active or inactive to enable or disable system-wide pipelined accesses. The processor ignores
the value of PEN unless it is performing an access.
The processor samples PEN on every cycle during a primary access. If PEN is active on any cycle, the processor ceases to drive the address and associated controls
for the primary access in the next cycle. If the processor
requires another access before the primary access is
completed, it drives the address and controls for the
second access, asserting PIA or PDA to indicate that the
second access is a pipelined access.
The output IREQ or DREQ, as appropriate, is not asserted for a pipelined access. Devices and memories
that cannot support pipelined accesses should therefore ignore PIA and/or PDA, and base their operation
upon IREQ and/or DREQ.
A device or memory that receives a request for a
pipelined access may treat it as any other access, with
one exception: the pipelined access cannot use the instruction and data buses or the associated controls
(e.g., IRDY or DRDY). In the case of a data read or instruction access, the results of the pipelined access
cannot be driven on the appropriate bus. In the case of a
data write, the data do not appear on the data bus. Any
other operations for the access, such as address decoding, can occur.
When the primary access is completed (as indicated by
IRDY or DRDY), the pipelined access becomes a primary access. The processor indicates this by asserting
IREQ or DREQ, depending on the type of access. The
device or memory performing the pipelined access may
complete the access as soon as IREQ or DREQ is asserted (possibly in the same cycle). When the access
becomes a primary access, it controls the channel as
any other primary access. For example, it may determine whether or not another pipelined access can be
performed.
When the pipelined access becomes a primary access,
the output PIA or PDA remains asserted for one cycle to
ensure continuity of control within the slave device or
memory. In the cycle after IREQ or DREQ is asserted,
PIA or PDA is deasserted unless the processor initiates
another pipelined access, in which case PIA or PDA remains asserted for the new access.
Cancellation of Pipelined Accesses
If the processor takes an interrupt or trap before a
pipelined access becomes a primary access, the request for the pipelined access is removed from the
channel. This may occur, for example, when IERR or
DERR is signaled for the primary access.
If the pipelined access is removed from the channel, the
slave device or memory does not receive an IREQ or
DREQ for the pipelined access. Hence, the pipelined access does not become a primary access, and cannot be
completed. A pipelined access may be canceled in this
manner at any time before it becomes a primary access.
Because of this, a pipelined access should not change
the state of a slave device or memory until the pipelined
access becomes a primary access.
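
The rule that a pipelined access must not change slave state until it becomes primary can be illustrated with a small behavioral model in Python. This is a sketch under stated assumptions, not a description of any real device: the class PipelinedSlave and its method names are hypothetical, the slave is modeled as a word store, and only a pipelined data write is shown. Note that the store data is supplied only when the access becomes primary, since the data does not appear on the data bus during the pipelined phase.

```python
class PipelinedSlave:
    """Toy slave that latches a pipelined address but defers side effects."""

    def __init__(self):
        self.memory = {}             # committed state of the slave
        self.latched_address = None  # address of an uncommitted pipelined access

    def pipelined_request(self, address: int) -> None:
        """PDA asserted without DREQ: decode and latch the address only."""
        self.latched_address = address

    def becomes_primary(self, data: int) -> bool:
        """DREQ asserted: the access is now primary and may complete."""
        if self.latched_address is None:
            return False
        self.memory[self.latched_address] = data
        self.latched_address = None
        return True

    def request_removed(self) -> None:
        """Request removed from the channel: discard it, leaving state untouched."""
        self.latched_address = None
```

A canceled pipelined write therefore leaves the memory dictionary unchanged, while a write that becomes primary is committed at that point.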

Burst-Mode Accesses
A burst-mode access allows multiple instructions or
data words at sequential addresses to be accessed with
a single address transfer. The number of accesses performed and the timing of each access within the sequence are controlled dynamically by the burst-mode
protocol. Burst-mode accesses take advantage of sequential addressing patterns, and provide several benefits over simple and pipelined accesses:
1. Simultaneous instruction and data accesses.
Burst-mode accesses reduce the utilization of
the address bus. This is especially important for
instruction accesses, which are normally sequential. Burst-mode instruction accesses eliminate most of the address transfers for instructions, allowing the address bus to be used for simultaneous data accesses.
2. Faster access times. By eliminating the address-transfer cycle, burst-mode accesses allow addresses to be generated in a manner that
improves access times.
3. Faster memory access modes. Many memories
have special high-bandwidth access modes
(e.g., fast page mode DRAM). These modes
generally require a sequential addressing pattern, even though addresses may not be presented explicitly to the memory for all accesses.
Burst-mode accesses allow the use of these access modes without hardware to detect sequential addressing patterns.
Burst-Mode Overview
The control-flow diagrams in Figure 63 and Figure 64 illustrate the operation of the processor and an instruction memory during a burst-mode instruction access.
The control-flow diagrams in Figure 65 and Figure 66 illustrate the operation of the processor and a data memory or device during a burst-mode data access. These
diagrams are for illustration only; nodes on these diagrams do not necessarily correspond to processor or
slave states, and transitions on these diagrams do not
necessarily correspond to processor cycles.

Figure 63. Processor Burst-Mode Instruction Accesses: Control Flow (IPB = Instruction Prefetch Buffer)
A burst-mode access is in one of the following operational conditions at any given time:
1. Established: The processor and slave device
have successfully initiated the
burst-mode access. A burst-mode access that has been established is either active or suspended. An established burst-mode access may become
preempted, terminated, or canceled.
2. Active: Instruction or data accesses and
transfers are being performed
as the result of the burst-mode
access. An active burst-mode
access may become suspended.
3. Suspended: No accesses or transfers are being performed as the result of
the burst-mode access, but the
burst-mode access remains established. Additional accesses
and transfers may occur at
some later time (i.e., the burst-mode access may become active) without the retransmission
of the address for the access.
4. Preempted: The burst-mode access can no
longer continue because of
some condition, but the burst-mode access can be reestablished within a short
amount of time.
5. Terminated: All required accesses have
been performed.
6. Canceled: The burst-mode access can no
longer continue because of

Figure 65. Processor Burst-Mode Data Accesses: Control Flow
Note: The Am29000 does not suspend burst-mode data accesses.

mode access on each subsequent address transfer, as
long as there are more accesses yet to be performed.
During any subsequent access, the addressed device or
memory may establish a burst-mode access by asserting IBACK or DBACK. If the burst-mode access is never
established, the default behavior is to have the processor transmit an address for every access.
Active and Suspended Burst-Mode Accesses
After the burst-mode access is established, IBREQ and
DBREQ are used during subsequent accesses to indicate that the processor requires at least one more access. If IBREQ or DBREQ is active at the end of the cycle
in which an access is successfully completed (i.e., when
IRDY or DRDY is active), the processor requires another
access. If the slave device or memory has not previously
preempted the burst-mode access, and does not
preempt (by deasserting IBACK or DBACK) or cancel
(by asserting IERR or DERR) the burst-mode access in
the cycle in which the access completes, the additional access must be performed.
The execution rate of instructions is known only dynamically, so that in certain situations, a burst-mode instruction access must be suspended. If IBREQ is inactive
during the cycle in which an instruction access is completed, the burst-mode access is suspended (if it is neither preempted nor canceled at the same time). The
burst-mode access remains suspended unless the
processor requests a new instruction access (in which
case IREQ is asserted), or unless the instruction memory preempts the burst-mode access.
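
The decision made at the end of a cycle in which the slave responds can be sketched as a single Python function. It is a simplified reading of the two paragraphs above, not a cycle-accurate model: the function name and the returned labels are assumptions, and the inputs stand for the sampled levels of IBREQ or DBREQ (breq), IBACK or DBACK (back), and IERR or DERR (err).

```python
def burst_condition_after_response(breq: bool, back: bool, err: bool) -> str:
    """Condition of an established burst-mode access after a slave response.

    breq: processor requests at least one more access (IBREQ/DBREQ active)
    back: slave still acknowledges burst mode (IBACK/DBACK active)
    err:  slave signals an error (IERR/DERR active)
    """
    if err:
        return "canceled"    # slave cancels by asserting IERR or DERR
    if not back:
        return "preempted"   # slave preempts by deasserting IBACK or DBACK
    if breq:
        return "active"      # another access must be performed
    return "suspended"       # established, but no access requested
```

The error check comes first because the burst acknowledge need not be deasserted in the cycle in which an error is signaled.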
A suspended burst-mode instruction access becomes
active whenever the processor can accept more instructions. The processor activates the burst-mode access
by asserting IBREQ. If the instruction memory does not
preempt the burst-mode access during this cycle, an instruction access must be performed.

Figure 66. Slave Burst-Mode Data Accesses: Control Flow

When a suspended burst-mode instruction access is activated, the resulting instruction access is not permitted
to be completed in the cycle in which IBREQ is asserted,
but may be completed in the next cycle. The reason for
this restriction is that the burst-mode protocol is defined
such that the combination of an active level on IBREQ
and IRDY causes an instruction access (as previously
discussed). If the instruction access is completed immediately in the cycle where a suspended burst-mode access is activated, there is an ambiguity in the protocol: it
is possible to interpret a single-cycle assertion of IBREQ
as a request for two instructions.
The above ambiguity is resolved by delaying the instruction access resulting from a reactivated burst-mode access for a cycle. Since this restriction applies only when
the Instruction Prefetch Buffer is full and the instruction
memory is capable of a very fast access, the delayed instruction response has no performance impact.
The Am29000 does not suspend burst-mode data accesses because the data transfers occur to and from
general-purpose registers, which are always available.
However, other channel masters may suspend burst-mode data accesses (during direct memory accesses,
for example). The principles for suspending burst-mode
data accesses are the same as those for instruction accesses discussed above.

Processor Preemption, Termination, and Cancellation
The processor may preempt, terminate or cancel a
burst-mode access by deasserting IBREQ or DBREQ
and asserting IREQ or DREQ at some later point. Normally, the processor receives one more instruction or
data word after IBREQ or DBREQ is deasserted. However, this access may be completed in the same cycle
that IBREQ or DBREQ is deasserted. During the period
after IBREQ or DBREQ is deasserted and before IREQ
or DREQ is asserted, the burst-mode access is in a suspended condition.
The slave device or memory cannot distinguish between preempted, terminated, and canceled burst-mode accesses, when these are caused by the processor, until the processor asserts IREQ or DREQ. If the
slave continues to assert IBACK or DBACK after IBREQ
or DBREQ is deasserted, the slave should be prepared
to accept any new request during the cycle in which
IREQ or DREQ is asserted to begin the new access. The
reason for this is that the processor may attempt to establish a burst-mode access for the new access: if the
slave is asserting IBACK or DBACK because of a previously preempted, terminated, or canceled burst-mode
access, the processor interprets the active IBACK or
DBACK as establishing the new burst-mode access and
removes the request in the following cycle.
The processor preempts a burst-mode access when an
external channel master arbitrates for the channel, or
when a burst-mode fetch crosses a potential virtual-page boundary. Since the minimum page size is 1 Kbyte,
burst-mode instruction and data accesses are preempted whenever the address sequence crosses a 1-Kbyte
address boundary. The burst is reestablished as soon
as a new address translation is performed (if required).
A new physical address is transmitted when the burst-mode access is reestablished.
Note that the preemption resulting from page boundaries is advantageous for devices or memories that
require counters to follow the burst-mode address
sequence. Since all burst-mode accesses are word
accesses and the processor retransmits an address at
every 1-Kbyte address boundary (that is, every 256 words), an 8-bit counter in the
slave device or memory is sufficient to follow the burst-mode address sequence. Additional address bits are
simply latched.
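
The counter arithmetic can be checked with a short Python sketch. The class BurstAddressFollower and its method names are hypothetical; the model assumes 4-byte words and a 1-Kbyte retransmission boundary, so the low-order word index fits in 8 bits and the remaining upper address bits are latched once per address transfer.

```python
PAGE_BYTES = 1024                                   # address retransmitted at each boundary
WORD_BYTES = 4                                      # burst-mode accesses are word accesses
WORDS_PER_PAGE = PAGE_BYTES // WORD_BYTES           # 256
COUNTER_BITS = (WORDS_PER_PAGE - 1).bit_length()    # 8


class BurstAddressFollower:
    """Tracks a burst-mode address sequence with a latch and an 8-bit counter."""

    def latch(self, address: int) -> None:
        """Capture a transmitted address: upper bits latched, word index counted."""
        self.upper = address & ~(PAGE_BYTES - 1)
        self.counter = (address % PAGE_BYTES) // WORD_BYTES

    def current(self) -> int:
        """Reassemble the address of the current word."""
        return self.upper | (self.counter * WORD_BYTES)

    def advance(self) -> bool:
        """Step to the next word; return True if the counter wrapped.

        A wrap corresponds to a boundary crossing, where the processor
        preempts the burst and transmits a new address.
        """
        self.counter = (self.counter + 1) % WORDS_PER_PAGE
        return self.counter == 0
```

Starting from address 0x12345FF8, one advance yields 0x12345FFC and the next advance wraps, signaling that a freshly transmitted address should be latched.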
The processor terminates a burst-mode access whenever all required instructions or data have been accessed. In the case of instruction accesses, the burst-mode access is terminated when a nonsequential fetch
occurs. In the case of data accesses, the burst-mode
access is terminated when the count indicates a single
load or store remains. The last load or store is executed
as a simple access.
The processor cancels a burst-mode access when an
interrupt or trap is taken. Note that a trap may be caused
by the burst-mode access, for example when a Translation Look-Aside Buffer miss occurs on an address in the
burst-mode sequence. If the processor cancels a burst-mode access when an access in the sequence remains
to be completed, this access must be completed in spite
of the cancellation.
Canceled burst-mode data accesses may be restarted
at some (possibly much later) point in execution via the
Channel Address, Channel Data, and Channel Control
registers. In this case, the burst-mode access is restarted at the point at which it was canceled, rather than
at the beginning of the original address sequence.
Slave Preemption and Cancellation
The slave device or memory involved in a burst-mode
access may preempt the access by deasserting IBACK
or DBACK. The processor samples IBACK and DBACK
when IRDY and DRDY are active so that IBACK and
DBACK may be deasserted as the last supported access is completed. However, IBACK and DBACK also
may be deasserted in any cycle before the access is
completed. If IBACK or DBACK is deasserted when the
processor is in a state where it expects an access, the
access must be completed.
In general, the slave device or memory preempts the
burst-mode access whenever it cannot support any further accesses in the burst-mode sequence. This normally occurs whenever an implementation-dependent
address boundary is encountered (e.g., a cache-block
boundary), but may occur for any reason. By preempting the burst-mode access, the slave receives a new request with the address of the next instruction or data
word required by the processor.
The slave device or memory may cancel a burst-mode
access by asserting IERR or DERR in response to a requested access. The signals IBACK or DBACK need not
be deasserted at this time, but should be deasserted in
the next cycle.
Note that the IERR and DERR signals cause non-maskable traps, except in the case where IERR is asserted for
an instruction that the processor does not execute.

Arbitration
External masters can gain access to the address, data,
and instruction buses by asserting the BREQ input. The
processor completes any pending accesses, preempts
any burst-mode access, and asserts the BGRT output.
At this time, the processor places all channel outputs associated with the address, data, and instruction buses in
the high-impedance state.
For the first cycle in which BGRT is asserted, the output
BINV is also asserted. If the external master cannot control the address bus and associated controls in the cycle
where BGRT is asserted, the active level on BINV may
be used to define an idle cycle for the channel (i.e., any
spurious access requests are ignored). The BINV signal
is asserted only for a single cycle, so the external master
must take control of the channel in the cycle after BGRT
is asserted.
While the BREQ input remains asserted, the processor
continues to assert BGRT. The external master has control over the channel during this time.
To release the channel to the processor, the external
master deasserts BREQ, but must continue to control
the channel for the first cycle in which BREQ is
deasserted. In the cycle after BREQ is deasserted, the
processor asserts BINV and deasserts BGRT; the external master should release control of the channel at this
time. On the following cycle, the processor deasserts
BINV and is able to use the channel. The processor
reestablishes any burst-mode access preempted by
arbitration.
The processor does not relinquish the channel when the
LOCK signal is active. This prevents external masters
from interfering with exclusive accesses.
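
The grant and release handshake can be traced with a small cycle-by-cycle Python sketch. The function arbitration_trace is hypothetical and makes simplifying assumptions that are not stated in the text: no access is pending, LOCK is inactive, and BGRT follows BREQ with a latency of exactly one cycle. Under these assumptions BINV is active in the first cycle of the grant and in the first cycle after the grant is withdrawn.

```python
def arbitration_trace(breq):
    """Return (bgrt, binv) lists for a per-cycle list of BREQ levels.

    Assumes no pending accesses, LOCK inactive, and a one-cycle
    BREQ-to-BGRT latency (an assumption made for this sketch).
    """
    bgrt, binv = [], []
    prev_breq = False
    prev_bgrt = False
    for level in breq:
        grant = prev_breq                # BGRT follows BREQ one cycle later
        bgrt.append(grant)
        binv.append(grant != prev_bgrt)  # BINV marks each change of ownership
        prev_breq, prev_bgrt = level, grant
    return bgrt, binv
```

For a request held during cycles 1 through 3 of a seven-cycle trace, the grant is active in cycles 2 through 4 and BINV pulses in cycles 2 and 5; the processor can use the channel again in cycle 6.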


Use of BINV to Cancel an Access
Besides using the BINV signal to transfer control of the
channel from one master to another, the Am29000 uses
the BINV signal to cancel accesses after they have been
initiated. To cancel an access, BINV is asserted during a
cycle in which IREQ or DREQ also is asserted. If an access is canceled, the accompanying response (using
IRDY, IERR, DRDY or DERR) is ignored during the cycle
where BINV is asserted; thereafter, the system should
not respond to the canceled access.
The BINV signal is used to cancel an instruction access
in the following situations:
• when an interrupt or trap is taken
• when an instruction fetch-ahead is canceled
because a target block is only partially present
in the Branch Target Cache
• when an instruction TLB miss or protection
violation occurs on an instruction access
• when a branch instruction is the delay instruction of another branch, and the targets of both
branches are in the Branch Target Cache (in
this case, the external fetch for the target of
the first branch is not required)
• when the processor enters the Load Test Instruction Mode, and there is an active instruction request on the channel
The BINV signal is used to cancel a data access in the
following situations:
• when a data TLB miss or protection violation
occurs on the data access
• when an interrupt or trap is taken in the cycle
where a pipelined data access becomes a primary access
If, for data accesses, address translation is not performed and pipelined accesses are not implemented,
the BINV signal can be ignored by the system during the
access.

When a LOADSET instruction encounters a protection
violation because store access is not permitted, the
processor cancels the load access with BINV.
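
From the slave's point of view, the cancellation rule reduces to qualifying each request with BINV. The Python sketch below is illustrative only: the function sample_request, the class CancelAwareSlave, and the returned labels are assumptions, and req stands for the sampled level of IREQ or DREQ.

```python
def sample_request(req: bool, binv: bool) -> str:
    """Interpret one cycle of the request and BINV signals."""
    if not req:
        return "none"
    # A request accompanied by BINV is canceled and must not be serviced.
    return "canceled" if binv else "request"


class CancelAwareSlave:
    """Toy slave that stops responding once an access is canceled."""

    def __init__(self):
        self.in_progress = False

    def clock(self, req: bool, binv: bool) -> bool:
        """Advance one cycle; return True while an access may be answered."""
        state = sample_request(req, binv)
        if state == "canceled":
            self.in_progress = False
        elif state == "request":
            self.in_progress = True
        return self.in_progress
```

A slave built this way accepts a request in one cycle, drops it if BINV accompanies the request in a later cycle, and then stays silent until a new request arrives.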

Bus Sharing-Electrical Considerations
When buses are shared among multiple masters and
slaves, it is important to avoid situations where these devices are driving a bus at the same time. This may occur
when more than one master or slave is allowed to drive a
bus in the same cycle if bus arbitration is incompletely or
incorrectly performed. However, it also occurs when a
master or slave releases a bus in the same cycle that another master or slave gains control, and the first master
or slave is slow in disabling its bus drivers, compared to
the point at which the second master or slave begins to
drive the bus. The latter situation is called a bus collision
in the following discussion.
In addition to the logical errors that can occur when multiple devices drive a bus simultaneously, such situations
may cause bus drivers to carry large amounts of electrical current. This can have a significant impact on driver
reliability and power dissipation. Since bus collisions
usually occur for a small amount of time, they are of less
concern, but may contribute to high-frequency electromagnetic emissions.
The Am29000 channel is defined to prevent all situations where multiple drivers are driving a bus simultaneously. However, bus collisions may be allowed to occur, depending on the system design.
In the case of the Am29000 channel, arbitration for the
channel prevents the processor from driving the address and data buses at the same time as another channel master. If there is more than one external master,
the system design must include some means for ensuring that only one external master gains control of the
channel, and that no external master gains control of the
channel at the same time as the processor.
When the processor relinquishes control of the channel
to an external master, bus collisions may be prevented
by not allowing the external master to drive any bus
while BINVisactive. This ensures that all processor outputs are disabled by the time the external master takes
control of the channel. However, there is nothing in the
channel protocol to prevent the external master from
taking control as soon as BGRT is asserted.
Slave devices and memories are prevented from simultaneously driving the instruction bus or data bus by
allowing only the device or memory performing a primary access to drive the appropriate bus. When a
pipe lined access becomes a primary access, it may
drive the instruction or data bus immediately, so there is
a potential bus collision if the pipe lined access is
performed by a slave other than the slave performing
the original primary access. This bus collision may be
prevented by restricting all slaves to driving the instruction and data buses in the second half-cycle (using
SYSCLK, for example). Since the processor samples
data only at the end of a cycle, this restriction does not
affect perfonnance.
When the processor performs a store immediately following a load, it drives the data bus for the store in the
second cycle following the cycle in which the data forthe
load appears on the data bus. This provides a complete
cycle for the slave involved in the load to disable its data
drivers. The processor continues to drive the data bus
until it receives a DRDY or DERR in response to the
store; it ceases to drives the data bus in the cycle following the response.
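
The effect of the second-half-cycle restriction on bus collisions can be illustrated with a rough timing calculation. The sketch below is an assumption-laden model, not data-book material: the function names are hypothetical, both delays are measured from the same clock edge, and the example delay values are invented for illustration.

```python
def collision_overlap_ns(disable_delay_ns: float, enable_delay_ns: float) -> float:
    """Time during which both the releasing and the acquiring driver are active.

    Both delays are measured from the clock edge that starts the cycle in
    which ownership of the bus changes (an assumption of this model).
    """
    return max(0.0, disable_delay_ns - enable_delay_ns)


def overlap_with_second_half_rule(period_ns: float,
                                  disable_delay_ns: float,
                                  enable_delay_ns: float) -> float:
    """Same overlap when the acquiring slave may drive only in the second half-cycle."""
    enable_point_ns = period_ns / 2.0 + enable_delay_ns
    return max(0.0, disable_delay_ns - enable_point_ns)


# Hypothetical example: a 40 ns cycle, a slave that needs 14 ns to release
# the bus, and a second slave that can start driving after 3 ns.
unrestricted = collision_overlap_ns(14.0, 3.0)
restricted = overlap_with_second_half_rule(40.0, 14.0, 3.0)
```

In this model the restriction removes the overlap whenever the releasing device turns off within half a cycle, which is why the text notes that the rule costs no performance: the processor samples data only at the end of the cycle.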

Channel Behavior for Interrupts and Traps

If an interrupt or trap is taken, any burst-mode accesses are canceled. If a request for a pipelined access is on the address bus, this request is removed. Any other accesses are completed and no new accesses are started, other than those required for the interrupt or trap. Note that any accesses that the processor expects to complete must be completed, even though burst-mode and pipelined accesses are canceled.

When interrupt or trap processing is complete, any canceled burst-mode access transactions are reestablished using the address of the access that was to be performed next when the interrupt or trap was taken. Uncompleted pipelined accesses are restarted, either by the interrupt return sequence in the case of an instruction access, or by restarting the initiating instruction in the case of a data access.

Note that the restarting of a pipelined access is not performed by the Channel Address, Channel Data, and Channel Control registers, since these registers may be required to restart the primary access. The instruction initiating the pipelined access is not allowed to be completed until the primary access is completed, so that the Program Counter 1 (PC1) register contains the address of the initiating instruction when a pipelined access is canceled. The address in PC1 can restart this instruction on interrupt return.

Effect of the LOCK Output

The LOCK output provides synchronization and exclusion of accesses in a multiprocessor environment. LOCK has no predefined effect for a system, other than the fact that the Am29000 does not grant the channel to an external master while LOCK is active.

The LOCK output is asserted for the address cycle of the Load-and-Lock and Store-and-Lock instructions, and is asserted for both the read and write accesses of a Load and Set instruction. LOCK may also be active for an extended period of time under control of the Lock bit in the Current Processor Status Register (this capability is available only to Supervisor-mode programs).

LOCK may be defined to provide any level of resource locking for a particular system. For example, it may lock the channel, an individual device or memory, or a location within a device or memory.

When a resource is locked, it is available for access only by the processor with the appropriate access privilege. The mechanisms for restricting accesses and the methods for reporting attempted violations of the restrictions are system-dependent.

Initialization and Reset

When power is first applied to the processor, it is in an unknown state and must be placed in a known state. Also, under certain circumstances, it may be necessary to place the processor in a defined state. This is accomplished by the Reset mode, which is invoked by activating the RESET pin for the required duration. The Reset mode configures the processor state as follows:

1. Instruction execution is suspended.
2. Instruction fetching is suspended.
3. Any interrupt or trap conditions are ignored.
4. The Current Processor Status Register is set as shown in Figure 67.
5. The Cache Disable bit of the Configuration Register is set.
6. The Data Width Enable bit of the Configuration Register is reset.
7. The Contents Valid bit of the Channel Control Register is reset.

[Figure: bit layout of the Current Processor Status Register. Fields shown, from most-significant to least-significant: Reserved, CA, IP, TE, TP, TU, FZ, LK, RE, WM, PD, PI, SM, IM, DI, DA.]

Figure 67. Current Processor Status Register in Reset Mode

Except as previously noted, the contents of all general-purpose registers, special-purpose registers, and TLB registers are undefined. The contents of the Branch Target Cache are also undefined.

The Reset mode also configures the processor to initiate an instruction fetch using an address of 0. Since the ROM Enable (RE) bit of the Current Processor Status is 1, this fetch is directed to external instruction read-only memory. This fetch occurs when the Reset mode is exited (i.e., when the RESET input is deasserted).

The Reset mode is invoked by asserting the RESET input and can be entered only if the SYSCLK pin is operating normally, whether or not the SYSCLK pin is being driven by the processor. The Reset mode is entered within four processor cycles after RESET is asserted. The RESET input must be asserted for at least four processor cycles to accomplish a processor reset.
The Reset mode can be entered from any other processor mode (e.g., the Reset mode can be entered from the Halt mode). If the RESET input is asserted at the time that power is first applied to the processor, the processor enters the Reset mode only after four cycles have occurred on the SYSCLK pin.
The Reset mode is exited when the RESET input is deasserted. Either three or four cycles after RESET is deasserted (depending on internal synchronization time), the processor performs an initial instruction access on the channel. The initial instruction access is directed to Address 0 in the instruction read-only memory (instruction ROM). If instruction ROM is not implemented in a particular system, another device or memory must respond to this instruction fetch.
If the CNTL1-CNTL0 inputs are 10 or 01 when RESET is deasserted, the processor enters the Halt or Step mode, respectively. If the processor enters the Halt mode immediately after reset, the protection checking that normally applies to the Halt instruction is disabled so that the Halt instruction can be used as an instruction breakpoint in a User-mode program. The Load Test Instruction mode cannot be directly entered from the Reset mode. If the CNTL1-CNTL0 inputs are 00 immediately after RESET is deasserted, the effect on processor operation is unpredictable. If the CNTL1-CNTL0 inputs are 11, the processor enters the Executing mode.
The processor samples the STAT0 output internally when RESET is asserted. A High level on STAT0 in this case is used to enable a special test configuration and causes the processor to be inoperable. When RESET is asserted, the processor drives STAT0 Low in order to disable this test configuration. However, if processor outputs are disabled by the Test mode, the processor is not able to drive STAT0. Thus, if RESET is asserted when the processor is in the Test mode, the STAT0 pin must be driven Low externally. (In a master/slave configuration, STAT0 is driven Low by the master processor when RESET is asserted.)
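
The reset-exit behavior described above reduces to a two-bit decode plus a minimum pulse-width calculation. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the inputs are modeled as plain integers, and the four-cycle minimum is taken from the text.

```python
def mode_after_reset(cntl1: int, cntl0: int) -> str:
    """Decode CNTL1-CNTL0 as sampled when RESET is deasserted."""
    modes = {
        (1, 0): "Halt",
        (0, 1): "Step",
        (1, 1): "Executing",
    }
    # 00 is documented as unpredictable; Load Test Instruction mode
    # cannot be entered directly from the Reset mode.
    return modes.get((cntl1, cntl0), "unpredictable")


def min_reset_assertion_ns(sysclk_period_ns: float, cycles: int = 4) -> float:
    """Minimum RESET assertion time: at least four processor cycles."""
    return cycles * sysclk_period_ns
```

For a 25 MHz part with a 40 ns SYSCLK period, `min_reset_assertion_ns(40.0)` gives 160 ns; a board-level reset circuit would normally hold RESET far longer than this minimum.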

ABSOLUTE MAXIMUM RATINGS

Storage Temperature
Voltage on any Pin with Respect to GND     -0.5 to Vcc +0.5 V

Stresses above those listed under ABSOLUTE MAXIMUM RATINGS may cause permanent device failure. Functionality at or above these limits is not implied. Exposure to absolute maximum ratings for extended periods may affect device reliability.

OPERATING RANGES

Commercial (C) Devices
  Case Temperature (Tc)                    0 to +85°C
  Supply Voltage (Vcc)                     +4.75 to +5.25 V

Military Devices
  Case Temperature (Tc)*                   -55 to +125°C
  Supply Voltage (Vcc)                     +4.5 to +5.5 V

Operating ranges define those limits between which the functionality of the device is guaranteed.
*measured "instant on"

DC CHARACTERISTICS over COMMERCIAL and MILITARY operating ranges
Parameter
Symbol

Parameter
Description

Test Conditions

-0.5
2.0
-0.5
2.0
-0.5
Vee-O.8

VIL
VIH
VILINCLK
VIHINCLK
VILSYSCLK
VIHSYSCLK
Va.
VOH
lu
!Lo
Iccop

Min.

Output low Voltage for
All Outputs except SYSClK
Output High Voltage for
All Outputs except SYSClK

IOL=3.2 rnA
.... ~.ll\r;h.

Max.
0.8
Vee +0.5
0.8
Vee +0.5
0.8
Vee +0.5

V
V
V
V
V
V

0.45

V

2.4

V
±10
±10

Output leakage Current
Operating Power-Supply
Current

22 for
Commercial
25 for
Military

VOLC
VOHC

Unit

O.S
Vee-O.S

~

mNMHz

V
V

losGNO

100

rnA

100

rnA

losvcc

Circuit Current

CAPACITANCE

Parameter   Parameter
Symbol      Description                Test Conditions      Min.   Max.   Unit
CIN         Input Capacitance          fC = 1 MHz (Note 1)         15     pF
CINCLK      INCLK Input Capacitance    fC = 1 MHz (Note 1)         20     pF
CSYSCLK     SYSCLK Capacitance         fC = 1 MHz (Note 1)         90     pF
COUT        Output Capacitance         fC = 1 MHz (Note 1)         20     pF
CI/O        I/O Pin Capacitance        fC = 1 MHz (Note 1)         20     pF

Note: 1. Not 100% tested.

SWITCHING CHARACTERISTICS over COMMERCIAL operating range
No.
1
1A
2
3
4
5
6
6A
7
8

SA
9
9A

98
10
11
12
12A
12B
13
14
15
16
17
18

19
20

1-102

Parameter
Description
System Clock (SYSCLK)
Period (T)
SYSCLK at 1.5V to SYSCD<
at 1.5V when used as an output
SYSCLK High Time when used as input
SYSCLK Low Time when used as input
SYSCLK Rise Time
SYSCLK Fall Time
Synchonous SYSCLK Output
Valid Delay
Synchronous SYSCLK Output
Valid Delay for 031-00
Three-State Synchronous SYSCLK
Output Invalid Delay
Synchronous ~
Output Valid Delay
Three·State SYSCIJ<
Synchronous Output Invalid Delay
Synchronous Input Setup Time
Synchronous Input Setup Time
for Ds,-Oo' 13,-10
Synchronous Input Setup Time
forlmDY
Synchronous Input Hold Time
Asynchronous Input Minimum
Pulse Width
INCLI< Period
INCLI< to SYSCLK Delay
';'~,~
INCLI< to SYSCO< Delay
INCLI< Low Time
INCLI< High Time
INCLK Rise Time
INCLI< Fall Time
INCLI< to Deassertion of ~
(for phase synchronization of SYSCLK)
WARN Asynchronous Deassertion
Hold Minimum Pulse Width
BiNV Synchronous Output Valid
Delay from SYSCIJ(
Three-State synchronous SYSCLK
output invalid delay for 0 31-00

25 MHz
Min.
Max.

UnH

Note 1

40

1000

ns

Note 13
Note 13
Note 13
Note 2
Note 2

0.5T-1
19
17

0.5T +1

5
5

ns
ns
ns
ns
ns

Notes 3. 12

3

14

ns

Note 12
Notes 4,
14.15

4

18

ns

3

30

ns

3

14

ns

3
12

30

ns
ns

Test
CondHlons

33 MHz
Min.
Max.

Notes 5. 1.2.
Notes
:~
1~"
.i\,
~
""'"

30

ns

15

15

ns

8

8

ns

9A

Synchronous Input Setup Time
for 0 31-00 , 131-10

98

Synchronous Input Setup Time
for DRDY

16

16

ns

10

Synchronous Input Hold~

Note 6

2

2

ns

11

Asynchronous Input Minimu
Pulse Width
INCLK Period

Note 8

T +10
25

500

T +10
30

500

ns

2

12

2

15

ns

12

2
12

15

ns

INCLK Low Time

2
10

14

INCLK High Time

10

15

INCLK Rise Time

5

5

ns

16

INCLK Fall Time

5

5

ns

17

INCLK to Deassertion of RESET
(for phase synchronization of SYSCLK)

5

ns

18

WARN Asynchronous Oeassertion

19

BiNV Synchronous Output Valid

20

Three-State synchronous SYSCLK
output invalid delay for 0 31 -00

12
12A

INCLK to SYSCLK Delay

128
13

INCLK to SYSCLK Delay

.

v

v
",%li~.,.

Hold Minimum Pulse Width
Delay from SYSCLK

ns

ns

12

5

0

ns

Note 9

0

Note 10

4T

Note 12
Notes 11,
14,15

1

8

1

9

ns

3

25

3

25

ns

4T

ns

1·103

29K Family CMOS Devices

SWITCHING CHARACTERISTICS over MILITARY operating range
No.
1

Parameter

Test

Description

Conditions

System Clock (SYSCLK)
Period (T)

20 MHz

16 MHz

Min.

Max.

Min.

Max.

Unit

Note 1

50

1000

60

1000

ns

SYSCLK at 1.5V to SYSCI:R'
at 1.5V when used as an output

Note 13

0.5T -1

0.5T +1

0.5T-2

0.5T +2

ns

2

SYSCLK High Time when used as input

Note 13

22

27

ns

3
4

SYSCLK Low Time when used as input
SYSCLK Rise Time

Note 13

19

22

5

SYSCLK Fall Time

6

Synchonous SYSCLK Output
Valid Delay

1A

6A
7
8
8A

ns

3

16

ns

4

20

ns

30

3

30

ns

16

3

16

ns

30

3
15

30

15
8

8'

16

16

ns

2

2

ns

Notes 5, 12{&f~
Notes
14,
Na~7.,~\"\",,> :>I,i,>

Three-State SYSCD<
Synchronous Output Invalid Delay
Synchronous Input Setup Time
for 0 31-00 , 131-10
Synchronous Input Setup Time
forl5RDY

12A
128

5

3

Jt~!' !:~"~5)
''\~J'

I~:::~t~~ i'>,

3

ns
ns
ns

~'i

1~~iP'

Asynchronous Input Minimum
Pulse Width

Note 8

INCLK Period


--+~

\::;Y

Synchronous Inputs
1.5 V

Relative to SYSCLK

1-106

.-

Am29000

SWITCHING WAVEFORMS

INCLK

Jr---1.5-4~81-----~
V

1~4~----------~~r-----------'~1
Asynchronous
Inputs

1.5 V

1.5 V

INCLK and Asynchronous Inputs

1-107

29K FamIlY'CMOS Devices

SWITCHING WAVEFORMS

~----~3r-----~
~------~2r-------~

SYSCLK Definition

1.5 V

SYSCLK

INCLK

1.--------;12r-------~

INCLK to SYSCLK Delay

1·108

Capacitive Output Delays
For loads greater than 80 pF
This table describes the additional output delays for capacitive loads greater than 80 pF. Values in the Maximum Additional Delay column should be added to the value listed in the Switching Characteristics table. For loads less than or equal to 80 pF, refer to the delays listed in the Switching Characteristics table.

                                                           Total External   Maximum
No.   Parameter Description                                Capacitance      Additional Delay
6     Synchronous SYSCLK Output Valid Delay                100 pF           +1 ns
                                                           150 pF           +2 ns
                                                           200 pF           +4 ns
                                                           250 pF           +6 ns
                                                           300 pF           +8 ns
6A    Synchronous SYSCLK Output Valid Delay for D31-D0     100 pF           +1 ns
                                                           150 pF           +6 ns
                                                           200 pF           +10 ns
                                                           250 pF           +15 ns
                                                           300 pF           +19 ns
8     Synchronous Output Valid Delay                       100 pF           +1 ns
                                                           150 pF           +2 ns
                                                           200 pF           +4 ns
                                                           250 pF           +6 ns
                                                           300 pF           +8 ns
19    BINV Synchronous Output Valid Delay from SYSCLK      100 pF           +1 ns
                                                           150 pF           +3 ns
                                                           200 pF           +4 ns
                                                           250 pF           +6 ns
                                                           300 pF           +7 ns
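
The derating rule in the Capacitive Output Delays table can be applied mechanically. The sketch below is an illustration only: it assumes the delay groups in the table belong to parameters 6, 6A, 8, and 19 in that order, and it rounds a load up to the next listed capacitance, which is a conservative choice made here rather than a rule stated in the data book. The function and table names are hypothetical.

```python
# Maximum additional delay in ns, keyed by parameter number and then by
# total external capacitance in pF (values as read from the table above).
ADDITIONAL_DELAY_NS = {
    "6":  {100: 1, 150: 2, 200: 4, 250: 6, 300: 8},
    "6A": {100: 1, 150: 6, 200: 10, 250: 15, 300: 19},
    "8":  {100: 1, 150: 2, 200: 4, 250: 6, 300: 8},
    "19": {100: 1, 150: 3, 200: 4, 250: 6, 300: 7},
}


def additional_delay_ns(parameter: str, load_pf: float) -> int:
    """Extra delay to add to the Switching Characteristics value."""
    if load_pf <= 80:
        return 0  # loads up to 80 pF are covered by the base specification
    table = ADDITIONAL_DELAY_NS[parameter]
    for capacitance in sorted(table):
        if load_pf <= capacitance:
            return table[capacitance]  # round up to the next listed load
    raise ValueError("load exceeds the 300 pF range covered by the table")
```

For example, a 120 pF load on the data bus falls between the 100 pF and 150 pF rows for parameter 6A, so this sketch adds the 150 pF figure to the base output valid delay.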

SWITCHING TEST CIRCUIT

[Figure: switching test circuit for an Am29000 pin under test; IOL = 3.2 mA.]

CL is guaranteed to 80 pF. For capacitive loading greater
than 80 pF, refer to the Capacitive Output Delay table.

Am29000 Thermal Characteristics
Pln-Grld-Array Package

Thermal Resistance - °ClWatt

700
(3.58)

900
(4.61)

2

2

2

13

11

10

Parameter

OJC Junction-to-Case
SCA Case-to-Ambient (no Heat~i,~l~
SCA Case-to-Ambient (w

''0

Heatsink, Thermall'7

6

3

2

2

2

6

3

2

2

2

700
(3.58)

900
(4.61)

OCA Case-to-Ambient (witttLnidiredional Pin Fin
Heatsink, Wakefield 840-20)

10

Ceramlc-Quad-Flat-Pack Package

,

I

II· . .

I

SJC

I

OCA

r

SJA

IC001040

Thermal Resistance - °ClWatt

Alrflow-ft./mln. (m/sec)
Parameter

Ox Junction-ta-Case
SCA Case-to-Ambient
Note: This is for reference only.

1-110

0
(0)

150
(0.76)

300
(1.53)

480

(2.45)

Am29027
Arithmetic Accelerator

DISTINCTIVE CHARACTERISTICS
• High-speed floating-point accelerator for the Am29000™ processor
• Comprehensive floating-point and integer instruction sets, including addition, subtraction, and multiplication
• Single-, double-, and mixed-precision operations
• Performs conversions between precisions and between data formats
• Complies with seven industry-standard floating-point formats:
  - IEEE Standard for Binary Floating-Point Arithmetic (ANSI/IEEE std 754-1985), single- and double-precision
  - DEC™ F, DEC D, and DEC G Standards
  - IBM System/370 single- and double-precision
• Exact IEEE compliance for denormalized numbers with no speed penalty
• Simple interface requires no glue logic between Am29000 and Am29027™
• Eight-deep register file for intermediate results and on-chip 64-bit data path facilitate compound operations, for example, Newton-Raphson division, sum-of-products, and transcendentals
• Supports pipelined or flow-through operation
• Full compiler and assembler support for IEEE format
• Fabricated with Advanced Micro Devices' 1.2-micron CMOS process

SIMPLIFIED SYSTEM DIAGRAM

[Figure: simplified system diagram; the only recoverable label is a 32-bit Data connection.]

Publication # 09114   Rev. C   Amendment 10   Issue Date: October 1989

TABLE OF CONTENTS
DISTINCTIVE CHARACTERISTICS
SIMPLIFIED SYSTEM DIAGRAM
GENERAL DESCRIPTION
CONNECTION DIAGRAMS
PIN DESIGNATIONS
LOGIC SYMBOL
ORDERING INFORMATION
PIN DESCRIPTION
FUNCTIONAL DESCRIPTION
  Overview
    Architecture
    Instruction Set
    Performance
    Interface
    Master/Slave
    Support
  Block Diagram Description
    Input Registers
    Operand Selection Multiplexers
    Instruction Register
    ALU
    Output Register/Register File
    Flag Register
    Status Register
    Output Multiplexer
    Mode Register
    Control Unit
    Master/Slave Comparator
  System Interface
  Special-Purpose Registers
    Mode Register
    Status Register
    Flag Register
    Precision Register
    Instruction Register, I-Temp Register
  Operand Registers
  Accelerator Transaction Requests
    Write Transaction Requests
    Read Transaction Requests
    Coprocessor Data Accept
    Data Ready
    Data Error
  Accelerator Instruction Set
    Instruction Word
    Base Operation Code
    Sign-Change Selects
    Operand Precision Selects
    Operand Source Selects
    Register File Controls
    Accelerator Operations
    Base Operation Code Description
    Primary and Alternate Floating-Point Formats
    Operation Precision
    Operation Flags
    Updating the Status Register
  Operation Sequencing
    Operation in Flow-Through Mode
    Operation in Pipeline Mode
    Pipeline Advance
    Performing Operations
  Master/Slave Operation
  Initialization and Reset
  Applications
ABSOLUTE MAXIMUM RATINGS
OPERATING RANGES
DC CHARACTERISTICS
CAPACITANCE
SWITCHING CHARACTERISTICS
SWITCHING WAVEFORMS
SWITCHING TEST CIRCUIT
TEST PHILOSOPHY AND METHODS
APPENDIX A-DATA FORMATS
APPENDIX B-ROUNDING MODES
APPENDIX C-ADDITIONAL OPERATION DETAILS
APPENDIX D-TRANSACTION REQUEST/OPERATION TIMING

GENERAL DESCRIPTION
The Am29027 Arithmetic Accelerator is a high-performance computational unit intended for use with the Am29000 Streamlined Instruction Processor. When added to an Am29000-based system, the Am29027 improves floating-point performance by an order of magnitude or more.
The Am29027 implements an extensive floating-point and integer instruction set, and can perform operations on single-, double-, or mixed-precision operands. The three most widely used floating-point formats (IEEE, DEC, and IBM) are supported. IEEE operations fully comply with the IEEE Standard for Binary Floating-Point Arithmetic (ANSI/IEEE standard 754-1985), with direct implementation of special features such as gradual underflow and exception handling.
The Am29027 consists of a 64-bit ALU, a 64-bit data path, and a control unit. The ALU has three data input ports, and can perform operations requiring one, two, or three input operands. The data path comprises two 64-bit input operand registers, an 8-by-64-bit register file for storage of intermediate results, three operand selection multiplexers that provide for orthogonal selection of input operands, and an output multiplexer that allows access to the result data, the operation status, the flags, or the accelerator state. The control unit interprets transaction requests from the Am29000, and sequences the ALU and data path.
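
The Distinctive Characteristics list names Newton-Raphson division as a compound operation that benefits from the register file for intermediate results. The sketch below shows the general technique in Python, using only multiplication and subtraction inside the loop. It is not AMD's operation sequence: the function name, the seed formula, and the iteration count are assumptions chosen for illustration.

```python
import math


def newton_raphson_divide(a: float, b: float, iterations: int = 5) -> float:
    """Approximate a / b by refining an estimate of 1 / b."""
    if b == 0.0:
        raise ZeroDivisionError("division by zero")
    sign = -1.0 if b < 0.0 else 1.0
    # Scale the divisor so that abs(b) = m * 2**e with 0.5 <= m < 1.
    m, e = math.frexp(abs(b))
    # Linear seed for 1/m on [0.5, 1); a common textbook choice.
    x = 48.0 / 17.0 - (32.0 / 17.0) * m
    for _ in range(iterations):
        # Each step roughly doubles the number of correct bits.
        # The intermediate x is what a register file would hold.
        x = x * (2.0 - m * x)
    # Undo the scaling: 1/abs(b) = (1/m) * 2**(-e).
    return sign * a * math.ldexp(x, -e)
```

Each refinement step needs one multiply, one subtract, and another multiply, with the running estimate kept between steps; this is the kind of short dependent sequence that on-chip intermediate storage is meant to serve.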
Operations can be performed in either of two modes: flow-through or pipeline. In flow-through mode, the ALU is completely combinatorial; this mode is best suited to scalar operations. Pipeline mode divides the ALU into two or three pipelined stages for use in vector operations, such as those found in graphics or signal processing.
The Am29027 connects directly to Am29000 system buses and requires no additional interface circuitry.
Fabricated with AMD's 1.2-micron CMOS technology, the Am29027 is housed in two packages: a 169-lead pin-grid-array (PGA) package, and a 164-lead ceramic-quad-flat-pack (CQFP) package for military applications.

Related AMD Products

Part No.    Description
Am29000     Streamlined Instruction Processor

29K™ Family Development Support Products
Contact your local AMD representative for information on the complete set of development support tools.
Software development products on several hosts:
• Optimizing compilers for common high-level languages
• Assembler and utility packages
• Source- and assembly-level software debuggers
• Target-resident development monitors
• Simulators
Hardware Development:
• ADAPT29K™ Advanced Development and Prototyping Tool

CONNECTION DIAGRAMS
169-Lead PGA*
Bottom View

[Figure: 17-by-17 pin grid, columns lettered A B C D E F G H J K L M N P R T U and rows numbered 1 through 17, with the alignment pin marked **.]

* Pinout observed from pin side of package.
** Alignment pin (not connected internally).

CONNECTION DIAGRAMS (continued)
164-Lead CQFP
Top View
(Lid Facing Viewer)

[Figure: quad flat pack outline with corner pins numbered 1, 41, 42, 82, 83, 123, 124, and 164.]

PGA PIN DESIGNATIONS (sorted by Pin No.)

Pin No.  Pin Name     Pin No.  Pin Name     Pin No.  Pin Name     Pin No.  Pin Name
A-1      S31          C-10     F20          J-16     I16          R-12     DREQT0
A-2      F4           C-11     Vcco         J-17     I18          R-13     RESET
A-3      F6           C-12     GNDO         K-1      S9           R-14     DREQ
A-4      F8           C-13     F29          K-2      S10          R-15     I29
A-5      F10          C-14     GNDO         K-3      GND          R-16     I27
A-6      F12          C-15     Vcco         K-15     I21          R-17     I24
A-7      F14          C-16     I2           K-16     I20          T-1      R28
A-8      F16          C-17     I6           K-17     I19          T-2      R23
A-9      F18          D-1      S24          L-1      S8           T-3      R21
A-10     F21          D-2      S25          L-2      S7           T-4      R18
A-11     F22          D-3      S29          L-3      S6           T-5      R16
A-12     F24          D-4      (see note)   L-15     GNDO         T-6      R13
A-13     F27          D-15     I0           L-16     I23          T-7      R10
A-14     F28          D-16     I3           L-17     I22          T-8      R7
A-15     F31          D-17     I8           M-1      S5           T-9      R5
A-16     SLAVE        E-1      S21          M-2      S4           T-10     R3
A-17     I1           E-2      S23          M-3      S2           T-11     R0
B-1      S30          E-3      S26          M-15     Vcco         T-12     OPT1
B-2      F1           E-15     I4           M-16     DRDY         T-13     DREQT1
B-3      F3           E-16     I7           M-17     CDA          T-14     BINV
B-4      F5           E-17     I9           N-1      S3           T-15     I31
B-5      F7           F-1      S18          N-2      S1           T-16     I28
B-6      F9           F-2      S20          N-3      R30          T-17     I25
B-7      F13          F-3      S22          N-15     NC           U-1      R25
B-8      F15          F-15     Vcc          N-16     EXCP         U-2      R22
B-9      F17          F-16     I10          N-17     DERR         U-3      R19
B-10     F19          F-17     I12          P-1      S0           U-4      R17
B-11     F23          G-1      S15          P-2      R29          U-5      R15
B-12     F25          G-2      S17          P-3      R26          U-6      R14
B-13     F26          G-3      S19          P-15     I26          U-7      R11
B-14     F30          G-15     GND          P-16     NC           U-8      R9
B-15     GND          G-16     I11          P-17     NC           U-9      R6
B-16     MSERR        G-17     I14          R-1      R31          U-10     R4
B-17     I5           H-1      S13          R-2      R27          U-11     R2
C-1      S27          H-2      S14          R-3      R24          U-12     R1
C-2      S28          H-3      S16          R-4      R20          U-13     OPT0
C-3      F0           H-15     GND          R-5      Vcc          U-14     OPT2
C-4      F2           H-16     I13          R-6      GND          U-15     R/W
C-5      Vcco         H-17     I15          R-7      R12          U-16     OE
C-6      GNDO         J-1      S11          R-8      R8           U-17     I30
C-7      F11          J-2      S12          R-9      GND
C-8      GNDO         J-3      Vcc          R-10     Vcc
C-9      Vcco         J-15     I17          R-11     CLK

Note: Pin Number D-4 = Alignment Pin.
Vcco and GNDO are power and ground pins for the output buffers.
Vcc and GND are power and ground pins for the rest of the logic.

PGA PIN DESIGNATIONS (sorted by Pin Name)

Pin No.  Pin Name     Pin No.  Pin Name     Pin No.  Pin Name     Pin No.  Pin Name
T-14     BINV         G-15     GND          B-16     MSERR        P-1      S0
M-17     CDA          H-15     GND          N-15     NC           N-2      S1
R-11     CLK          K-3      GND          P-16     NC           M-3      S2
N-17     DERR         R-6      GND          P-17     NC           N-1      S3
M-16     DRDY         R-9      GND          U-16     OE           M-2      S4
R-14     DREQ         C-6      GNDO         U-13     OPT0         M-1      S5
R-12     DREQT0       C-8      GNDO         T-12     OPT1         L-3      S6
T-13     DREQT1       C-12     GNDO         U-14     OPT2         L-2      S7
N-16     EXCP         C-14     GNDO         T-11     R0           L-1      S8
C-3      F0           L-15     GNDO         U-12     R1           K-1      S9
B-2      F1           D-15     I0           U-11     R2           K-2      S10
C-4      F2           A-17     I1           T-10     R3           J-1      S11
B-3      F3           C-16     I2           U-10     R4           J-2      S12
A-2      F4           D-16     I3           T-9      R5           H-1      S13
B-4      F5           E-15     I4           U-9      R6           H-2      S14
A-3      F6           B-17     I5           T-8      R7           G-1      S15
B-5      F7           C-17     I6           R-8      R8           H-3      S16
A-4      F8           E-16     I7           U-8      R9           G-2      S17
B-6      F9           D-17     I8           T-7      R10          F-1      S18
A-5      F10          E-17     I9           U-7      R11          G-3      S19
C-7      F11          F-16     I10          R-7      R12          F-2      S20
A-6      F12          G-16     I11          T-6      R13          E-1      S21
B-7      F13          F-17     I12          U-6      R14          F-3      S22
A-7      F14          H-16     I13          U-5      R15          E-2      S23
B-8      F15          G-17     I14          T-5      R16          D-1      S24
A-8      F16          H-17     I15          U-4      R17          D-2      S25
B-9      F17          J-16     I16          T-4      R18          E-3      S26
A-9      F18          J-15     I17          U-3      R19          C-1      S27
B-10     F19          J-17     I18          R-4      R20          C-2      S28
C-10     F20          K-17     I19          T-3      R21          D-3      S29
A-10     F21          K-16     I20          U-2      R22          B-1      S30
A-11     F22          K-15     I21          T-2      R23          A-1      S31
B-11     F23          L-17     I22          R-3      R24          A-16     SLAVE
A-12     F24          L-16     I23          U-1      R25          F-15     Vcc
B-12     F25          R-17     I24          P-3      R26          J-3      Vcc
B-13     F26          T-17     I25          R-2      R27          R-5      Vcc
A-13     F27          P-15     I26          T-1      R28          R-10     Vcc
A-14     F28          R-16     I27          P-2      R29          C-5      Vcco
C-13     F29          T-16     I28          N-3      R30          C-9      Vcco
B-14     F30          R-15     I29          R-1      R31          C-11     Vcco
A-15     F31          U-17     I30          R-13     RESET        C-15     Vcco
B-15     GND          T-15     I31          U-15     R/W          M-15     Vcco

Note: Pin Number D-4 = Alignment Pin.
Vcco and GNDO are power and ground pins for the output buffers.
Vcc and GND are power and ground pins for the rest of the logic.

CQFP PIN DESIGNATIONS (sorted by Pin No.)

Pin No.  Pin Name     Pin No.  Pin Name     Pin No.  Pin Name     Pin No.  Pin Name
1        F0           42       Vcc          83       I29          124      R25
2        F1           43       GND          84       I28          125      R26
3        F2           44       I0           85       I31          126      R27
4        F3           45       I1           86       DREQ         127      R28
5        F4           46       I2           87       OE           128      R29
6        Vcco         47       I3           88       BINV         129      R30
7        GNDO         48       I4           89       RESET        130      R31
8        F5           49       I5           90       R/W          131      S0
9        F6           50       I6           91       DREQT1       132      S1
10       F7           51       I7           92       DREQT0       133      S2
11       F8           52       I8           93       OPT2         134      S3
12       F9           53       I9           94       OPT1         135      S4
13       F10          54       I10          95       OPT0         136      S5
14       F11          55       I11          96       CLK          137      S6
15       F12          56       I12          97       R0           138      S7
16       F13          57       I13          98       R1           139      S8
17       F14          58       GND          99       R2           140      S9
18       F15          59       I14          100      R3           141      S10
19       GNDO         60       I15          101      R4           142      S11
20       Vcco         61       I16          102      Vcc          143      GND
21       F16          62       I17          103      GND          144      Vcc
22       F17          63       I18          104      R5           145      S12
23       F18          64       I19          105      R6           146      S13
24       F19          65       I20          106      R7           147      S14
25       F20          66       I21          107      R8           148      S15
26       F21          67       I22          108      R9           149      S16
27       F22          68       I23          109      R10          150      S17
28       F23          69       CDA          110      R11          151      S18
29       F24          70       DRDY         111      R12          152      S19
30       F25          71       DERR         112      R13          153      S20
31       F26          72       GNDO         113      R14          154      S21
32       Vcco         73       Vcco         114      R15          155      S22
33       GNDO         74       EXCP         115      R16          156      S23
34       F27          75       NC           116      R17          157      S24
35       F28          76       NC           117      R18          158      S25
36       F29          77       NC           118      R19          159      S26
37       F30          78       I24          119      R20          160      S27
38       F31          79       I25          120      R21          161      S28
39       GND          80       I26          121      R22          162      S29
40       SLAVE        81       I27          122      R23          163      S30
41       MSERR        82       I30          123      R24          164      S31

29K Family CMOS Devices
CQFP PIN DESIGNATIONS (sorted by Pin Name)
Pin No.  Pin Name    Pin No.  Pin Name    Pin No.  Pin Name    Pin No.  Pin Name
88       BINV        39       GND         41       MSERR       130      R31
69       CDA         43       GND         75       NC          40       SLAVE
96       CLK         58       GND         76       NC          131      S0
71       DERR        103      GND         77       NC          132      S1
86       DREQ        143      GND         87       OE          133      S2
92       DREQT0      7        GNDO        95       OPT0        134      S3
91       DREQT1      19       GNDO        94       OPT1        135      S4
70       DRDY        33       GNDO        93       OPT2        136      S5
74       EXCP        72       GNDO        89       RESET       137      S6
1        F0          44       I0          90       R/W         138      S7
2        F1          45       I1          97       R0          139      S8
3        F2          46       I2          98       R1          140      S9
4        F3          47       I3          99       R2          141      S10
5        F4          48       I4          100      R3          142      S11
8        F5          49       I5          101      R4          145      S12
9        F6          50       I6          104      R5          146      S13
10       F7          51       I7          105      R6          147      S14
11       F8          52       I8          106      R7          148      S15
12       F9          53       I9          107      R8          149      S16
13       F10         54       I10         108      R9          150      S17
14       F11         55       I11         109      R10         151      S18
15       F12         56       I12         110      R11         152      S19
16       F13         57       I13         111      R12         153      S20
17       F14         59       I14         112      R13         154      S21
18       F15         60       I15         113      R14         155      S22
21       F16         61       I16         114      R15         156      S23
22       F17         62       I17         115      R16         157      S24
23       F18         63       I18         116      R17         158      S25
24       F19         64       I19         117      R18         159      S26
25       F20         65       I20         118      R19         160      S27
26       F21         66       I21         119      R20         161      S28
27       F22         67       I22         120      R21         162      S29
28       F23         68       I23         121      R22         163      S30
29       F24         78       I24         122      R23         164      S31
30       F25         79       I25         123      R24         42       VCC
31       F26         80       I26         124      R25         102      VCC
34       F27         81       I27         125      R26         144      VCC
35       F28         84       I28         126      R27         6        VCCO
36       F29         83       I29         127      R28         20       VCCO
37       F30         82       I30         128      R29         32       VCCO
38       F31         85       I31         129      R30         73       VCCO

LOGIC SYMBOL

[Logic symbol, drawing 09114B-002C. Transaction request inputs: R/W, DREQ, DREQT1-DREQT0, OPT2-OPT0, BINV. Transaction status outputs: CDA, DRDY, DERR. Buses: R31-R0, S31-S0, I31-I0, F31-F0. Other signals shown: RESET, OE, MSERR, EXCP.]


ORDERING INFORMATION

Standard Products
AMD standard products are available in several packages and operating ranges. The ordering number
(Valid Combination) is formed by a combination of:
a. Device Number
b. Speed Option (if applicable)
c. Package Type
d. Temperature Range
e. Optional Processing

Example ordering number: AM29027 -25 G C B

a. DEVICE NUMBER/DESCRIPTION
   Am29027
   Arithmetic Accelerator
b. SPEED OPTION
   -25 = 25 MHz
   -20 = 20 MHz
   -16 = 16 MHz
c. PACKAGE TYPE
   G = 169-Lead Pin Grid Array without Heatsink (CGX169)
d. TEMPERATURE RANGE
   C = Commercial (0 to +85°C)
e. OPTIONAL PROCESSING
   Blank = Standard Processing
   B = Burn-in

Valid Combinations
AM29027-25    GC, GCB
AM29027-20    GC, GCB
AM29027-16    GC, GCB

Valid Combinations list configurations planned to
be supported in volume for this device. Consult
the local AMD sales office to confirm availability of
specific valid combinations, to check on newly
released combinations, and to obtain additional
data on AMD's standard military grade products.


MILITARY ORDERING INFORMATION
APL Products
AMD products for Aerospace and Defense applications are available in several packages and operating ranges. APL
(Approved Products List) products are fully compliant with MIL-STD-883C requirements. The order number (Valid Combination) is formed by a combination of:
a. Device Number
b. Speed Option (if applicable)
c. Device Class
d. Package Type
e. Lead Finish

Example order number: AM29027 -20 /B Z C

a. DEVICE NUMBER/DESCRIPTION
   Am29027
   Arithmetic Accelerator
b. SPEED OPTION
   -20 = 20 MHz
   -16 = 16 MHz
c. DEVICE CLASS
   /B = Class B
d. PACKAGE TYPE
   Z = 169-Lead Pin Grid Array without Heatsink (CGX169)
   Y = 164-Lead Ceramic Quad Flat Pack without Heatsink
e. LEAD FINISH
   C = Gold

Valid Combinations
AM29027-20    /BZC, /BYC
AM29027-16    /BZC, /BYC

Valid Combinations list configurations planned to
be supported in volume for this device. Consult
the local AMD sales office to confirm availability of
specific valid combinations or to check on newly
released valid combinations.

Group A Tests
Group A tests consist of Subgroups
1, 2, 3, 7, 8, 9, 10, 11.


PIN DESCRIPTION
BINV
Bus Invalid (Synchronous Input)
A logic Low indicates that the Am29000 address bus
and related control signals are invalid. The Am29027
will ignore signal DREQT1 when BINV is Low.

CDA
Coprocessor Data Accept (Three-State Output)
A logic Low indicates that the Am29027 is ready to accept data from the Am29000. This signal is normally
driven by the Am29027, and assumes a high-impedance state only if input signal OE is High or input signal
SLAVE is Low.

CLK
Clock (Input)

DERR
Data Error (Three-State Output)
A logic Low indicates that an unmasked exception occurred during or preceding the current transaction request. This signal is normally driven by the Am29027,
and assumes a high-impedance state only if input signal
OE is High or input signal SLAVE is Low.

DRDY
Data Ready (Three-State Output)
A logic Low indicates that data is available on Port F.
This signal is normally driven by the Am29027, and assumes a high-impedance state only if input signal OE is
High or input signal SLAVE is Low.

DREQ
Data Request (Synchronous Input)
A logic Low indicates that the Am29000 is making a data
access. The Am29027 will ignore signal DREQT1 when
DREQ is High.

DREQT0
Start Instruction/Suppress Errors
(Synchronous Input)
This signal, when accompanied by a valid write operand
R, write operand S, write operands R, S, or write instruction transaction request, commands the Am29027 to
begin a new operation. When accompanying a valid
read result LSBs, read result MSBs, read flags, or read
status transaction request, DREQT0 suppresses the reporting of operation errors. DREQT0 also modifies the
action of the write status transaction request to retime
an operation in flow-through mode, or to invalidate the
ALU pipeline in pipeline mode.

DREQT1
Accelerator Transaction Request
(Synchronous Input)
A logic High indicates that the Am29000 is making an
accelerator transaction request. This signal is considered valid only when signal BINV is High and signal
DREQ is Low.
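
The qualification rules in the BINV, DREQ, and DREQT1 descriptions, and the three-state condition repeated for CDA, DERR, and DRDY, can be written out as simple predicates. The following is a minimal sketch for illustration only; the function names are hypothetical, and pin levels are modeled as 0 (logic Low) and 1 (logic High).

```python
def accelerator_request_valid(binv: int, dreq: int, dreqt1: int) -> bool:
    """True when the sampled pin levels indicate an accelerator transaction request.

    DREQT1 is considered valid only when BINV is High (bus valid) and
    DREQ is Low (the Am29000 is making a data access).
    """
    dreqt1_is_valid = (binv == 1) and (dreq == 0)
    return dreqt1_is_valid and dreqt1 == 1


def status_outputs_driven(oe: int, slave: int) -> bool:
    """True when CDA, DERR, and DRDY are driven rather than high-impedance.

    These outputs assume a high-impedance state only if OE is High
    or SLAVE is Low.
    """
    return not (oe == 1 or slave == 0)
```

For example, `accelerator_request_valid(binv=1, dreq=0, dreqt1=1)` holds, while the same DREQT1 level with BINV Low is ignored.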

EXCP
Exception (Three-State Output)
Indicates that the status register contains one or more
unmasked exception bits. This signal can be used as
an interrupt or trap signal by the Am29000. EXCP is
normally driven by the Am29027, and assumes a high-impedance state only if input signal OE is High or input
signal SLAVE is Low.

F31-F0
F Output Bus (Three-State Output)

I31-I0
Instruction Bus (Synchronous Input)
Used to specify the operation to be performed by the
accelerator.

MSERR
Master/Slave Error (Output)
Reports the result of the comparison of processor outputs with the signals provided internally to the off-chip
drivers. If there is a difference for any enabled driver,
MSERR assumes the logic High state.

OE
Output Enable (Asynchronous Input)
A logic High forces all accelerator outputs except
MSERR to assume a high-impedance state unconditionally; master/slave comparison circuitry is also disabled. This signal is provided for test purposes.

OPT2-OPT0
Transaction Type (Synchronous Input)
These signals, in conjunction with R/W, specify the type
of accelerator transaction, if any, currently being requested by the Am29000.

R31-R0
R Data Bus (Synchronous Input)

RESET
Reset (Asynchronous Input)
Resets the Am29027. When RESET is a logic Low, the
state of internal sequencing circuitry is initialized, and
the status register is cleared. RESET must be connected
to the signal line used to reset the Am29000.

R/W
Read/Write (Synchronous Input)
Determines the direction of a transaction. When R/W is
High, data is transferred from the Am29027 to the
Am29000; when R/W is Low, data is transferred from the
Am29000 to the Am29027.

S31-S0
S Data Bus (Synchronous Input)

SLAVE
Master/Slave Mode Select
(Synchronous Input)
A logic Low selects Slave mode; in this mode all outputs
except MSERR assume a high-impedance state. A logic
High selects Master mode.

FUNCTIONAL DESCRIPTION
Overview
The Am29027 is a high-performance, single-chip arithmetic accelerator for the Am29000 Streamlined Instruction Processor.
Architecture
The Am29027 comprises a high-speed ALU, a 64-bit
data path, and control circuitry.
The core of the Am29027 is a 64-bit floating-point/integer ALU. The ALU takes operands from three 64-bit
input ports and performs the selected operation, placing
the result on a 64-bit output port. Seven ALU flags report
operation status. The ALU is completely combinatorial
for minimum latency; optional pipelining is available to
boost throughput for vector operations.
The data path consists of two 32-bit input buses, R and
S; two 64-bit input registers; two 64-bit temporary input
registers; a 64-bit result register; an 8-word-by-64-bit
register file for storage of intermediate results; three operand selection multiplexers that provide for orthogonal
selection of input operands; an output multiplexer that
selects data, operation flags, operation status, or other
accelerator state; and a 32-bit output bus, F. Input operands enter the floating-point accelerator through the R
and S buses, and are then demultiplexed and buffered
for subsequent storage in the input registers. The operand selection multiplexers route the operands to the
ALU; operation results and status leave the device on
Output Bus F. Operation results also can be stored in
the register file for use in subsequent operations.
On-board control circuitry sequences the ALU and data
path during operations, and manages the transfer of
data between the accelerator and the Am29000. A
32-bit instruction register and a 32-bit temporary instruction register hold the instruction words for current
and pending operations.
Instruction Set
The Am29027 implements 57 arithmetic and logical instructions. Thirty-five instructions operate on floating-point numbers; these instructions fall into the following
categories:
• addition/subtraction
• multiplication
• multiplication-accumulation
• comparison
• selecting the maximum or minimum of two numbers
• rounding to integral value
• absolute value, negation, pass
• reciprocal seed generation
• conversion between any of the supported
  floating-point formats, including conversions
  between precisions
• conversion of a floating-point number to an integer
  format, with an optional scale factor
By concatenating these operations, the user can also
perform division, square-root extraction, polynomial
evaluation, and other functions not implemented
directly.
Twenty-two instructions operate on integers, and belong to the following general categories:
• addition/subtraction
• multiplication
• comparison
• selecting the maximum or minimum of two numbers
• absolute value, negation, pass
• logical operations, e.g., AND, OR, XOR, NOT
• arithmetic, logical, and funnel shifts
• conversion between single- and double-precision
  integer formats
• conversion of an integer number to a floating-point
  format, with an optional scale factor
• pass operand

One special instruction is provided to move data.
Performance
The Am29027 provides operation speeds several times
greater than conventional floating-point processors
by virtue of its extensive use of combinatorial, rather
than sequential, logic. Most floating-point operations,
whether single, double, or mixed precision, can be
performed in as few as six system clock cycles. Performance is further enhanced by the presence of the
on-board register file that can be used to hold intermediate results, thus reducing the amount of time needed to
transfer operands between the Am29027 and the
Am29000. The input operand registers and the instruction register are double-buffered, so that a new operation can be specified while the current operation is being
completed.
Interface
The Am29027 connects directly to the Am29000 system
buses. Am29027 operations are specified by a series of
operand and instruction transactions issued by the
Am29000. Eight input signals specify the transaction to
be performed; three output signals report transaction
status.
Master/Slave
The Am29027 contains special comparison hardware to
allow the operation of two accelerators in parallel, with
one accelerator (the slave) checking the results produced by the other (the master). This feature is of
particular importance in the design of high-reliability
systems.
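
The comparison reported on MSERR (a difference between any enabled output driver and the level the accelerator provides to it internally) can be modeled in a few lines. This is an illustrative sketch only; the function name and the list-based representation of signal levels are assumptions, not part of the device documentation.

```python
def mserr(internal_levels, pin_levels, driver_enabled):
    """Return 1 (logic High) if any enabled driver's pin differs from its internal level.

    internal_levels: levels the accelerator provides internally to each driver
    pin_levels:      levels actually observed on the corresponding pins
    driver_enabled:  True for each driver taking part in the comparison
    """
    mismatch = any(
        enabled and wanted != seen
        for wanted, seen, enabled in zip(internal_levels, pin_levels, driver_enabled)
    )
    return 1 if mismatch else 0
```

In a master/slave pair, the slave's pins carry the master's outputs, so the same comparison flags any disagreement between the two devices.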
Support
The Am29027 IEEE format is fully supported by those
hardware and software tools available for the Am29000,
including:
• HighC29K Cross-Development Toolkit
• ASM29K Cross-Development Toolkit
• ADAPT29K, a general-purpose hardware development system. The ADAPT29K permits single-step
  operation, break-point insertion, and other standard
  debugging techniques.

Block Diagram Description
A block diagram of the Am29027 is shown in Figure 1.
The Am29027 comprises the input registers, the operand selection multiplexers, the instruction register, the
ALU, the output register/register file, the flag register,
the status register, the output multiplexer, the mode register, the control unit, and the master/slave comparator.


Figure 1. Am29027 Block Diagram (09114-003C)


Input Registers
Operands are loaded into the accelerator via the 32-bit
R and S buses, and are demultiplexed and buffered for
subsequent storage in 64-bit registers R and S; input operands may be either single-precision (32-bit) or double-precision (64-bit). Two single-precision or one double-precision operand may be written to the input registers
in a single system clock cycle. Accompanying the input
registers are two 64-bit temporary registers, R-Temp
and S-Temp, that permit the overlapping of operand
transfers and ALU operations.
Operand Selection Multiplexers
The operand selection multiplexers route operands
to the ALU. These multiplexers, as well as selecting
operands from input registers R and S and register file
locations RF7-RF0, also have access to a set of floating-point and integer constants. These constants are
double-precision preprogrammed numbers for use in
ALU operations, and are automatically provided in the
appropriate format.
Instruction Register
The instruction register stores a 32-bit word specifying
the current accelerator operation. Included in the instruction word are fields that specify the core operation
to be performed by the ALU, operand format (integer or
floating-point), sign-change selects for ALU input and
result operands, operand precisions, operand sources,
and register file controls. The instruction register is
preceded by the 32-bit temporary register, I-Temp, permitting the overlapping of instruction transfers and ALU
operations. Instructions enter the accelerator via 32-bit
Instruction Bus I.
ALU
The ALU is a combinatorial arithmetic/logic unit that
performs a large repertoire of floating-point and integer
operations. The ALU has three operand inputs. Some
operations require a single input operand, for example,
conversion operations. Others, such as addition or multiplication, require two input operands. The multiplication-accumulation and funnel shift operations require
three input operands. Most ALU operations allow the
user to modify operand signs, thus greatly increasing
the number of arithmetic expressions that can be evaluated in a single ALU pass.
The ALU can be configured in either flow-through mode,
for which the ALU is completely combinatorial, or pipeline mode, for which ALU operations are divided into one
or two pipeline stages.
Output Register/Register File
Operation results are stored in 64-bit output register F;
results can also be stored in the 8-by-64-bit register
file for use in subsequent operations. A precision register, part of the register file, contains bits indicating the
precisions of the operands stored in each register file
location, thus permitting the ALU to correctly process
these operands in later operations.
Flag Register
The 32-bit flag register stores flags pertaining to the
most recently performed operation. The flags indicate
error conditions, such as underflow or overflow, and
also report results for operations that produce result
flags, such as comparisons.
Status Register
The 32-bit status register contains information regarding the status of past, current, and pending operations.
Six exception bits report operation error conditions.
These exception bits are individually latched; once a
given bit is set, it remains set until reset by the Am29000
or by system reset. The exception bits indicate error
conditions of overflow, underflow, zero result, reserved
operand, invalid operation, and inexact result. At the user's option, the presence of an exception can be used to
report a data error to the Am29000, or to halt Am29027
operation; exception bits can be individually enabled or
disabled by programming the corresponding mask bit in
the mode register.
Exception bit activity is summarized by a seventh bit,
Exception Status, which indicates that one or more unmasked status bits are set. If desired, the state of this bit
can be placed on signal EXCP, which can be used to interrupt the Am29000.
The status register contains four additional bits (R-Temp Valid, S-Temp Valid, I-Temp Valid, and Operation Pending) that pertain to the state of pending operands and operations.
Output Multiplexer
The output multiplexer routes operation results and the accelerator's internal state to the Am29000 through the
32-bit F bus. This multiplexer can select Register F, the
flag register, status register, instruction register, mode
register, or precision register.
Mode Register
The 64-bit mode register contains accelerator control
parameters that change infrequently or not at all, such
as floating-point format, round mode, and operation
timing information. These parameters are initialized by
the Am29000 during system start-up, and are modified
as required during operation.
Control Unit
The control unit manages the transfer of data between
the Am29000 and the Am29027, as well as the timing of
operation execution. The Am29000 oversees operation
of the Am29027 by issuing one of thirteen commands, or
transaction requests, to the control unit via eight signal
lines. Each transaction request specifies an action on
the part of the Am29027, such as writing an operand to
an input register or returning a result to the Am29000.
The control unit interprets the transaction request and
sequences the Am29027 to produce the desired action.
Three transaction status lines are generated by the control unit to indicate transaction completion, or to indicate
the existence of an accelerator error condition.
Master/Slave Comparator
Each Am29027 output signal has associated logic that
compares that signal with the signal that the accelerator
provides internally to the output driver; any discrepancies are indicated by assertion of signal MSERR.

For a single accelerator, this output comparison detects
short circuits in output signals or defective output drivers, but does not detect open circuits. It is possible to
connect a second accelerator in parallel with the first,
with the second accelerator's outputs disabled by assertion of signal SLAVE. The second accelerator detects
open-circuit signals, and provides a check of the outputs
of the first accelerator.

System Interface

Am29000/Am29027 signal interconnects are depicted
in Figure 2.

Three Am29027 buses (R31-R0, I31-I0, and F31-F0) are
connected to Am29000 Data Bus D31-D0; the remaining
Am29027 bus, S31-S0, is connected to Am29000 Address Bus A31-A0. Through these connections, the
Am29000 can transfer to the Am29027 a 32-bit instruction, two 32-bit operands, or a 64-bit operand in a single
cycle, or can receive a 32-bit result from the Am29027 in
a single cycle.

Twelve additional signals govern communication between the Am29000 and Am29027. Eight Am29000 output signals (R/W, DREQ, DREQT1, DREQT0, OPT2-OPT0, and BINV) are connected to the corresponding
Am29027 signals and are used to issue transaction
requests to the Am29027. Three Am29027 signals (CDA, DRDY, and DERR) report transaction
status. CDA is directly connected to the corresponding
input of the Am29000, while DRDY and DERR must be
ORed with like signals from other resources. A fourth
Am29027 signal, EXCP, may be connected to an
Am29000 trap or interrupt input to signal the presence of
Am29027 operation exceptions at the user's option.

The Am29027 takes its clock input from the Am29000
SYSCLK system clock output.

The signal used to reset the Am29000 must also be
connected to the Am29027 RESET input.

Figure 2. Am29000/Am29027 Hardware Interface (09114-004C)


Special-Purpose Registers
The Am29027 contains six special-purpose registers:
the mode register, status register, flag register, precision register, instruction register, and I-Temp register.
Mode Register
The 64-bit mode register stores 24 infrequently changed
parameters pertaining to accelerator operation; its format is shown in Figure 3. The Am29000 modifies the accelerator parameter set by issuing a write mode register
transaction request.
The mode register should be initialized after hardware
reset, and may be written with new parameters when a
new mode of accelerator operation is required; mode
changes take effect immediately. The Am29027 does
not alter the contents of the mode register in the course
of operation.
Bits 63-47-Reserved for future use. This field must
be set to 0 to assure future compatibility.
Bit 46-EXCP Enable (EX): When EX is High, reporting of unmasked exceptions via signal EXCP is enabled.
When EX is Low, signal EXCP is forced inactive (logic
High).
Bit 45-Halt On Error Enable (HE): When HE is High,
the Am29027 will halt operation in the presence of an
unmasked exception.
Bit 44-Advance DRDY (AD): When AD is High, signal
DRDY is advanced one cycle in flow-through mode. This
bit has no effect in pipeline mode.
Bits 43-40-Timer Count for the MOVE P Operation
(MVTC): In flow-through mode, MVTC specifies the
number of clock cycles needed for data to traverse the
ALU for base operation code MOVE P; in pipeline mode,
it has no effect. This field can assume values between 3
and 15, inclusive.
Bits 39-36-Timer Count for the Multiply-Accumulate Operation (MATC): In flow-through mode,
MATC specifies the number of clock cycles needed for
data to traverse the ALU for base operation code
F' = (P' x Q') + T'; in pipeline mode, it has no effect. This
field can assume values between 3 and 15, inclusive.
Bits 35-32-Pipeline Timer Count (PLTC): In flow-through mode, PLTC specifies the number of clock cycles needed for data to traverse the ALU for any base
operation code except F' = (P' x Q') + T' or MOVE P; in
pipeline mode, it specifies the number of cycles needed
for data to traverse a single pipeline stage for any base
operation code. This field can assume values between 3
and 15, inclusive, in flow-through mode, and between 2
and 15, inclusive, in pipeline mode.
Bits 31-28-Reserved for future use. This field must
be set to 0 to assure future compatibility.
Bit 27-Zero Result Exception Mask (ZMSK): When
ZMSK is High, the status register zero result exception
bit is masked and will not contribute to the detection of
an error condition.
Bit 26-Inexact Result Exception Mask (XMSK):
When XMSK is High, the status register inexact result
exception bit is masked and will not contribute to the detection of an error condition.
Bit 25-Underflow Exception Mask (UMSK): When
UMSK is High, the status register underflow exception
bit is masked and will not contribute to the detection of
an error condition.
Bit 24-Overflow Exception Mask (VMSK): When
VMSK is High, the status register overflow exception bit
is masked and will not contribute to the detection of an
error condition.
Bit 23-Reserved Operand Exception Mask (RMSK):
When RMSK is High, the status register reserved operand exception bit is masked and will not contribute to the
detection of an error condition.
Bit 22-Invalid Operation Exception Mask (IMSK):
When IMSK is High, the status register invalid operation
exception bit is masked and will not contribute to the
detection of an error condition.
Bit 21-Reserved for future use. This bit must be set
to 0 to assure future compatibility.
Bit 20-Pipeline Mode Select (PL): When PL is High,
pipeline mode is selected; when PL is Low, flow-through
(unpipelined) mode is selected.
Bits 19-17-Reserved for future use. This field must
be set to 0 to assure future compatibility.
Bits 16-14-Round Mode Select (RMS): Selects one
of six rounding modes as follows:

RMS    Round Mode
000    Round to nearest (IEEE)
001    Round to minus infinity
010    Round to plus infinity
011    Round to zero
100    Round to nearest (DEC)
101    Round away from zero
11X    Illegal value

Additional information on round modes can be found in
Appendix B.
Bits 13-12-Integer Multiplication Format Adjust
(MF): Selects the output format for integer multiplication. The user may select either the MSBs or the LSBs of
an integer multiplication result, with optional format
adjust. MF is encoded as follows:

MF    Output Format
00    LSBs
01    LSBs, format-adjusted
10    MSBs
11    MSBs, format-adjusted

"Format-adjusted" indicates that the product is shifted
left one place before the MSBs or LSBs are selected.
Bit 11-Integer Multiplication Signed/Unsigned
Select (MS): If MS is High, input operands for integer
multiplication operations are treated as two's complement numbers. If MS is Low, the input operands are
treated as unsigned numbers.
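
The effect of the MF and MS fields on an integer product can be illustrated with a short model. This is a sketch under stated assumptions, not a description of the device's internal arithmetic: it assumes n-bit operands producing a 2n-bit product (n = 32 by default), reads MF as 00 = LSBs, 01 = LSBs format-adjusted, 10 = MSBs, 11 = MSBs format-adjusted, and the function name `integer_multiply` is hypothetical.

```python
def integer_multiply(p, q, mf, signed, bits=32):
    """Model MF/MS selection on the product of two `bits`-wide operands.

    mf bit 0 set -> product shifted left one place before selection
    mf bit 1 set -> the MSB half is selected, otherwise the LSB half
    signed True  -> operands treated as two's complement (MS High)
    """
    half_mask = (1 << bits) - 1
    full_mask = (1 << (2 * bits)) - 1

    def as_signed(value):
        # Reinterpret an n-bit pattern as a two's complement number.
        value &= half_mask
        return value - (1 << bits) if value & (1 << (bits - 1)) else value

    if signed:
        p, q = as_signed(p), as_signed(q)
    else:
        p, q = p & half_mask, q & half_mask

    product = (p * q) & full_mask           # 2n-bit product pattern
    if mf & 0b01:                           # format adjust
        product = (product << 1) & full_mask
    if mf & 0b10:                           # MSBs
        return product >> bits
    return product & half_mask              # LSBs
```

With these assumptions, 0xFFFFFFFF times 2 yields LSBs of 0xFFFFFFFE in both modes, but MSBs of 0xFFFFFFFF when signed (the product is -2) and 0x00000001 when unsigned.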
Bit 10-Reserved for future use. This bit must be set
to 0 to assure future compatibility.
Bit 9-IBM Underflow Mask Enable (BU): If BU is
High, certain underflowed IBM operations will produce a
normalized result with a biased exponent increased by
128. If BU is Low, these operations will produce a final
result of true zero. BU affects only those operations that
produce a result in IBM format and that use the following
base operation codes:

F' = P' + T'
F' = P' x Q'
Compare P, T
F' = (P' x Q') + T'
Convert T to Alternate F.P. Format
Convert T from Alternate F.P. Format
Scale T to Floating-point by Q

Bit 8-IBM Significance Mask Enable (BS): If BS is
High, certain IBM operations having intermediate results of 0 will produce a final result of 0 with the
biased exponent unchanged. If BS is Low, these operations will produce a final result of true zero. BS affects
only those operations that produce a result in IBM
format and that use the F' =P' + a' and COM PAR E P, T
base operation codes.
Bit 7-IEEE Sudden Underflow Enable (SU): If SU is
High, all IEEE denormalized results are replaced by a 0
of the same sign; if SU is Low, the appropriate denormalized number will be produced. If IEEE traps are enabled (mode register bit TRP High), sudden underflow is
disabled.
Bit 6-IEEE Trap Enable (TRP): If TRP is High, IEEE
trapped operation is enabled; the Saturate Enable
(SAT) and Sudden Underflow (SU) bits are ignored. For
an underflowed result, the biased exponent is increased
by 192 (single precision) or 1536 (double precision),
with the significand unchanged. For an overflowed result, the biased exponent is decreased by a like amount
with the significand unchanged. If TRP is Low, IEEE
trapped operation is disabled. This bit affects only those
operations that produce a result in IEEE floating-point
format.
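
The exponent adjustment applied in IEEE trapped operation can be shown with a small worked example. This is a sketch for illustration; the function and table names are hypothetical, and the exponent is treated as a plain integer that may lie outside the range of the format's exponent field before adjustment.

```python
# Adjustment applied to the biased exponent in IEEE trapped operation,
# as given in the TRP description: 192 (single) or 1536 (double).
TRAP_EXPONENT_ADJUST = {"single": 192, "double": 1536}


def trapped_biased_exponent(biased_exponent, precision, underflow):
    """Return the adjusted biased exponent; the significand is left unchanged.

    underflow True  -> exponent increased by the adjustment
    underflow False -> overflow case, exponent decreased by the same amount
    """
    adjust = TRAP_EXPONENT_ADJUST[precision]
    if underflow:
        return biased_exponent + adjust
    return biased_exponent - adjust
```

For instance, a single-precision result whose biased exponent would have been -10 is delivered with a biased exponent of 182, and a double-precision overflow at 2050 is delivered at 514.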
Bit 5-IEEE Affine/Projective Select (AP): If AP is
High, IEEE addition or subtraction operations having
infinite input operands are performed in affine mode; if
AP is Low, these operations are performed in projective
mode. In affine mode, it is permissible to add infinities of
like sign or subtract infinities of opposite sign, producing
an infinite result with the appropriate sign. In projective
mode these operations will produce an invalid operation
exception. This bit affects only those operations that
produce a result in IEEE floating-point format.
Bit 4-Saturate Enable (SAT): If SAT is High, overflowed results are replaced by the largest representable
value in the selected format of the same sign as the
overflowed result; if SAT is Low, the result produced depends on the overflow conventions for the selected
floating-point format. If IEEE traps are enabled (mode
register bit TRP High), saturation is disabled for any
operation that produces a result in IEEE floating-point
format.
Bits 1-0 Primary Floating-Point Format (PFF),
Bits 3-2 Alternate Floating-Point Format (AFF): The
primary format is used as the source and destination format for all floating-point operations except conversions,
and as the source or destination format for operations
that convert between floating-point and integer formats.
The alternate format is used as a source or destination
format in operations that convert one floating-point
format to another. Both the PFF and AFF fields are encoded as follows:

High Bit    Low Bit    Format
0           0          IEEE
0           1          DEC F (Single), DEC D (Double)
1           0          DEC F (Single), DEC G (Double)
1           1          IBM

Floating-point formats are discussed in further detail in
Appendix A.


Figure 3. Mode Register (09114-005C)
Field layout, high to low: bits 63-47 reserved; 46 EX; 45 HE; 44 AD; 43-40 MVTC; 39-36 MATC; 35-32 PLTC; 31-28 reserved; 27 ZMSK; 26 XMSK; 25 UMSK; 24 VMSK; 23 RMSK; 22 IMSK; 21 reserved; 20 PL; 19-17 reserved; 16-14 RMS; 13-12 MF; 11 MS; 10 reserved; 9 BU; 8 BS; 7 SU; 6 TRP; 5 AP; 4 SAT; 3-2 AFF; 1-0 PFF.
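
The mode register field positions given above lend themselves to a small pack/unpack helper. The following is a minimal sketch, assuming only the bit positions and value ranges stated in this section; the names `MODE_FIELDS`, `pack_mode`, `unpack_mode`, and `timer_count_problems` are hypothetical, and the sketch says nothing about how the 64-bit word is presented on the R and S buses during a write mode register transaction.

```python
# Field name -> (least significant bit, width in bits), transcribed from
# the mode register bit descriptions. Reserved bits are left at 0.
MODE_FIELDS = {
    "EX": (46, 1), "HE": (45, 1), "AD": (44, 1),
    "MVTC": (40, 4), "MATC": (36, 4), "PLTC": (32, 4),
    "ZMSK": (27, 1), "XMSK": (26, 1), "UMSK": (25, 1),
    "VMSK": (24, 1), "RMSK": (23, 1), "IMSK": (22, 1),
    "PL": (20, 1), "RMS": (14, 3), "MF": (12, 2), "MS": (11, 1),
    "BU": (9, 1), "BS": (8, 1), "SU": (7, 1), "TRP": (6, 1),
    "AP": (5, 1), "SAT": (4, 1), "AFF": (2, 2), "PFF": (0, 2),
}


def pack_mode(**fields):
    """Build a 64-bit mode word from named field values."""
    word = 0
    for name, value in fields.items():
        lsb, width = MODE_FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bit(s)")
        word |= value << lsb
    return word


def unpack_mode(word):
    """Split a 64-bit mode word into a dict of field values."""
    return {name: (word >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in MODE_FIELDS.items()}


def timer_count_problems(word):
    """List timer fields outside the ranges stated in the text.

    PLTC: 3-15 in flow-through mode, 2-15 in pipeline mode.
    MVTC and MATC: 3-15; checked only in flow-through mode here, since
    the text says they have no effect in pipeline mode (a choice made
    for this sketch).
    """
    f = unpack_mode(word)
    problems = []
    if not (2 if f["PL"] else 3) <= f["PLTC"] <= 15:
        problems.append("PLTC")
    if not f["PL"]:
        problems += [n for n in ("MVTC", "MATC") if not 3 <= f[n] <= 15]
    return problems
```

A flow-through configuration with round-to-zero and IEEE primary format could then be written as `pack_mode(PLTC=3, MVTC=3, MATC=3, RMS=0b011, PFF=0b00)`, with the timer values here chosen arbitrarily.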

Status Register
The status register contains operation exception status,
as well as the status of pending operands and operations; its format is shown in Figure 4. The Am29000 can
initialize or modify the contents of the status register by
issuing a write status transaction request, and can read
current status register contents by issuing a read status
transaction request or as part of a save state sequence.
All status register bits are initialized to a logic Low after
hardware reset.


Figure 4. Status Register (09114-006C)
Field layout, high to low: bits 31-11 reserved; 10 OPP; 9 IVA; 8 SVA; 7 RVA; 6 ES; 5 ZEX; 4 XEX; 3 UEX; 2 VEX; 1 REX; 0 IEX.

Bits 31-11 - Reserved for future use. This field must
be set to 0 when written to assure future compatibility.
Bit 10 - Operation Pending (OPP): A logic High indicates
that an operation awaits execution.
Bit 9 - I-Temp Valid (IVA): A logic High indicates that
register I-Temp contains an instruction for a pending
operation.
Bit 8 - S-Temp Valid (SVA): A logic High indicates that
register S-Temp contains an operand for a pending
operation.
Bit 7 - R-Temp Valid (RVA): A logic High indicates that
register R-Temp contains an operand for a pending
operation.
Bit 6 - Exception Status (ES): A logic High indicates
that status register bits 5-0 contain an unmasked
exception.
Bit 5 - Zero Result Flag (ZEX): A logic High indicates
that an operation produced a zero result. Latches until
cleared.
Bit 4 - Inexact Result Bit (XEX): A logic High indicates
that an operation result had to be rounded to fit the
destination format. Latches until cleared.
Bit 3 - Underflow Exception Bit (UEX): A logic High
indicates that an operation result has underflowed the
destination format. Latches until cleared.
Bit 2 - Overflow Exception Bit (VEX): A logic High
indicates that an operation result overflowed the
destination format. Latches until cleared.
Bit 1 - Reserved Operand Exception Bit (REX): A
logic High indicates that a reserved operand appeared
as an input operand to an operation or was generated as
a result. Latches until cleared.
Bit 0 - Invalid Operation Exception Bit (IEX): A logic
High indicates that input operands are unsuitable for the
operation performed (e.g., ∞ × 0). Latches until cleared.

Flag Register
The flag register contains 7 flag bits that report exception or Boolean results for the most recently performed
operation; its format is shown in Figure 5. The remaining
25 register bits are reserved for future use. The
Am29000 can read the current flag register contents by
issuing a read flags transaction request.
Flag register bits 6-0 correspond to Flag 6-Flag 0
(FL6-FL0).
These flags assume a meaning that is operation-dependent, as discussed in the Operation Flags section.
The flag register is made transparent in flow-through
mode.

[Figure 5. Flag Register. Bits 31-7: reserved; bits 6-0: FL6-FL0.]
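
The status register bit assignments listed for Figure 4 can be exercised in software. The following is a minimal sketch, not AMD code: the names `STATUS_BITS`, `decode_status`, and `encode_status` are hypothetical helpers introduced here for illustration, and the check on bits 31-11 simply reflects the rule that the reserved field is written as 0.

```python
# Bit positions follow the status register description (bits 10 down to 0).
# All names in this block are illustrative, not part of any AMD tool.
STATUS_BITS = {
    "OPP": 10, "IVA": 9, "SVA": 8, "RVA": 7, "ES": 6,
    "ZEX": 5, "XEX": 4, "UEX": 3, "VEX": 2, "REX": 1, "IEX": 0,
}
RESERVED_MASK = 0xFFFFF800  # bits 31-11, reserved and written as 0


def encode_status(*names):
    """Build a 32-bit status word with the named bits set High."""
    word = 0
    for name in names:
        word |= 1 << STATUS_BITS[name]
    return word


def decode_status(word):
    """Return a dict mapping each status bit name to True (High) or False (Low)."""
    if word & RESERVED_MASK:
        raise ValueError("reserved bits 31-11 must be 0")
    return {name: bool((word >> bit) & 1) for name, bit in STATUS_BITS.items()}
```

For example, `encode_status("OPP", "ES", "UEX")` yields a word with bits 10, 6, and 3 set, and `decode_status` of an all-zero word (the state after hardware reset) reports every bit Low.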

Precision Register
The precision register contains 8 bits that report the precision of operands stored in the register file; its format is
shown in Figure 6. Bit 0 (PR0) reports the precision of
register file location 0 (RF0), bit 1 the precision of location 1 (RF1), and so on. A logic High indicates a single-precision value, a logic Low a double-precision value.
The precision register also contains the Accelerator Release Level (ARL), an 8-bit, read-only identification
number that specifies the accelerator version. The ARL
field occupies bits 31-24.
The remaining 16 bits of the precision word (bits 23-8) are reserved
for future use, and must be set to 0 when written to assure future compatibility.
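
The precision word layout described above can be unpacked as follows. This is a sketch under stated assumptions: `decode_precision` and `make_precision_word` are hypothetical helper names, and the ARL field is left at 0 when building a word because the text describes it as read-only.

```python
def decode_precision(word):
    """Split a 32-bit precision word into (ARL, per-location precision list).

    The list index is the register file location; True means single
    precision (bit High), False means double precision (bit Low).
    """
    arl = (word >> 24) & 0xFF                            # bits 31-24
    single = [bool((word >> i) & 1) for i in range(8)]   # bits 7-0: PR7-PR0
    return arl, single


def make_precision_word(single_flags):
    """Build a word to write: bits 7-0 from the flags, all other bits 0."""
    if len(single_flags) != 8:
        raise ValueError("expected one flag per register file location")
    word = 0
    for location, is_single in enumerate(single_flags):
        if is_single:
            word |= 1 << location
    return word
```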

[Figure 6. Precision Register. Bits 31-24: ARL; bits 23-8: reserved; bits 7-0: PR7-PR0.]

Instruction Register, I-Temp Register
The instruction register contains a 32-bit instruction
word that specifies the ALU operation; its format is
shown in Figure 7.
The I-Temp register has a format identical to that of the
instruction register; this register is used to temporarily
buffer instructions for pending operations, thus allowing
the overlap of operation specification and execution.
The Am29000 can write to the instruction and I-Temp
registers by issuing the write instruction transaction
request, and can read the contents of these registers as
part of the save state sequence.

[Figure 7. Instruction Register. Bit 31: RF; bits 30-28: RFS; bits 27-24: PMS; bits 23-20: QMS; bits 19-16: TMS; bit 15: IPR; bit 14: RPR; bits 13-12: SIP; bits 11-10: SIQ; bits 9-8: SIT; bits 7-6: SIF; bit 5: IF; bits 4-0: CO.]

Bit 31 - Register File Enable (RF): Enables a write to
the register file. When RF is High, the operation result is
written to the register file location specified by RFS and
the resulting precision is written to the corresponding bit
of the precision register. When RF is Low, no write
is performed either to the register file or the precision
register.
Bits 30-28 - Register File Select (RFS): Selects the
register file location (RF7-RF0) to which the operation
result is to be written. If bit RF is Low, the value of RFS is
a "don't care."
Bits 27-24 - Select for P Operand Multiplexer
(PMS): Selects the data input for the ALU P port.
Bits 23-20 - Select for Q Operand Multiplexer
(QMS): Selects the data input for the ALU Q port.
Bits 19-16 - Select for T Operand Multiplexer (TMS):
Selects the data input for the ALU T port.
Bit 15 - Input Precision (IPR): Precision of the operands in Registers R and S; single precision when High,
double precision when Low.
Bit 14 - Result Precision (RPR): Precision of the ALU
output; single precision when High, double precision
when Low.
Bits 13-12 - Sign P (SIP): Sign-change control for the
ALU P input.
Bits 11-10 - Sign Q (SIQ): Sign-change control for the
ALU Q input.
Bits 9-8 - Sign T (SIT): Sign-change control for the
ALU T input.
Bits 7-6 - Sign F (SIF): Sign-change control for the
ALU output.
Bit 5 - Integer/Floating-Point Select (IF): A logic Low
selects a floating-point operation, a logic High an integer
operation.
Bits 4-0 - Core Operation (CO): Specifies the core operation to be performed by the ALU.

The function of the instruction word fields is discussed in
further detail in the Accelerator Instruction Set section.
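
The instruction word field boundaries given for Figure 7 lend themselves to a small pack/unpack routine. The sketch below is illustrative only: `FIELDS`, `encode_instruction`, and `decode_instruction` are hypothetical names, and any field not supplied is assumed to be 0.

```python
# (high bit, low bit) for each instruction word field, per the bit descriptions.
FIELDS = {
    "RF": (31, 31), "RFS": (30, 28), "PMS": (27, 24), "QMS": (23, 20),
    "TMS": (19, 16), "IPR": (15, 15), "RPR": (14, 14), "SIP": (13, 12),
    "SIQ": (11, 10), "SIT": (9, 8), "SIF": (7, 6), "IF": (5, 5), "CO": (4, 0),
}


def encode_instruction(**values):
    """Pack named field values into a 32-bit instruction word."""
    word = 0
    for name, value in values.items():
        high, low = FIELDS[name]
        width = high - low + 1
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << low
    return word


def decode_instruction(word):
    """Unpack a 32-bit instruction word into a dict of field values."""
    return {
        name: (word >> low) & ((1 << (high - low + 1)) - 1)
        for name, (high, low) in FIELDS.items()
    }
```

As a usage example, `encode_instruction(RF=1, RFS=3, TMS=1, IPR=1, RPR=1, SIT=1, CO=1)` sets the register file enable, selects RF3 as the destination, and fills in the remaining fields; `decode_instruction` recovers the same values.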

Operand Registers
The Am29027 holds operands in thirteen 64-bit registers. Four registers (R, S, R-Temp, and S-Temp)
store ALU input operands; a fifth register, F, stores ALU
results. The eight remaining registers, RF7-RF0, are arranged as a file into which operation results can be
written, and from which operands can be taken for use in
subsequent operations.
All operand registers share common data formats; any
register can hold a single- or double-precision floating-point number, or a single- or double-precision integer.
Floating-point numbers are stored with the sign bit in the
most significant bit (bit 63) of the operand register. For
single-precision numbers, the 32 LSBs of the register
are unused; the value of these unused bits is a "don't
care."
Integer numbers are stored with the least significant bit
placed in the least significant bit (bit 0) of the operand
register. For single-precision numbers, the 32 MSBs of
the register are unused; the value of these unused bits is
a "don't care." Floating-point and integer formats are described in further detail in Appendix A.
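
The alignment rules above (floating-point values justified toward bit 63, integers toward bit 0) can be illustrated with a short sketch. It assumes IEEE format and two's-complement integers; the Am29027 also supports other floating-point formats, which are not modeled here. The function names are hypothetical, and the "don't care" bits are simply set to 0.

```python
import struct


def pack_float(value, single):
    """Return the 64-bit register image of an IEEE floating-point operand."""
    if single:
        (bits,) = struct.unpack(">I", struct.pack(">f", value))
        return bits << 32          # sign lands in bit 63; 32 LSBs unused
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    return bits


def pack_int(value, single):
    """Return the 64-bit register image of a two's-complement integer."""
    width = 32 if single else 64
    return value & ((1 << width) - 1)   # LSB in bit 0; 32 MSBs unused if single
```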

Accelerator Transaction Requests
The Am29000 controls the Am29027 with 13 transaction requests. Transaction request type is indicated by
the state of four signals: R/W and OPT2-OPT0. Table 1
lists the transaction types and corresponding signal
states.
Transaction requests are conditioned by signal
DREQT1 (which when High indicates an accelerator
transaction) and signals BINV and DREQ. The
Am29027 will recognize a transaction request only if
DREQT1 and BINV are High and DREQ is Low.
Signal DREQT0 modifies the execution of most transaction requests. For transaction requests that transfer
operands or instructions to the Am29027, asserting
DREQT0 will start the execution of an accelerator
operation. For transaction requests that transfer operation results, status, or flags to the Am29000, asserting
DREQT0 will suppress the reporting of unmasked
exceptions via signal DERR. For the write status transaction request, asserting DREQT0 either retimes the operation currently described by the instruction register
(flow-through mode) or invalidates the ALU pipeline
(pipeline mode).
Write Transaction Requests
Write transactions transfer data from the Am29000 to
the Am29027, or cause the Am29027 to transfer data
internally. To perform a write request, the Am29000:
• Issues the appropriate transaction request on
  signals OPT2-OPT0, and asserts signal R/W Low.
• Places the data to be transferred, if any, on output
  signals D31-D0 and A31-A0.
The Am29027 responds to the request by asserting one
(and only one) of two status signals:
• CDA indicates that the Am29027 will take the
  specified action and clock in the data accompanying the transaction request, if any, on the next
  rising edge of clock.
• DERR indicates that the Am29027 is unable to
  accept the data, due to the presence of an
  unmasked exception.
Timing for write transactions is illustrated in Appendix D.

Table 1. Transaction Requests

R/W  OPT2  OPT1  OPT0  Request Type
 0    0     0     0    Write Operand R
 0    0     0     1    Write Operand S
 0    0     1     0    Write Operands R, S
 0    0     1     1    Write Mode
 0    1     0     0    Write Status
 0    1     0     1    Write RF Precisions
 0    1     1     0    Write Instruction
 0    1     1     1    Advance Temp Registers
 1    0     0     0    Read Results MSBs
 1    0     0     1    Read Results LSBs
 1    0     1     0    Read Flags
 1    0     1     1    Read Status
 1    1     0     0    Save State

There are eight write transactions:
Write Operand R: An operand is written to Input Register R and/or R-Temp. The most significant half of the
64-bit operand to be written is placed on Input Bus R, the
least significant half on Input Bus S. The action taken
depends on signal DREQT0 and on whether an accelerator operation will be in progress during the next clock
cycle.

DREQT0     Operation in progress   Data written   R-Temp      Operation
asserted   next clock cycle        to             valid bit   pending bit
No         X                       R-Temp         Set         Unchanged
Yes        No                      R-Temp, R      Reset       Reset
Yes        Yes                     R-Temp         Set         Set

If DREQT0 is asserted and no accelerator operation will
be in progress during the next clock cycle, a new operation will be started on the next rising edge of CLK.
If mode register bit HE (Halt On Error Enable) is High
and an unmasked exception has been detected, the
Am29027 will respond to a write operand R request by
asserting signal DERR; the contents of Registers R and
R-Temp will not be changed, and the R-Temp Valid and
Operation Pending bits will retain their current values.
Write Operand S: An operand is written to Input Register S and/or S-Temp. The most significant half of the
64-bit operand to be written is placed on Input Bus R,
the least significant half on Input Bus S. The action taken
depends on signal DREQT0 and on whether an accelerator operation will be in progress during the next clock
cycle.

DREQT0     Operation in progress   Data written   S-Temp      Operation
asserted   next clock cycle        to             valid bit   pending bit
No         X                       S-Temp         Set         Unchanged
Yes        No                      S-Temp, S      Reset       Reset
Yes        Yes                     S-Temp         Set         Set

If DREQT0 is asserted and no accelerator operation will
be in progress during the next clock cycle, a new operation will be started on the next rising edge of CLK.
If mode register bit HE (Halt On Error Enable) is High
and an unmasked exception has been detected, the
Am29027 will respond to a write operand S request by
asserting signal DERR; the contents of Registers S and
S-Temp will not be changed, and the S-Temp Valid and
Operation Pending bits will retain their current values.

Write Operands R, S: Two 32-bit operands are written
to Registers R and S and/or Registers R-Temp and S-Temp. The 32-bit operand to be written to Registers R or
R-Temp is placed on Input Bus R; the 32-bit operand to
be written to Registers S or S-Temp is placed on Input
Bus S. Each 32-bit word is written to both the upper and
lower halves of the target register. The action taken
depends on signal DREQT0 and on whether an accelerator operation will be in progress during the next clock
cycle.

DREQT0     Operation in progress   Data written            R-, S-Temp   Operation
asserted   next clock cycle        to                      valid bits   pending bit
No         X                       R-Temp, S-Temp          Set          Unchanged
Yes        No                      R-Temp, S-Temp, R, S    Reset        Reset
Yes        Yes                     R-Temp, S-Temp          Set          Set

If DREQT0 is asserted and no accelerator operation will
be in progress during the next clock cycle, a new operation will be started on the next rising edge of CLK.
If mode register bit HE (Halt On Error Enable) is High
and an unmasked exception has been detected, the
Am29027 will respond to a write operands R, S request
by asserting signal DERR; the contents of Registers R,
R-Temp, S, and S-Temp will not be changed, and the R-Temp Valid, S-Temp Valid, and Operation Pending bits
will retain their current values.
Write Mode: A 64-bit word is written to the mode register. The least significant half of the mode word is placed
on Input Bus R, the most significant half on Input Bus S.
The state of signal DREQTo is a "don't care" for this
transaction request.

Write Status: A 32-bit word is written to the status register; the status word to be written is placed on Input
Bus R. Asserting signal DREQT0 will produce an additional action that is mode-dependent. In flow-through
mode, asserting DREQT0 will cause the operation currently specified by the instruction register to be retimed;
operation results will not be written to the status register
or the register file. In pipeline mode, asserting DREQT0
will invalidate the ALU pipeline.
Write Register File Precisions: A 32-bit word indicating the precisions of register file locations RF7-RF0 is
written to the precision register; the precision word to be
written is placed on Input Bus R. The state of signal
DREQT0 is a "don't care" for this transaction request.
Write Instruction: A 32-bit accelerator instruction is
written to the instruction register and/or Register I-Temp. The 32-bit instruction is taken from input signals
I31-I0. The action taken depends on signal DREQT0, and
on whether an accelerator operation will be in progress
during the next clock cycle.

DREQT0     Operation in progress   Data written                    I-Temp      Operation
asserted   next clock cycle        to                              valid bit   pending bit
No         X                       I-Temp                          Set         Unchanged
Yes        No                      I-Temp, instruction register    Reset       Reset
Yes        Yes                     I-Temp                          Set         Set

If DREQT0 is asserted and no accelerator operation will
be in progress during the next clock cycle, a new operation will be started on the next rising edge of CLK.

If mode register bit HE (Halt On Error Enable) is High
and an unmasked exception has been detected, the
Am29027 will respond to a write instruction transaction
request by asserting signal DERR; the contents of Register I-Temp and the instruction register will not be
changed, and the I-Temp Valid and Operation Pending
bits will retain their current values.
Advance Temp Registers: The contents of the R-Temp, S-Temp, and I-Temp registers are transferred to
Register R, Register S, and the instruction register, respectively. The state of signal DREQT0 is a "don't care"
for this transaction request. The advance temp registers
transaction request is used during restoration of accelerator state.
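
The write operand R, write operand S, write operands R, S, and write instruction requests all follow the same three-row decision pattern. The sketch below restates that pattern as a function; it is an illustration, not a model of the device timing. The name `write_effect` and the labels "temp" and "working" (standing for the temporary register and the R, S, or instruction register it feeds) are introduced here for convenience, and the Halt On Error case that produces DERR is deliberately left out.

```python
def write_effect(dreqt0_asserted, op_in_progress_next_cycle):
    """Return the effect of an operand or instruction write request.

    Keys: which registers receive the data, what happens to the
    corresponding temp-valid bit and the operation pending bit, and
    whether a new operation starts on the next rising clock edge.
    """
    if not dreqt0_asserted:
        # Data is only buffered; nothing is started.
        return {"targets": ("temp",), "temp_valid": "set",
                "op_pending": "unchanged", "starts_operation": False}
    if not op_in_progress_next_cycle:
        # ALU is free: data goes straight through and the operation starts.
        return {"targets": ("temp", "working"), "temp_valid": "reset",
                "op_pending": "reset", "starts_operation": True}
    # ALU is busy: data is buffered and the operation is marked pending.
    return {"targets": ("temp",), "temp_valid": "set",
            "op_pending": "set", "starts_operation": False}
```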
Read Transaction Requests
Read transactions transfer data from the Am29027 to
the Am29000. When data is to be transferred, the
Am29000:
• Issues the appropriate transaction request on
  signals OPT2-OPT0, and asserts signal R/W High.
• Places its data bus drivers in a high-impedance
  state.

The Am29027 then places the requested data on signals F31-F0 and issues two status signals:
• DRDY indicates that the data requested is available
  on Output Bus F31-F0.
• DERR indicates that the Am29027 has detected an
  unmasked exception; the exception may or may not
  be related to the data requested.

DRDY and DERR may both be active at the same time;
if so, the Am29000 will respond to DERR and ignore
DRDY.
Timing for read transactions is illustrated in Appendix D.
There are five read transactions:
Read Result MSBs: The 32 MSBs of Register F are
placed on Output Bus F. Asserting signal DREQT0 will
suppress the reporting of unmasked exceptions.
Read Result LSBs: The 32 LSBs and 32 MSBs of
Register F are placed on Output Bus F in consecutive
clock cycles. Asserting signal DREQT0 will suppress the
reporting of unmasked exceptions. The read result
LSBs request must always be followed by a read result
MSBs request.
Read Flags: The flag register contents are placed on
Output Bus F; bits F31-F7 will be logic Low. Asserting
signal DREQT0 will suppress the reporting of unmasked
exceptions.
Read Status: The status register contents are placed
on Output Bus F; bits F31-F11 will be logic Low. Asserting
signal DREQT0 will suppress the reporting of unmasked
exceptions.
Save State: The contents of the instruction register,
mode register, status register, register file, precision
register, and Registers R, R-Temp, S, S-Temp, and I-Temp are transferred to the Am29000 via Output Bus F.
Exception reporting via signal DERR is suppressed; the
state of signal DREQT0 is a "don't care." Further details
on the use of this request appear in the Saving and Restoring State sections.
Coprocessor Data Accept
The Coprocessor Data Accept (CDA) signal indicates to
the Am29000 that the Am29027 is able to accept new
operands or instructions. CDA is normally Low (active),
but will go High if:
• The Am29027 has an operation currently in
  progress and a completely specified pending
  operation waiting in the temporary registers,
or
• The Am29027 has halted in response to an
  unmasked exception (Halt On Error mode enabled).

If the Am29000 issues any write transaction request and
CDA is active Low, the transaction request will complete
in a single cycle. If CDA is High, response to a write
transaction request depends on request type:
• For the write operand R, write operand S, write
  operands R, S, and write instruction transaction
  requests, the Am29027 will assert CDA active when
  it is able to accept new data. If it is not able to accept
  new data indefinitely due to presence of an
  unmasked exception (Halt On Error mode enabled),
  it will respond to the transaction request by
  asserting signal DERR.
• For the write mode, write status, write register file
  precisions, and advance temp registers transaction requests, the Am29027 will temporarily
  assert CDA during the cycle after the request is
  issued, regardless of whether an operation is in
  progress or an unmasked exception has halted the
  accelerator.

CDA pertains only to write transaction requests; for read
transaction requests, the Am29000 ignores the state of
CDA.
Data Ready
The Data Ready (DRDY) signal indicates to the
Am29000 that the Am29027 is placing data on the F output bus. The Am29027 generates DRDY in response to
the read result MSBs, read result LSBs, read status,
read flags, and save state transaction requests.
For the read result MSBs, read result LSBs, read flags,
and read status transaction requests, there is usually a
minimum of one cycle delay between the time the
request is issued and the time that DRDY is asserted.
The only exception to this rule is when a read result
LSBs request is immediately followed by a read result
MSBs request, in which case the Am29027 responds to
the second request in a single cycle. If the Am29027 is
unable to respond immediately to a read transaction
request, as may be the case when an operation is in
progress, the DRDY signal will be held inactive until
such a time as the requested data can be output. For the
save state transaction request, the delay between the
issuance of the transaction request and the DRDY response varies according to the specific data requested.
DRDY pertains only to read transaction requests; for
write transaction requests, DRDY remains inactive.
Data Error
The Data Error (DERR) signal indicates to the Am29000
that the Am29027 is unable to respond to a transaction
request normally, due to the presence of an unmasked
exception bit in the status register.
For the read transaction requests read result LSBs, read
result MSBs, read flags, and read status, the Am29027
asserts DERR if the status register contains an unmasked exception bit. The Am29000 may suppress
error reporting for these requests by issuing them with
signal DREQT0 asserted.
For the write transaction requests write operand R, write
operand S, write operands R, S, and write instruction,
DERR is issued in the presence of an unmasked exception if Halt On Error mode is enabled; in such an event,
the contents of the target registers are left unchanged.
DERR is never issued in response to the transaction requests write mode, write status, write register file precisions, advance temp registers, and save state.
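
The DERR rules above reduce to a small predicate. The sketch below is illustrative only; the request-name strings and the function `derr_asserted` are hypothetical, chosen to mirror the request names used in the text.

```python
# Reads whose exception reporting can be suppressed with DREQT0.
REPORTING_READS = {
    "read result LSBs", "read result MSBs", "read flags", "read status",
}
# Writes that are refused with DERR when Halt On Error mode is enabled.
HALTING_WRITES = {
    "write operand R", "write operand S", "write operands R, S",
    "write instruction",
}


def derr_asserted(request, unmasked_exception,
                  halt_on_error=False, dreqt0_asserted=False):
    """Return True if the accelerator would answer the request with DERR."""
    if not unmasked_exception:
        return False
    if request in REPORTING_READS:
        return not dreqt0_asserted
    if request in HALTING_WRITES:
        return halt_on_error
    # write mode, write status, write register file precisions,
    # advance temp registers, and save state never produce DERR.
    return False
```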

Accelerator Instruction Set
The ALU performs 57 arithmetic and logic instructions.
Input operands for these instructions can be taken from
Input Registers R and S, register file locations RF7-RF0,
and on-board constant stores. At the user's option,
results can be stored in register file locations RF7-RF0.
Instruction Word

The 32-bit instruction word, IN31-IN0, specifies the operation to be performed by the ALU. The instruction
word is stored in the instruction register; instruction register format is shown in Figure 7. In flow-through mode,
the instruction word specifies the operation to be performed by the entire ALU. In pipeline mode, the instruction word specifies the operation to be performed by the
first pipeline stage; the remaining pipeline stage or
stages are controlled by their respective pipeline registers. The instruction word also specifies input operand
sources, result destination, and operand precisions.
An instruction word comprises five sections: base operation code, sign-change selects, operand precision
selects, operand source selects, and register file
controls.
Base Operation Code

The base operation code consists of the core operation
field (CO), which specifies the type of operation to be
performed, and the integer/floating-point select bit (IF),
which specifies whether the operation is integer or floating-point. Available base operation codes and the corresponding values for CO and IF are listed in Table 2. Note
that the value of IF is a "don't care" for base operation
code MOVE P.
Sign-Change Selects

Each ALU input and output port has associated hardware that can be used to modify operand signs (see Figure 8). These sign-change blocks, when applied to base
operations, greatly increase the number of available
operations. The base operation code F' = P' + T', for
example, can be used to perform operations such as
P - T, ABS(P) + ABS(T), ABS(P + T), and others, simply
by modifying the signs of the input and output operands.
The SIP, SIQ, and SIT instruction word fields control the
sign-change blocks for the P, Q, and T input operands,
respectively; the SIQ and SIF fields control the sign-change block for output operand F.
Using the sign-change blocks, the sign of an input operand may be left unchanged, inverted, set Low, or set
High; the sign of the output operand may be left unchanged, inverted, set Low, set High, set to the sign of
the P input operand, or set to the sign of the T input operand. Select codes for the P, Q, T, and F sign-change
blocks are shown in Tables 3, 4, 5, and 6, respectively.
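
The way sign-change blocks multiply the usefulness of one base operation can be sketched numerically. The example below models only the base operation F' = P' + T' on ordinary Python floats. It assumes the select-code meanings 00 = unchanged, 01 = inverted, 10 = sign set Low (non-negative), 11 = sign set High (negative), which is how the typical-operation entries in Table 9 read; `change_sign` and `add_base` are hypothetical names.

```python
import math


def change_sign(x, code):
    """Apply a 2-bit sign-change select code to x (assumed code meanings)."""
    if code == 0b00:
        return x                         # sign unchanged
    if code == 0b01:
        return -x                        # sign inverted
    if code == 0b10:
        return math.copysign(x, 1.0)     # sign bit forced Low
    if code == 0b11:
        return math.copysign(x, -1.0)    # sign bit forced High
    raise ValueError("select code must be 2 bits")


def add_base(p, t, sip=0b00, sit=0b00, sif=0b00):
    """Base operation F' = P' + T' with input and output sign changes."""
    p_prime = change_sign(p, sip)
    t_prime = change_sign(t, sit)
    return change_sign(p_prime + t_prime, sif)
```

With these definitions, `add_base(p, t, sit=0b01)` computes P - T, `add_base(p, t, sip=0b10, sit=0b10)` computes ABS(P) + ABS(T), and `add_base(p, t, sif=0b10)` computes ABS(P + T), all from the same base operation.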
Operand Precision Selects

The Am29027 supports mixed-precision operations; it is
possible, for example, to perform an operation having
single-precision inputs and a double-precision output,
or one single- and one double-precision input, or any
other combination.
The precision of the operands in Registers R and S
is specified by instruction bit IPR, which is logic High for
single-precision operands and logic Low for double-precision operands. Note that the operands in the R and S
registers must have the same precision if they are to be
used in the same operation. This restriction does not
preclude performing an operation with mixed-precision
input operands, as there are no restrictions on the precisions of operands stored in the register file. The precision of each operand stored in the register file is
recorded in the precision register; this precision information is automatically supplied to the ALU when a
register file location is specified as an input operand to
an operation.
The precision of an operation result is specified by instruction bit RPR, which is set High for a single-precision
result, and Low for a double-precision result. Should the
instruction word specify that the result is to be written to
the register file (instruction bit RF High), the resulting
precision will be written to the appropriate precision register bit when the result is written to the register file.

Table 2. Operation Codes

IF    CO
IN5   IN4 IN3 IN2 IN1 IN0   Base Operation Code (Floating-Point)
0     0   0   0   0   0     F' = P
0     0   0   0   0   1     F' = P' + T'
0     0   0   0   1   0     F' = P' x Q'
0     0   0   0   1   1     Compare P, T
0     0   0   1   0   0     Max P, T
0     0   0   1   0   1     Min P, T
0     0   0   1   1   0     Convert T to Integer
0     0   0   1   1   1     Scale T to Integer by Q
0     0   1   0   0   0     F' = (P' x Q') + T'
0     0   1   0   0   1     Round T to Integral Value
0     0   1   0   1   0     Reciprocal Seed of P
0     0   1   0   1   1     Convert T to Alternate F.P. Format
0     0   1   1   0   0     Convert T from Alternate F.P. Format

IF    CO
IN5   IN4 IN3 IN2 IN1 IN0   Base Operation Code (Integer)
1     0   0   0   0   0     F = P
1     0   0   0   0   1     F = P + T
1     0   0   0   1   0     F = P x Q
1     0   0   0   1   1     Compare P, T
1     0   0   1   0   0     Max P, T
1     0   0   1   0   1     Min P, T
1     0   0   1   1   0     Convert T to Floating-Point
1     0   0   1   1   1     Scale T to Floating-Point by Q
1     1   0   0   0   0     F = P OR T
1     1   0   0   0   1     F = P AND T
1     1   0   0   1   0     F = P XOR T
1     1   0   0   1   1     Shift P Logical Q Places
1     1   0   1   0   0     Shift P Arithmetic Q Places
1     1   0   1   0   1     Funnel Shift PT Logical Q Places

IF    CO
IN5   IN4 IN3 IN2 IN1 IN0   Base Operation Code (Special)
X     1   1   0   0   0     MOVE P

[Figure 8. ALU Sign-Change Blocks. Sign-change blocks sit on the P, Q, and T input ports and on the F output port of the ALU.]

Table 3. Select Codes for P Operand
Sign-Change Block

SIP
IN13  IN12   SIGN(P')
0     0      SIGN(P)
0     1      NOT SIGN(P)
1     0      0
1     1      1

Table 4. Select Codes for Q Operand
Sign-Change Block

SIQ
IN11  IN10   SIGN(Q')
0     0      SIGN(Q)
0     1      NOT SIGN(Q)
1     0      0
1     1      1

Table 5. Select Codes for T Operand
Sign-Change Block

SIT
IN9   IN8    SIGN(T')
0     0      SIGN(T)
0     1      NOT SIGN(T)
1     0      0
1     1      1

Table 6. Select Codes for F Operand
Sign-Change Block

                          SIQ          SIF
Base Operation            IN11  IN10   IN7  IN6   SIGN(F)
F' = P (Floating-Point)   0     X      0    0     SIGN(F')
OR F = P (Integer)        0     X      0    1     NOT SIGN(F')
OR Maximum P, T           0     X      1    0     0
OR Minimum P, T           0     X      1    1     1
                          1     0      X    X     SIGN(P)
                          1     1      X    X     SIGN(T)

All Other Base            X     X      0    0     SIGN(F')
Operations                X     X      0    1     NOT SIGN(F')
                          X     X      1    0     0
                          X     X      1    1     1

In Tables 3 through 6, NOT SIGN(x) denotes the complement of the sign bit of x.

Operand Source Selects
Instruction fields PMS, QMS, and TMS specify the
select codes for the P, Q, and T operand multiplexers,
respectively; these codes are summarized in Table 7.
The P, Q, and T operand multiplexers can independently select Register R, Register S, register file locations RF7-RF0, or one of six predefined constants. For
operations with floating-point inputs, constants 0, 0.5, 1,
2, 3, and pi are available; for operations with integer inputs, constants 0, -1, 1, 2, 3, and -(2^63) are available.
These constants are supplied to the ALU as double-precision numbers, independent of the precisions specified
for other input and result operands. Hexadecimal values
for the constants are listed in Table 8.
Register File Controls
Instruction fields RF and RFS control the storing of operation results in the register file. If register file enable bit
RF is High, the result of the operation specified by the
instruction word will be stored in register file location
RFS, where RFS is a number from 7 to 0; the precision
of the result, as specified by the RPR bit, will be written
to the appropriate bit in the precision register. If RF is
Low, the operation result is written to neither the register
file nor the precision register.

Accelerator Operations

Table 9 illustrates a number of possible ALU instructions
and corresponding values for instruction word fields
SIP, SIQ, SIT, SIF, IF, and CO. Note that the remaining
instruction fields (RF, RFS, PMS, QMS, TMS, IPR, and
RPR) can be specified independently.

The user may create additional instructions using
instruction words other than those listed in Table 9. For
some base operation codes, sign-change control settings SIP, SIQ, SIT, and SIF are completely arbitrary;
for others, only the sign-change field values shown in
Table 9 are valid. Table 10 summarizes permissible
sign-change field values for each base operation code.

Table 7. Select Codes for P, Q, and T
Operand Multiplexers

PMS   IN27  IN26  IN25  IN24
QMS   IN23  IN22  IN21  IN20
TMS   IN19  IN18  IN17  IN16    P, Q, or T
      0     0     0     0       Register R
      0     0     0     1       Register S
      0     0     1     0       0 (Zero)
      0     0     1     1       0.5 (F.P.); -1 (integer)
      0     1     0     0       1
      0     1     0     1       2
      0     1     1     0       3
      0     1     1     1       pi (F.P.); -2^63 (integer)
      1     0     0     0       RF0
      1     0     0     1       RF1
      1     0     1     0       RF2
      1     0     1     1       RF3
      1     1     0     0       RF4
      1     1     0     1       RF5
      1     1     1     0       RF6
      1     1     1     1       RF7

Table 8. Hexadecimal Values for On-Chip Constants

IEEE Floating-Point Constant    Hexadecimal Representation
0                               0000000000000000
0.5                             3FE0000000000000
1                               3FF0000000000000
2                               4000000000000000
3                               4008000000000000
pi                              400921FB54442D18

DEC D Floating-Point Constant   Hexadecimal Representation
0                               0000000000000000
0.5                             4000000000000000
1                               4080000000000000
2                               4100000000000000
3                               4140000000000000
pi                              41490FDAA22168C2

DEC G Floating-Point Constant   Hexadecimal Representation
0                               0000000000000000
0.5                             4000000000000000
1                               4010000000000000
2                               4020000000000000
3                               4028000000000000
pi                              402921FB54442D18

IBM Floating-Point Constant     Hexadecimal Representation
0                               0000000000000000
0.5                             4080000000000000
1                               4110000000000000
2                               4120000000000000
3                               4130000000000000
pi                              413243F6A8885A31

Integer Constant                Hexadecimal Representation
0                               0000000000000000
-1                              FFFFFFFFFFFFFFFF
1                               0000000000000001
2                               0000000000000002
3                               0000000000000003
-2^63                           8000000000000000

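
The IEEE and integer entries of Table 8 can be cross-checked on any machine with IEEE double-precision arithmetic. The sketch below does only that; the DEC and IBM formats are not covered, and the helper names are hypothetical.

```python
import math
import struct


def ieee_double_hex(x):
    """Hexadecimal image of x as a big-endian IEEE double-precision number."""
    return struct.pack(">d", x).hex().upper()


def int64_hex(n):
    """Hexadecimal image of n as a 64-bit two's-complement integer."""
    return format(n & 0xFFFFFFFFFFFFFFFF, "016X")


# Values checked against the IEEE and integer sections of Table 8.
IEEE_CONSTANTS = {value: ieee_double_hex(value)
                  for value in (0.0, 0.5, 1.0, 2.0, 3.0, math.pi)}
INTEGER_CONSTANTS = {value: int64_hex(value)
                     for value in (0, -1, 1, 2, 3, -(2 ** 63))}
```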
Table 9. Instruction Words for Typical ALU Operations

Operation                                SIP  SIQ  SIT  SIF  IF  CO
FP P                                     00   00   XX   00   0   00000
FP -P                                    00   00   XX   01   0   00000
FP ABS(P)                                00   00   XX   10   0   00000
FP Sign(T) x ABS(P)                      00   11   XX   XX   0   00000
FP P + T                                 00   XX   00   00   0   00001
FP P - T                                 00   XX   01   00   0   00001
FP T - P                                 01   XX   00   00   0   00001
FP -P - T                                01   XX   01   00   0   00001
FP ABS(P + T)                            00   XX   00   10   0   00001
FP ABS(P - T)                            00   XX   01   10   0   00001
FP ABS(P) + ABS(T)                       10   XX   10   00   0   00001
FP ABS(P) - ABS(T)                       10   XX   11   00   0   00001
FP ABS[ABS(P) - ABS(T)]                  10   XX   11   10   0   00001
FP P x Q                                 00   00   XX   00   0   00010
FP (-P) x Q                              01   00   XX   00   0   00010
FP ABS(P x Q)                            00   00   XX   10   0   00010
FP Compare P, T                          00   XX   01   00   0   00011
FP Max P, T                              00   00   01   00   0   00100
FP Max ABS(P), ABS(T)                    10   00   11   00   0   00100
FP Min P, T                              01   00   00   00   0   00101
FP Min ABS(P), ABS(T)                    11   00   10   00   0   00101
FP Limit P to Magnitude T                11   10   10   XX   0   00101
FP Convert T to Integer                  XX   XX   00   00   0   00110
FP Scale T to Integer by Q               XX   00   00   00   0   00111
FP T + P x Q                             00   00   00   00   0   01000
FP T - P x Q                             01   00   00   00   0   01000
FP -T + P x Q                            00   00   01   00   0   01000
FP -T - P x Q                            01   00   01   00   0   01000
FP ABS(T) + ABS(P x Q)                   10   10   10   00   0   01000
FP ABS(T) - ABS(P x Q)                   11   10   10   00   0   01000
FP ABS(P x Q) - ABS(T)                   10   10   11   00   0   01000
FP Round T to Integral Value             XX   XX   00   00   0   01001
FP Reciprocal Seed (P)                   00   XX   XX   00   0   01010
FP Convert T to Alternate
   Floating-Point Format                 XX   XX   00   00   0   01011
FP Convert T from Alternate
   Floating-Point Format                 XX   XX   00   00   0   01100
int P                                    00   00   00   00   1   00000
int -P                                   00   00   00   01   1   00000
int ABS(P)                               00   00   00   10   1   00000
int Sign(T) x ABS(P)                     00   11   00   XX   1   00000
int P + T                                00   XX   00   00   1   00001
int P - T                                00   XX   01   00   1   00001
int T - P                                01   XX   00   00   1   00001
int ABS(P + T)                           00   XX   00   10   1   00001
int ABS(P - T)                           00   XX   01   10   1   00001
int P x Q                                00   00   XX   00   1   00010
int Compare P, T                         00   XX   01   00   1   00011
int Max P, T                             00   00   01   00   1   00100
int Min P, T                             01   00   00   00   1   00101
int Convert T to Float                   XX   XX   00   00   1   00110
int Scale T to Float by Q                XX   00   00   00   1   00111
int P OR T                               XX   XX   XX   XX   1   10000
int P AND T                              XX   XX   XX   XX   1   10001
int P XOR T                              XX   XX   XX   XX   1   10010
int NOT T (see Note 1)                   XX   XX   XX   XX   1   10010
int Shift P Logical Q Places             00   00   XX   00   1   10011
int Shift P Arithmetic Q Places          00   00   XX   00   1   10100
int Funnel Shift PT Q Places             00   00   00   00   1   10101
MOVE P                                   XX   XX   XX   XX   X   11000

Note 1. NOT T is performed by XORing T with a word containing all 1s (integer -1). When invoking NOT T the user must set
instruction field PMS to 0011 (binary), thus selecting integer constant -1.
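
The identity behind Note 1 (XOR with an all-ones word equals bitwise NOT) is easy to confirm on 64-bit values. The sketch below is an arithmetic illustration only and does not model the ALU; `not_t` is a hypothetical helper name.

```python
MASK64 = (1 << 64) - 1
MINUS_ONE = MASK64          # 64-bit two's-complement image of integer -1


def not_t(t):
    """Bitwise NOT of a 64-bit word, computed as T XOR (integer -1)."""
    return (t ^ MINUS_ONE) & MASK64
```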

Table 10. Allowable Sign-Change Combinations

IF  CO     Operation                       SIP  SIQ  SIT  SIF
0   00000  FP F' = P                       F    V    X    V
0   00001  FP F' = P' + T'                 V    X    V    V
0   00010  FP F' = P' x Q'                 V    V    X    V
0   00011  FP Compare P, T                 F    X    F    F
0   00100  FP Max P, T                     F    F    F    F
0   00101  FP Min P, T                     F    F    F    F
0   00110  FP Convert T to Integer         X    X    F    F
0   00111  FP Scale T to Integer           X    F    F    F
0   01000  FP F' = (P' x Q') + T'          V    V    V    V
0   01001  FP Round T                      X    X    F    F
0   01010  FP Reciprocal Seed P            F    X    X    F
0   01011  FP Convert T to Alt Format      X    X    F    F
0   01100  FP Convert T from Alt Format    X    X    F    F
1   00000  int F = P                       F    F    F    F
1   00001  int F = P + T                   F    X    F    F
1   00010  int F = P x Q                   F    F    X    F
1   00011  int Compare P, T                F    X    F    F
1   00100  int Max P, T                    F    F    F    F
1   00101  int Min P, T                    F    F    F    F
1   00110  int Convert T to F.P.           X    X    F    F
1   00111  int Scale T to F.P.             X    F    F    F
1   10000  int F = P OR T                  X    X    X    X
1   10001  int F = P AND T                 X    X    X    X
1   10010  int F = P XOR T                 X    X    X    X
1   10011  int Shift P Logical             F    F    X    F
1   10100  int Shift P Arithmetic          F    F    X    F
1   10101  int Funnel Shift PT             F    F    F    F
X   11000  MOVE P                          X    X    X    X

Key:
V = Variable; user can specify arbitrary sign change.
F = Fixed; user is restricted to sign-change combinations shown in Table 9.
X = Don't care; this field does not affect the operation or its result.

Base Operation Code Description

F' = P (Floating-Point): The P-operand is passed
through the ALU unchanged, except for any specified
precision conversions. If the user specifies different input and output precisions, the operation may be used to
perform single-to-double or double-to-single conversions. Instructions such as negation, absolute value extraction, and sign transfer may be executed by setting
the sign-change controls appropriately while executing
this base operation.

F' = P' + T' (Floating-Point): The two operands P' and
T' are added, taking into account any specified precision
conversions. Instructions such as subtraction, sum-of-absolute-values, difference-of-absolute-values, absolute-value-of-sum, and absolute-value-of-difference
may be executed by setting the sign-change controls
appropriately while executing this base operation.

F' = P' x Q' (Floating-Point): The operands P' and Q'
are multiplied, taking into account any specified precision conversions. Instructions such as negative-product
and absolute-value-of-product may be executed by setting the sign-change controls appropriately while executing this base operation.
Compare P, T (Floating-Point): The two operands P
and T are compared, taking into account any specified
precision conversions. The output of the operation is the
result of the subtraction (P - T). The flags are set appropriately to indicate the result of the comparison, conforming to the relevant parts of the floating-point
standards. For IEEE and DEC operations, one of four
flags (greater than, less than, equal to, or unordered) is
set for any given compare operation. For IBM operations, the unordered flag does not apply since the format
does not support reserved operands.
Maximum P, T (Floating-Point): The two operands P
and T are compared, taking into account any specified
precision conversions. The most positive operand is selected as the output. The Winner flag indicates which of
the operands is selected. Additionally, the operation
maximum-of-absolute-value may be performed by setting the appropriate sign-change controls.
Minimum P, T (Floating-Point): The two operands P
and T are compared, taking into account any specified
precision conversions. The most negative operand is
selected as the output. The Winner flag indicates which
of the two operands is selected. Additionally, the operations minimum-of-absolute-values and limit-P-to-magnitude-T may be performed by setting the appropriate
sign-change controls. The limit-P-to-magnitude-T operation is useful for clipping a sequence of operands to
ensure that their magnitude never exceeds a preset
limit.
Convert T to Integer (Floating-Point): The operand T
is converted from floating-point representation to two's
complement integer representation, taking into account
the specified precision of the floating-point operand. If
the output precision is specified as single, the result is a
32-bit integer. If the output precision is specified as double, the result is a 64-bit integer.
Scale T to Integer by Q (Floating-Point): The operand
T is converted from floating-point representation to
two's complement integer representation, using the
exponent of the floating-point operand Q as a scale
factor and taking into account the specified precision of
the floating-point operands. The unbiased exponent of
the operand Q is added to the exponent of the operand
T, permitting IEEE and DEC operands to be multiplied
by any power of 2, and IBM operands by any power
of 16, before the conversion is performed. If the output
precision is specified as single, the result is a 32-bit integer. If the output precision is specified as double, the
result is a 64-bit integer.
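The scaling step described for Scale T to Integer by Q can be sketched in Python for the IEEE case. This is only an illustration: the function names are hypothetical, Python's round() (round-half-to-even) stands in for whatever rounding mode is selected on the device, and Q is assumed to be a nonzero normalized number.

```python
import math

def ieee_unbiased_exponent(q):
    # math.frexp gives q = m * 2**e with 0.5 <= |m| < 1, whereas IEEE
    # normalizes to 1 <= |m| < 2, so the IEEE exponent is one less.
    _, e = math.frexp(q)
    return e - 1

def scale_to_integer(t, q):
    # Add Q's unbiased exponent to T's exponent (multiply T by a power
    # of 2), then convert to an integer. Only Q's exponent matters;
    # its mantissa is ignored.
    scaled = math.ldexp(t, ieee_unbiased_exponent(q))
    return round(scaled)  # assumption: round-to-nearest-even
```

For example, scale_to_integer(1.5, 8.0) multiplies 1.5 by 2**3 before converting, and a Q of 12.0 scales by the same factor because it has the same exponent.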

F' = (P' x Q') + T' (Floating-Point): The operands P' and
Q' are multiplied, producing a double-precision product.
This product is added to the operand T', taking into account any specified precision conversions. Instructions
such as (P x Q) - T, T - (P x Q), ABS(P x Q) + ABS(T), and
ABS((P x Q) + T) may be executed by setting the sign-change
controls appropriately while executing this base
operation.

Round T to Integral Value (Floating-Point): The floating-point operand T is rounded to an integer-valued
floating-point operand, using the specified rounding
mode and taking into account any specified precision
conversions. As an example, the operation converts a
floating-point representation of Pi (3.14159 ... ) to a
floating-point representation of 3.0 or 4.0, depending on
the rounding mode selected. The final result of the operation is a floating-point number.
Reciprocal Seed of P (Floating-Point): An approximation to the reciprocal of the operand P is evaluated,
taking into account any specified precision conversions.
The reciprocal seed comprises an accurate sign, a fully accurate exponent, and a mantissa that is accurate to
only one place. This operation can be used as the initial
step in performing Newton-Raphson division; optionally, an external seed look-up table can be used for
faster convergence.
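To illustrate how a coarse seed is refined by Newton-Raphson iteration, the following Python sketch uses the standard recurrence x' = x(2 - Px). The seed function is a stand-in, not the device's actual algorithm: it keeps the sign and exponent exact and uses a fixed mantissa of 1.5, which is an assumption made only for this example.

```python
import math

def reciprocal_seed(p):
    # Stand-in seed: exact sign and exponent, constant mantissa (assumed).
    _, e = math.frexp(p)              # p = m * 2**e, 0.5 <= |m| < 1
    return math.copysign(math.ldexp(1.5, -e), p)

def reciprocal(p, iterations=6):
    # Each Newton-Raphson step squares the relative error, so a seed
    # with relative error of at most 0.5 reaches double precision in
    # about six steps.
    x = reciprocal_seed(p)
    for _ in range(iterations):
        x = x * (2.0 - p * x)
    return x

def divide(a, b):
    # Division as multiplication by the refined reciprocal.
    return a * reciprocal(b)
```

A more accurate seed, such as one taken from an external look-up table as the text mentions, would reduce the number of iterations needed.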
Convert T to Alternate Floating-Point Format (Floating-Point): The floating-point operand T, assumed to
be in the primary floating-point format, is converted to a
floating-point operand in the alternate floating-point
format, taking into account any specified precision
conversions.
Convert T from Alternate Floating-Point Format
(Floating-Point): The floating-point operand T, assumed to be in the alternate floating-point format, is
converted to a floating-point operand in the primary
floating-point format, taking into account any specified
precision conversions.

F = P (Integer): The P-operand is passed through the
ALU unchanged except for any specified precision
conversions. If the user specifies different input and output precisions, the operation may be used to perform
single-to-double or double-to-single conversions. Instructions such as negation, absolute value extraction,
and sign transfer may be performed by setting the sign-change control appropriately while executing this base
operation.
F = P + T (Integer): The two operands P and T are
added, taking into account any specified precision
conversions. Instructions such as subtraction, absolute-value-of-sum, and absolute-value-of-difference may be
performed by setting the sign-change controls appropriately while executing this base operation.
F = P x Q (Integer): The two operands P and Q are multiplied, taking into account any specified precision conversions. Either 32-bit multiplication or 64-bit multiplication may be performed, and the user may select either
the MSBs or the LSBs of the product as the final result.
In addition, format-adjusting may be implemented if
required, and the operands may be considered as
signed (two's complement) or unsigned.
Compare P, T (Integer): The two operands P and T are
compared, taking into account any specified precision
conversions. The output of the operation is the result of
the subtraction (P - T). The flags are set appropriately to
indicate the result of the comparison, one of three flags
(greater than, less than, or equal to) being set for any
given compare operation.
Maximum P, T (Integer): The two operands P and
T are compared, taking into account any specified precision conversions. The most positive operand is selected
as the output. The Winner flag indicates which of the two
operands is selected.
Minimum P, T (Integer): The two operands P and T are
compared, taking into account any specified precision
conversions. The most negative operand is selected as
the output. The Winner flag indicates which of the two
operands is selected.
Convert T to Floating-Point (Integer): The operand T
is converted from two's complement integer representation to floating-point representation, taking into account
the specified precision of the integer operand. If the
output precision is specified as single, the result is a
32-bit floating-point operand. If the output precision is
specified as double, the result is a 64-bit floating-point
operand.
Scale T to Floating-Point by Q (Integer): The operand
T is converted from two's complement integer representation to floating-point representation, using the exponent of the floating-point operand Q as a scale factor
and taking into account the specified precision of the integer operand. The unbiased exponent of the operand
Q is added to the exponent of the floating-point result,
permitting IEEE and DEC operands to be multiplied by
any power of 2, and IBM operands by any power of 16
after the conversion is performed. If the output precision
is specified as single, the result is a 32-bit floating-point
operand. If the output precision is specified as double,
the result is a 64-bit floating-point operand.

F = P OR T (Integer): The operand P is logically ORed
with the operand T. Before the operation is performed,
the inputs, if 32-bit, are sign-extended to 64 bits.
F = P AND T (Integer): The operand P is logically
ANDed with the operand T. Before the operation is performed, the inputs, if 32-bit, are sign-extended to 64 bits.
F = P XOR T (Integer): The operand P is logically exclusive-ORed with the operand T. Before the operation is
performed, the inputs, if 32-bit, are sign-extended to 64
bits. This operation may be used to invert an operand by
selecting the second operand to be the integer constant,
-1, so that all bits of this second operand are 1.
Exclusive-ORing an operand with -1 is equivalent to
inverting each bit in the operand.
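The equivalence between XOR with -1 and bitwise inversion can be checked with a short Python sketch. The helper names are hypothetical; the 64-bit mask models the internal width implied by the sign-extension rule in the text.

```python
MASK64 = (1 << 64) - 1

def sign_extend_32(x):
    # Sign-extend a 32-bit pattern to 64 bits, as the text describes
    # for 32-bit inputs to the logical operations.
    x &= 0xFFFFFFFF
    return (x | 0xFFFFFFFF00000000) if x & 0x80000000 else x

def not_via_xor(t):
    minus_one = MASK64  # two's-complement -1: every bit is 1
    return (t ^ minus_one) & MASK64
```

Because -1 sign-extends to all 1s at either width, the same constant inverts both 32-bit and 64-bit operands.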
Shift P Logical Q Places (Integer): This operation cannot be performed in mixed-precision mode. The precision of the result is the same as the precision of the input
operand P. A two's-complement shift length in the range
-64 to +63 (double-precision) or -32 to +31 (single-precision) is extracted from the LSBs of the operand Q. The
operand P is logically right-shifted by the number of
places specified by the shift length. A negative shift
length therefore produces a left-shift. If a right-shift is
performed, 0s fill vacated bit positions to the left of the
input operand. If a left-shift is performed, 0s fill vacated
bit positions to the right of the input operand.
Shift P Arithmetic Q Places (Integer): This operation
cannot be performed in mixed-precision mode. The precision of the result is the same as the precision of the input operand P. A two's-complement shift length in the
range -64 to +63 (double-precision) or -32 to +31 (single-precision) is extracted from the LSBs of the operand
Q. The operand P is arithmetically right-shifted by the
number of places specified by the shift length. A negative shift length therefore produces a left-shift. If a right-shift is performed, the MSB (bit 63 or 31) is replicated to
fill vacated bit positions to the left of the input operand. If
a left-shift is performed, 0s fill vacated bit positions to the
right of the input operand.
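The two shift operations above can be modeled in Python as follows. This is a behavioral sketch only; the function names are hypothetical, and the width of the shift-length field (7 bits for double precision, 6 bits for single) is inferred from the stated ranges rather than taken from the text.

```python
def shift_length(q, width):
    # Extract a two's-complement shift length from the LSBs of Q.
    bits = 7 if width == 64 else 6      # assumed field width
    n = q & ((1 << bits) - 1)
    return (n - (1 << bits)) if n & (1 << (bits - 1)) else n

def shift_logical(p, q, width=32):
    mask = (1 << width) - 1
    n = shift_length(q, width)
    p &= mask
    # Positive length: right shift, 0s enter from the left.
    # Negative length: left shift, 0s enter from the right.
    return (p >> n) if n >= 0 else (p << -n) & mask

def shift_arithmetic(p, q, width=32):
    mask = (1 << width) - 1
    n = shift_length(q, width)
    p &= mask
    if n < 0:
        return (p << -n) & mask
    # Reinterpret P as signed so that >> replicates the MSB.
    signed = p - (1 << width) if p >> (width - 1) else p
    return (signed >> n) & mask
```

Note that a shift length of -32 (or -64) shifts every bit out of the operand, giving a zero result.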
Funnel Shift PT Q Places (Integer): This operation
cannot be performed in mixed-precision mode. The operand T is interpreted as having the same precision as
the input operand P, and the precision of the result is
also the same as the precision of the input operand P. A
two's-complement shift length in the range -64 to +63
(double-precision) or -32 to +31 (single-precision) is
extracted from the LSBs of the operand Q. A triple-width
operand (96-bit or 192-bit) is formed by concatenating
the input operands into the arrangement P-T-P, with the
32-bit or 64-bit result field initially aligned with the T-operand. The triple-width operand is logically right-shifted
by the number of places specified by the shift length. A
negative shift length therefore produces a left-shift.
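A behavioral Python sketch of the P-T-P funnel shift follows. The names are hypothetical, and the shift-length field width (7 or 6 bits) is again an inference from the stated ranges.

```python
def shift_length(q, width):
    # Two's-complement shift length from the LSBs of Q (assumed width).
    bits = 7 if width == 64 else 6
    n = q & ((1 << bits) - 1)
    return (n - (1 << bits)) if n & (1 << (bits - 1)) else n

def funnel_shift(p, t, q, width=32):
    mask = (1 << width) - 1
    p &= mask
    t &= mask
    # Triple-width operand P-T-P: P in the high and low fields, T in
    # the middle, where the result field is aligned.
    triple = (p << (2 * width)) | (t << width) | p
    n = shift_length(q, width)
    shifted = (triple >> n) if n >= 0 else (triple << -n)
    # The result is whatever now sits in the middle field.
    return (shifted >> width) & mask
```

With this arrangement, supplying the same value as both P and T yields a rotate: funnel_shift(x, x, 8) rotates x right by 8 places, and a negative length rotates left.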
Move P (Floating-Point or Integer): The 64-bit
operand P is passed unchanged through the ALU. No
exceptions are detected or signaled.

Primary and Alternate Floating-Point Formats
Two mode register fields, PFF and AFF, specify the primary and alternate floating-point formats used by the
ALU. All floating-point operations except format conversions are performed in the format specified by PFF. For
format conversion operations, either primary floating-point format PFF or alternate floating-point format AFF
is used as follows:

• For conversions between floating-point and integer
  formats (base operation codes Convert T to integer,
  Convert T to floating-point, Scale T to integer by Q,
  Scale T to floating-point by Q), the floating-point
  source or destination format is specified by PFF; for
  the scale operations, the format of operand Q is also
  specified by PFF.

• When converting from the primary floating-point
  format to the alternate floating-point format (base
  operation code Convert T to alternate F.P. format),
  an operand in format PFF is converted to format
  AFF.

• When converting from the alternate floating-point
  format to the primary floating-point format (base
  operation code Convert T to primary F.P. format),
  an operand in format AFF is converted to format
  PFF.

Operation Precision
The ALU performs all operations in double-precision
format. All single-precision input operands are converted to double-precision equivalents by the ALU at
the start of an operation. If the operation is to report a
single-precision result, the ALU converts the double-precision internal result to single-precision at the end of
the operation.
Note that operation flags and exception bits pertain to
the source and destination precisions. If, for example,
an operation produces a single-precision overflowed result, an overflow is indicated regardless of whether that
result overflows the double-precision internal format.
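The point about flags following the destination precision can be illustrated with a small Python sketch for the IEEE case. The function name is hypothetical, and the overflow test is simplified: it compares against the largest finite single-precision value and ignores rounding behavior right at the boundary.

```python
# Largest finite IEEE single-precision value.
FLT_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127

def add_with_single_result(p, t):
    # Python floats are IEEE doubles, standing in for the ALU's
    # double-precision internal format.
    internal = p + t
    # Overflow is judged against the single-precision destination,
    # even though the internal double holds the value comfortably.
    overflow = abs(internal) > FLT_MAX   # simplified boundary test
    return internal, overflow
```

Adding 3e38 to 3e38 gives 6e38, which is an ordinary double but exceeds the single-precision range, so the sketch reports an overflow.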
Operation Flags
For each operation, the ALU produces thirteen flags. Of
these, a maximum of seven are relevant to any given operation. The relevant flags are placed in the flag register
in the manner shown in Table 11. All flags are active
High. In flow-through mode the flag register is made
transparent, and the selected flags are presented directly to the output multiplexer.
The ALU flags are:
C-CARRY: Carry-out bit produced by integer addition,
subtraction, or comparison.
I-INVALID OPERATION: Indicates that the input
operands are unsuitable for the operation performed
(e.g., ∞ x 0).
R-RESERVED OPERAND: Indicates that the operation result is a reserved operand. Reserved operands include signaling or quiet NaNs in IEEE format, and DEC
reserved operands in DEC D or G formats.
S-SIGN: Result sign; Low for a non-negative result,
High for a negative result.
U-UNDERFLOW: Indicates that the operation result
underflowed the destination format.
V-OVERFLOW: Indicates that the operation result
overflowed the destination format.
W-WINNER: Indicates which of two input operands is
reported as the result of the MAX P, T and MIN P, T operations. A logic High indicates that operand T is reported as the result, a logic Low operand P.
X-INEXACT RESULT: Indicates that the operation result had to be rounded to fit the destination format.
Z-ZERO RESULT: Indicates that the operation produced a zero result. Note that the result is exactly zero
only if the Z flag is High and the X flag is Low.
>, =, <, #-GREATER THAN, EQUAL TO, LESS
THAN, UNORDERED: Used to report the result of an
operation with the Compare P, T base operation code.
The Greater Than flag indicates that P > T, the Equal To
flag that P = T, and the Less Than flag that P < T. The
Unordered flag indicates that one or both input operands are reserved operands and cannot be compared.
Note that the Unordered flag cannot arise when comparing IBM floating-point operands or integers. Exactly
one comparison flag will be active per comparison
operation.
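The rule that exactly one comparison flag is active can be modeled for IEEE operands with the Python sketch below. The function name and the dictionary representation are illustrative only; NaNs play the role of reserved operands, as the flag descriptions indicate for IEEE format.

```python
import math

def compare_flags(p, t):
    # A reserved operand (NaN in IEEE format) on either input makes
    # the comparison unordered.
    unordered = math.isnan(p) or math.isnan(t)
    flags = {
        ">": (not unordered) and p > t,
        "=": (not unordered) and p == t,
        "<": (not unordered) and p < t,
        "#": unordered,
    }
    # Exactly one comparison flag is active per comparison.
    assert sum(flags.values()) == 1
    return flags
```

For integer or IBM-format comparisons the "#" entry would never be set, since those formats have no reserved operands.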

Table 11. Organization of Flags
Flag Register

CO

F

F

F

F

F

F

F

L

L
5

L
4

L

L
2

L
1

0

Z
Z
Z

X
X
X

V
V
V

=

>

R
R
R
R
R
R
R
R
R
R
R
R
R

Format

Operation

IN.INo

6

IEEE

F' = P'

IEEE
IEEE
IEEE
IEEE
IEEE
IEEE
IEEE
IEEE
IEEE
IEEE
IEEE
IEEE

F' = P' + T'
F' = P' x Q'
Compare P, T
Maximum P, T
Minimum P, T
Convert T to Integer
Scale T to Integer
F' = (P' x Q') + T'
Round T to Integral Value
Reciprocal Seed of P
Convert T to Alt F.P. Format
Convert T from Alt F.P. Format

00000
00001
00010
00011
00100
00101
00110
00111
01000
01001
01010
01011
01100

S
S
S
S
S
S
S
S
S
S
S
S
S

DECD
DECD
DECD
DECD
DECD
DECD
DECD
DECD
DECD
DECD
DECD
DECD
DECD

F' = P'
F' = P' + T'
F' = P' x Q'
Compare P, T
Maximum P, T
Minimum P, T
Convert T to Integer
Scale T to Integer
F' = (P' x Q') + T'
Round T to Integral Value
Reciprocal Seed of P
Convert T to Alt F.P. Format
Convert T from Alt F.P. Format

00000
00001
00010
00011
00100
00101
00110
00111
01000
01001
01010
01011
01100

S
S

DECG
DECG
DECG
DECG
DECG
DECG
DECG
DECG
DECG
DECG
DECG
DECG
DECG

F' = P'
F' = P' + T'
F' = P' x Q'
Compare P, T
Maximum P, T
Minimum P, T
Convert T to Integer
Scale T to Integer

00000
00001
00010
00011
00100
00101
00110
00111
01000
01001
01010
01011
01100

S
S
S
S
S
S
S
S

IBM
IBM
IBM
IBM
IBM
IBM
IBM
IBM

F'= P'

S

F' = P' + T'
F' = P' x Q'
Compare P, T
Maximum P, T
Minimum P, T
Convert T to Integer
Scale T to Integer

IBM
IBM
IBM
IBM
IBM

F' = (P' x Q') + T'
Round T to Integral Value
Reciprocal Seed of P
Convert T to Alt F.P. Format
Convert T from Alt F.P. Format

00000
00001
00010
00011
00100
00101
00110
00111
01000
01001
01010
01011
01100


F' = (P' x Q') + T'
Round T to Integral Value
Reciprocal Seed of P
Convert T to Alt F.P. Format
Convert T from Alt F.P. Format

S
S
S
S
S
S
S
S
S
S
S

S
S
S
S
S
S
S
S
S
S
S
S
S
S
S
S
S

Z
Z
Z
Z
Z
Z
Z
Z
Z

Z
Z
Z
=

Z
Z
Z

U

X
X
X
X
X
X
>

X
X

Z
Z

X

Z
Z

X
X

Z
Z
Z

X
X
X

=

>

Z
Z
Z

Z
=
Z

Z
Z

U
U
U
U
U
U
<
W
W

U

X
X
X
X
X
X
>

X
X

Z
Z

X

Z

U
U
<
W
W

X
X

Z

Z
Z

U
U
U

U

Z

Z

U
U
U
<
W
W

X
X

Z

Z
Z
Z
Z
Z
Z
Z

3

U
U
U

V
V
V
V
V
V
V
V
V
V

#

V
V
V
V
V
V
V
V
V
V

#

V
V
V
V
V
V
V

R
R
R
R
R
R
R
R
R
R
R
R
R

I
I

I
I
I

R
R
R
R
R
R
R
R

I
I

R
R
R
R
R

I
I
I

V

U
U
<
W
W

V
V

V
V

U

X
X

#

L

U
U

V
V
V
V
V

I
R
R

I

Table 11. Organization of Flags (continued)

Flag Register
F

F

F

F

F

L

F
L

L

L

4

3

L
2

L
1

0

<

V
V
V
V

F
CO
Format

Operation

IN4-1No

6

5

Integer
Integer
Integer
Integer
Integer
Integer
Integer
Integer
Integer
Integer
Integer
Integer
Integer
Integer

F = P
F = P + T
F = P x Q
Compare P, T
Maximum P, T
Minimum P, T
Convert T to Floating-Point
Scale T to Floating-Point
F = P OR T
F = P AND T
F = P XOR T
Logical Shift P by Q Places
Arithmetic Shift P by Q Places
Funnel Shift P T by Q Places

00000
00001
00010
00011
00100
00101
00110
00111
10000
10001
10010
10011
10100
10101
11000

S
S
S
S
S
S
S
S
S
S
S
S
S
S

Z
Z
Z

MOVE P

..

Z
Z
Z
Z
Z
Z
Z
Z
Z
Z

>

L

C
C

W
W
X
X

U

V

R

V

S

Note: Unused flags assume the Low state.

Updating the Status Register
The status register exception bits are updated at the
conclusion of each operation in flow-through mode, and
at the start of each operation in pipeline mode. An exception bit is updated only if the operation reports that
exception with a flag. For example, an IEEE floating-point addition operation produces an overflow flag and
would therefore update the overflow exception bit; an
IEEE floating-point comparison operation, on the other
hand, does not produce an overflow flag and would
therefore leave the overflow exception bit unchanged.
The mode register exception mask bits do not affect the
updating of the status register exception bits; masked
exceptions still appear in the status register. However,
a masked exception will not set the exception status
bit (ES).
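The update rule can be sketched in Python as below. This is a simplified model under stated assumptions: the function name and dictionary representation are hypothetical, and an "update" is modeled as overwriting the exception bit with the reported flag value, since the text does not say here whether the bits are sticky.

```python
# The six exception bits kept in the status register.
EXCEPTIONS = ("I", "R", "V", "U", "X", "Z")

def update_status(status, reported_flags, masked):
    """status: dict of current bits (including "ES").
    reported_flags: flags the operation reports, e.g. {"V": True}.
    masked: set of exception names whose mask bit is active."""
    new = dict(status)
    for name, value in reported_flags.items():
        if name in EXCEPTIONS:
            new[name] = value          # assumption: overwrite, not OR
    # Masked exceptions still appear above, but only an unmasked
    # exception sets the exception status bit.
    if any(reported_flags.get(n) and n not in masked for n in EXCEPTIONS):
        new["ES"] = True
    return new
```

An operation that reports no overflow flag at all, such as an IEEE comparison, simply has no "V" entry in reported_flags, so the overflow bit is left unchanged.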

Operation Sequencing
The Am29027 can be configured for either pipelined
or flow-through (unpipelined) operation. Flow-through
mode is normally selected for performing scalar operations; pipeline mode provides high throughput for vector
operations. The manner in which operations are sequenced depends on the mode currently invoked.
Operation In Flow-Through Mode
Flow-through mode is invoked by setting mode register
bit PL (Pipeline Mode Select) to logic Low.
Programmer's Model
A programmer's model of the Am29027 in flow-through
mode is shown in Figure 9. Note that Output Register F
and the flag register are made transparent in this mode.
Performing Operations
Flow-through mode operations are performed by:

• Storing instructions and/or operands in the
  Am29027 and starting the operation
• Loading the result

Figure 15. Programmer's Model for Flow-Through Mode

Storing instructions and operands can be done in any of
three ways:

• Writing the instruction only, and starting the
  operation: This is appropriate when all necessary
  operands are already present in the Am29027,
  as is sometimes the case when using on-board
  constants or the results of previous operations
  stored in the register file.

• Writing the operands only, and starting the
  operation: This is appropriate when the desired
  instruction is already present in the Am29027, as is
  the case when performing the second of two
  identical operations.

• Writing the instruction and operands, and
  starting the operation: This is appropriate
  whenever the next operation requires both a new
  instruction and new operands.

Operands and instructions are written using the write
operand R, write operand S, write operands R, S, and
write instruction transaction requests. Operands and
instructions can be written to the Am29027 in any order,
with the operation start bit (DREQT0 High) accompanying the last of the transaction requests.
Loading an operation result is performed using the read
result MSBs, read result LSBs, and read flags transaction requests. The specific request used depends on
whether the result of an operation is a flag or flags (as is
the case with comparison operations) or data (as is the
case with most other operations). In cases where the
operation result is stored in the register file, the user
may elect not to read the result but to proceed with the
next operation.

Operation Timing
The Am29027 will usually start a flow-through operation
during the first cycle following the receipt of a write
operand R, write operand S, write operands R, S, or
write instruction transaction request having signal
DREQT0 set High.
Operation execution begins with the transfer of the contents of the R-Temp, S-Temp, and I-Temp registers to
Register R, Register S, and the instruction register, respectively; only those temporary registers written to as
part of the operation specification will be transferred.
The operand or instruction accompanying the transaction request that starts the operation (that is, the transaction request for which signal DREQT0 is High) is written directly to the appropriate working register, that is,
Register R, Register S, or the instruction register.
Once started, an operation will proceed for the number
of cycles specified by mode register fields MATC,
MVTC, and PLTC; MATC specifies the number of cycles
for base operation code (P x Q) + T, MVTC the number
of cycles for base operation code MOVE P, and PLTC
the number of cycles for all other base operation codes.
At the end of the last operation cycle, the status register
exception bits and exception status bit will be updated
and, optionally, the operation result will be written to the
register file and precision register.
There are two conditions for which the Am29027 will not
start an operation immediately. The first condition is
when an operation is already in progress. In this case
the new operation is kept pending in the I-Temp,
R-Temp, and S-Temp registers until the current operation is completed, at which time the new operation begins. The second condition is when a previous operation
creates an unmasked exception in Halt On Error mode
(mode register bit HE High). In this case the new operation is kept in the I-Temp, R-Temp, and S-Temp registers until the exception is cleared, at which time the new
operation begins.

Timing for typical accelerator operations in the flow-through mode is illustrated in Appendix D.
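The choice of timer field for a given operation can be expressed as a small Python lookup. The function name is hypothetical; the base operation codes used (01000 with IF = 0 for floating-point (P x Q) + T, and 11000 for MOVE P) are those listed in the operation tables.

```python
def timer_field(if_bit, co):
    """Return the mode register field that sets the cycle count for
    an operation with integer/float select if_bit and 5-bit code co."""
    if co == 0b11000:
        return "MVTC"                  # MOVE P
    if if_bit == 0 and co == 0b01000:
        return "MATC"                  # floating-point (P x Q) + T
    return "PLTC"                      # all other base operation codes
```

The actual cycle counts are whatever values the user has programmed into those fields; the sketch only selects which field applies.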

Availability of Operation Results
In order to directly read the result of an operation, the
operation specification should be followed by the appropriate read transaction request. Should the Am29000
attempt to read an operation result before the operation
is completed, the Am29027 will withhold acknowledging
the transaction request by holding signals DRDY and
DERR inactive until the operation has been completed.
All read transaction requests, including save state, will
be held off in this manner.

Overlapping Operations
Due to the presence of the R-Temp, S-Temp, and
I-Temp registers, it is possible to partially or completely
specify a new operation while the previously specified
operation is being performed. Execution of the new
operation will begin immediately after the previous operation is completed. Execution begins with the transfer
of the contents of the R-Temp, S-Temp, and I-Temp registers to the corresponding working registers; only those
temporary registers that have been written to as part of
the operation specification are transferred.
It is important to note that, once the new operation is
completely specified, any attempt to read a result will be
held off until the new operation is completed. This
means that it is not possible to directly read the result of
an operation if another operation is completely specified
before the results of the first operation are read. If, for
example, specification of operation 2.0 + 3.0 is immediately followed by specification of operation 4.0 x 5.0,
subsequent read result LSBs and read result MSBs
transaction requests will return value 20.0, the result of
the second operation. Similarly, a read flags transaction
request will return flags for the second operation, and a
read status transaction request will return status reflecting the completion of the second operation. This delayed read feature is provided to eliminate ambiguity in
the correspondence between operations and results.
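The delayed-read behavior in the 2.0 + 3.0 and 4.0 x 5.0 example can be mimicked with a toy Python model. It is deliberately simplified: the class and method names are hypothetical, a list stands in for the single set of temporary registers, and no cycle timing is modeled.

```python
import operator

class FlowThroughModel:
    def __init__(self):
        self.result = None
        self.specified = []   # fully specified operations not yet run

    def specify(self, fn, r, s):
        # Completely specify an operation (instruction plus operands).
        self.specified.append((fn, r, s))

    def read_result(self):
        # A read is held off until every fully specified operation
        # has completed, so it returns the latest operation's result.
        while self.specified:
            fn, r, s = self.specified.pop(0)
            self.result = fn(r, s)
        return self.result

model = FlowThroughModel()
model.specify(operator.add, 2.0, 3.0)
model.specify(operator.mul, 4.0, 5.0)
value = model.read_result()   # result of the second operation
```

To read the sum in this model, the read would have to be issued before the multiplication is fully specified, which matches the partial-specification rule described in the text.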
Should two operations be overlapped, and should the
first operation have as its target a register file location,
the second operation can be completely specified before the first operation is completed. If the first operation
produces a result that is to be read directly by the
Am29000, the second operation can be partially specified before the result of the first operation is read. A
partial operation specification is one that includes all but
the last operand or instruction.
Timing for typical overlapped operations in flow-through
mode is illustrated in Appendix D.

Saving and Restoring State
In flow-through mode, the complete state of the
Am29027 can be saved and restored with the save state
transaction request. The first save state transaction
request will return the contents of the instruction register; subsequent requests will return the contents of
Registers I-Temp, R, S, R-Temp, S-Temp, the status
register, the precision register, register file locations
RF7-RF0, and the mode register. The user has the option of saving only part of the state by issuing only the
number of save state transaction requests needed
to save registers of interest. When issuing a series of
save state transaction requests, data is returned in the
following order:

Request    Data Returned

1          Instruction
2          I-Temp
3          R LSBs
4          R MSBs
5          S LSBs
6          S MSBs
7          R-Temp LSBs
8          R-Temp MSBs
9          S-Temp LSBs
10         S-Temp MSBs
11         Status
12         Precision
13         RF0 LSBs
14         RF0 MSBs
...        ...
27         RF7 LSBs
28         RF7 MSBs
29         Mode LSBs
30         Mode MSBs
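The request ordering can be generated programmatically, which is convenient when writing a save routine that stops early. In the Python sketch below the function name is hypothetical, and the ascending order assumed for register file locations between RF0 and RF7 is an inference from the first and last entries shown.

```python
def save_state_order():
    """List what each successive save state request returns
    (index 0 corresponds to request 1)."""
    order = ["Instruction", "I-Temp"]
    for reg in ("R", "S", "R-Temp", "S-Temp"):
        order += [f"{reg} LSBs", f"{reg} MSBs"]
    order += ["Status", "Precision"]
    for n in range(8):                 # assumed ascending RF0..RF7
        order += [f"RF{n} LSBs", f"RF{n} MSBs"]
    order += ["Mode LSBs", "Mode MSBs"]
    return order

def requests_needed(last_item):
    # Number of consecutive save state requests needed to reach an item.
    return save_state_order().index(last_item) + 1
```

For example, requests_needed("Status") gives the number of requests to issue when only the working registers, temporary registers, and status are of interest.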

Sequencing for the save state transaction request is
reinitialized when the Am29000 issues any transaction
request other than save state. If, for example, the
Am29000 issues a write operand R transaction request
after a series of save state requests, the next save state
request will return the contents of the instruction
register.
It should be noted that the process of saving state alters
the contents of the instruction register and Registers R
and S.
Error reporting via signal DERR is suppressed for the
save state transaction request.
Accelerator state is restored using transaction requests
in concert with the MOVE P base operation code. Before
restoring state, all status register bits should be set to
logic Low using the write status transaction request to
prevent the possibility of an unmasked exception bit
inhibiting the restore sequence. The accelerator operand and instruction registers can then be restored,
followed by restoration of the status register using the
write status transaction request, with signal DREQT0 asserted to indicate the end of the restore sequence.
When state restoration is complete, the Am29027 will
retime the operation specified by current instruction
register contents.

Am29027
Accelerator state is restored in the following order:
Register to
be restored

Procedure for restoring

Status

Set all bits in the status register to a logic
low using the write status transaction
request.

Mode

Write using
request.

RFo

Write "Move R to RFo" instruction using
write instruction transaction request.

write

mode

transaction

Write RFo value to Register R using write
operand R transaction request, start operation.

Write "Move R to RF7" instruction using
write instruction transaction request.
Write RF7 value to Register R using
write operand R transaction request, start
operation.
Precision

Guarantee that "Move R to RF7" operation
has been completed by performing a read
result MSBs transaction request.
Write precisions using write register file
precisions transaction request.

R,S,
Instruction

Write R value to Register R-Temp
using the write operand R transaction
request.
Write S value to Register S-Temp using the
write operand S transaction request.

tions of state restoration are the initial clearing of the
status register, and restoration of the status register with
signal DREOTo asserted to indicate completion of the
restore sequence.

Error Recovery
Six exception bits-invalid operation, reserved operand, overflow, underflow, inexact result, and zero result-are maintained in the status register; these bits
are updated upon completion of an operation. Exception
bits can be masked individually by programming the appropriate bits in the mode register; if the corresponding
mask bit is inactive (logic Low), the exception bit is said
to be unmasked and contributes to error reporting. The
Am29027 provides three mechanisms with which unmasked exceptions can be handled.
Reporting Errors Upon Read
If an unmasked status register exception bit is set, the
Am29027 will signal an error by asserting signal DERR
when the Am29000 performs a read result LSBs, read
result MSBs, read flags, or read status transaction request. Error reporting can be suppressed by issuing any
of these transaction requests with signal DREOTo
asserted.
Halt On Error Mode
Should the application require, the Am29027 can be
configured to halt operation upon detection of an unmasked exception; this mode is invoked by setting
mode register bit HE (Halt On Error) High. Once configured this way, the Am29027 will respond to an unmasked exception as follows:
• Signal CDA will become inactive upon completion of the operation producing the unmasked exception.

• Should the operation producing the unmasked exception specify that the operation result be stored on-chip, that is, in the register file, the result will not be written to its destination.

• A pending operation will not be started; the operands and/or instruction for that operation will remain in the appropriate temporary registers.

• If the Am29000 attempts to start a new operation during the last cycle of the operation that produces the unmasked exception by issuing a write operand R, write operand S, write operands R, S, or write instruction transaction request with DREQT0 asserted, and if no other operation is pending, the operand or instruction will be written to the appropriate temporary register rather than to the R, S, or instruction register.

• Once CDA is deasserted, the Am29027 will respond to the write operand R, write operand S, write operands R, S, and write instruction transaction requests by asserting signal DERR one cycle after the request is issued; the contents of the target register or registers will remain unchanged.

Write instruction value to Register I-Temp
using write instruction transaction request.
Transfer contents of Registers R-Temp, S-Temp, and I-Temp to Register R, Register
S, and the instruction register, respectively,
using the advance temp registers transaction request.
R-Temp,
S-Temp,
I-Temp

Write R-Temp value to Register R-Temp
using the write operand R transaction
request.
Write S-Temp value to Register S-Temp
using the write operand S transaction
request.
Write I-Temp value to Register I-Temp using the write instruction transaction
request.

Status

Write status to status register using the
write status transaction request, with signal
DREQTo asserted to indicate that the restore sequence is complete.

The user may elect to restore only those registers relevant to a particular application by omitting parts of the
state restoration sequence. The only mandatory por-


29K Family CMOS Devices
Through these measures, the Am29027 will retain the
input operands and instructions for the operation causing the exception. The input operands will be retained in
the R register, S register, or register file locations,
and the instructions will be retained in the instruction
register. Additionally, the R-Temp, S-Temp, and I-Temp
registers may contain the operands and instructions
for a partially or fully specified pending operation. The
Am29000 can recover these operands and instructions
with the save state transaction request; this information can then be given to an error-handling routine for
resolution.
The error halt condition is removed by clearing the
status register exception status (ES) bit and the exception bit or bits responsible for producing the halt.
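The flow-through Halt On Error behaviour can be summarised with a small behavioural model. It is a sketch under stated assumptions: the class, its method names and the dictionary keys are hypothetical, and the idea that CDA becomes active again as soon as the halt is cleared is an assumption of the model rather than something the text states.

```python
class HaltOnErrorModel:
    """Toy model of flow-through Halt On Error handling (HE = High).

    Only the behaviours listed in the text are modelled; register names
    are plain dictionary keys, not hardware addresses.
    """

    def __init__(self):
        self.cda = True      # CDA active: write requests are accepted
        self.es = False      # status register exception status (ES) bit
        self.working = {}    # R, S and instruction registers
        self.regfile = {}    # on-chip register file

    def complete_operation(self, result, dest=None, unmasked_exception=False):
        """Finish the current operation, halting on an unmasked exception."""
        if unmasked_exception:
            self.cda = False   # CDA becomes inactive on completion
            self.es = True
            return             # result is not written to its destination
        if dest is not None:
            self.regfile[dest] = result

    def write(self, register, value):
        """Issue a write request; return True if DERR would be asserted."""
        if not self.cda:
            return True        # halted: DERR, target register unchanged
        self.working[register] = value
        return False

    def clear_halt(self):
        """Clear ES and the offending exception bits (assumed to re-enable CDA)."""
        self.es = False
        self.cda = True
```

In this model a write issued after the halt returns True (DERR) and leaves the operand registers untouched, which is what allows an error handler to recover the original operands.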

Reporting Errors via EXCP
Signal EXCP will go active Low in the presence of an unmasked exception. This signal can be connected to an
Am29000 trap or exception input signal, and is enabled
or disabled independent of other exception handling
mechanisms with mode register bit EX.

Writing to the Mode, Status, and
Precision Registers
Unlike the R, S, and instruction registers, the mode,
status, and precision registers are not preceded by temporary registers. Accordingly, writing to these registers
may produce undesirable or unpredictable side effects if
an accelerator operation is in progress at the time. To
avoid such side effects, a write to any of these registers
should be preceded by a read transaction request,
which will guarantee that any current or pending accelerator operations will have been completed before the
write transaction request is issued.

Writing to the Register File
The numerical result of any operation may be written to
the register file by specifying the desired destination in
instruction field RFS and setting instruction bit RF High.
The result can then be used as an input operand for subsequent operations.

It is permissible for an operation result to be placed in a
register file location that previously contained an input
operand for that operation. In such a case, however, it is
not permissible for the Am29000 to directly read the result, status, or flags for that operation, as the writing of
the result modifies the operation performed by the ALU.

Determining Timer Counts
To provide optimum accelerator performance over a
range of possible system clock frequencies, the timing
of Am29027 operations is programmable. Three mode
register fields, pipeline timer count (PLTC), timer count
for the Multiply-Accumulate Operation (MATC), and
timer count for the MOVE P Operation (MVTC), must
be programmed according to system clock frequency
and accelerator speed.

PLTC
PLTC specifies the number of cycles allotted to operations other than those using base operation codes
F' = (P' × Q') + T' or MOVE P. This count can assume values
between 3 and 15, inclusive, and must be given a value
that satisfies the relationship:

[8] ≤ PLTC × [1],

where [8] = Operation time, flow-through mode, all other base operation codes
and [1] = CLK period,

as described in the Switching Characteristics table.

MATC
MATC specifies the number of cycles allotted to operations that use base operation code F' = (P' × Q') + T'.
This count can assume values between 3 and 15, inclusive, and must be given a value that satisfies the
relationship:

[6] ≤ MATC × [1],

where [6] = Operation time, flow-through mode, F' = (P' × Q') + T'
and [1] = CLK period,

as described in the Switching Characteristics table.

MVTC
MVTC specifies the number of cycles allotted to operations that use the MOVE P base operation code. This
count can assume values between 3 and 15, inclusive,
and must be given a value that satisfies the relationship:

[7] ≤ MVTC × [1],

where [7] = Operation time, flow-through mode, MOVE P
and [1] = CLK period,

as described in the Switching Characteristics table.

Advancing DRDY
Normally, an operation result produced by the Am29027
in flow-through mode is read by the Am29000 no sooner
than the clock cycle following operation completion. Depending on the system clock frequency used, it may be
advantageous to overlap the reading of the result with
the last cycle of the operation. Consider, for example, a
system with a 45-ns clock cycle and an Am29027 that
performs an operation in 240 ns. The pipeline timer
count PLTC will have to be set to a minimum of 6 for
such a system, and the Am29000 will read a result
no sooner than during the seventh clock cycle after the
start of an operation.
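The arithmetic behind this example can be sketched in a few lines. The function names are illustrative, and the default limits of 3 and 15 are the flow-through field limits quoted in this section; the `drdy_advanced` flag stands for the DA mode bit discussed in this section.

```python
import math


def min_timer_count(operation_time_ns, clk_period_ns, lowest=3, highest=15):
    """Smallest count N such that operation_time <= N * clk_period.

    The count is clamped to the lower field limit and rejected if it
    exceeds the upper one.
    """
    count = max(lowest, math.ceil(operation_time_ns / clk_period_ns))
    if count > highest:
        raise ValueError("operation time too long for this clock period")
    return count


def first_read_cycle(count, drdy_advanced=False):
    """Clock cycle, counted from operation start, of the earliest result read."""
    return count if drdy_advanced else count + 1
```

For the 240-ns operation and 45-ns clock used in the text, `min_timer_count(240, 45)` gives 6, and `first_read_cycle(6)` gives the seventh cycle.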
Mode register bit DA, DRDY Advance, can be used to
advance transaction status signals DRDY and DERR by
a full clock cycle, thus allowing the Am29000 to read
data one clock cycle earlier than would otherwise be
possible. For the example given above, PLTC remains at
6, but the Am29000 can read data during the sixth clock
cycle after the operation starts rather than the seventh,
thus saving a clock cycle.

In order to advance DRDY and DERR, the following system timing conditions must be met:

[19] ≤ (MATC × [1]) - [x9B] - [gate]
[20] ≤ (MVTC × [1]) - [x9B] - [gate]
[21] ≤ (PLTC × [1]) - [x9B] - [gate]

where
[19] = Data operation-start-to-output valid delay, F' = (P' × Q') + T'
[20] = Data operation-start-to-output valid delay, MOVE P
[21] = Data operation-start-to-output valid delay, all other operations
and
[1] = CLK period

as described in the Switching Characteristics table
and
[x9B] = Synchronous input setup time

as described in the Switching Characteristics table of
the Am29000 Preliminary Data Sheet (order #09075).
The term [gate] represents the delay of the external
gate through which the DERR signal passes.
Timing for a typical accelerator operation with DRDY
advanced is illustrated in Appendix D.

Operation In Pipeline Mode
Pipeline mode is invoked by setting mode register bit PL
(Pipeline Mode Select) to logic High.

Programmer's Model
A programmer's model of the Am29027 in pipeline
mode is shown in Figure 10. Note that Output Register F
and the flag register are non-transparent in this mode,
thus permitting the overlap of the current operation(s)
with the reading of the result for a previous operation.

Pipeline Delays
When placed in pipeline mode, the ALU is divided into
three pipeline stages for multiply-accumulate operations, and into two stages for all other operations. The
ALU configuration for pipeline mode is shown in
Figure 11. Note that for multiplication-accumulation operations, multiplicand P and multiplier Q enter the first
pipeline stage, while addend T enters the second pipeline stage. As a consequence, the source for operands
P and Q must be specified in the corresponding multiply-accumulate instruction, while the source for operand T
must be specified in the following instruction.

Pipeline Advance
The ALU pipeline is advanced whenever a new operation begins. One consequence of this advance criterion
is that data does not fall through the pipe but instead is
"pushed" through. If, for example, an addition is performed in pipeline mode, the pipe must be advanced
twice (by starting two operations) before the result of the
addition appears in Register F, the flag register, the
status register, and, optionally, a register file location.

Performing Operations
Pipeline mode operations are performed by:

• Storing instructions and/or operands in the Am29027, and starting the operation

• Loading the result of a previous operation

Storing instructions and operands can be done in any of
three ways:

• Writing the instructions only, and starting the operation: This is appropriate when all necessary operands are already present in the Am29027, as is sometimes the case when using on-board constants or the results of previous operations stored in the register file.

• Writing the operands only, and starting the operation: This is appropriate when the desired instructions are already present in the Am29027, as is the case when performing the second of two identical operations.

• Writing the instructions and operands, and starting the operation: This is appropriate whenever the next operation requires both new instructions and new operands.

Operands and instructions are written using the write
operand R, write operand S, write operands R, S, and
write instruction transaction requests. Operands and
instructions can be written to the Am29027 in any order,
with the operation start bit (DREQT0 High) accompanying the last of the transaction requests.
Loading the result of a previous operation is performed
using the read result MSBs, read result LSBs, and read
flags transaction requests. The specific request used
depends on whether the result is a flag or flags (as is the
case with comparison operations) or data (as is the case
with most other operations). In cases where the
operation result is stored in the register file, the user
may elect not to read the result, but to proceed with the
next operation.

Operation Timing
The Am29027 will usually start a pipelined operation
during the first cycle following the receipt of a write operand R, write operand S, write operands R, S, or write
instruction transaction request having signal DREQT0
set High.
Operation execution begins with the transfer of the contents of the R-Temp, S-Temp, and I-Temp registers to
Register R, Register S, and the instruction register, respectively; data is transferred only from those temporary registers written to as part of the operation specification. The operand or instruction accompanying the
transaction request that starts the operation (that is, the
transaction request for which signal DREQT0 is High) is
written directly to the appropriate working register, that
is, Register R, Register S, or the instruction register. At
the start of the operation, the output of the last ALU pipeline stage is transferred to Register F, the flag register,
and, optionally, to a register file location; the status
register exception status and exception bits are
updated. The outputs of all other ALU pipeline stages
are written to their respective pipeline registers.
Once started, an operation will proceed for the number
of cycles specified by mode register field PLTC, which
denotes the number of cycles needed for data to traverse a single pipeline stage.

Figure 16. Programmer's Model for Pipeline Mode


There are two conditions for which the Am29027 will not
start an operation immediately. The first condition is
when an operation has been started recently and has
not yet had time to settle at the output of the first pipeline
stage. In this case the new operation is kept pending in
the I-Temp, R-Temp, and S-Temp registers until the
previous operation completes the first pipeline stage.
The second condition is when a previous operation creates an unmasked exception in Halt On Error mode
(mode register bit HE High). In this case the new operation is kept in the I-Temp, R-Temp, and S-Temp registers until the exception is cleared, at which time the new
operation will begin.
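The "pushed through" behaviour of the pipeline, together with the Halt On Error hold-off just described, can be illustrated with a small model. It is a sketch only: the class and its attribute names are hypothetical, operations are represented by labels rather than computed values, and the first hold-off condition (waiting for the previous operation to settle in the first stage) is not modelled.

```python
class PushPipeline:
    """Toy model of an ALU pipeline that advances only when an operation starts."""

    def __init__(self, stages=2):
        self.stages = [None] * stages   # stages[0] is the first ALU stage
        self.register_f = None          # non-transparent output register
        self.pending = None             # operation held in the temp registers
        self.halted = False             # unmasked exception with HE = High

    def start(self, operation):
        """Start an operation if possible; otherwise keep it pending."""
        if self.halted:
            self.pending = operation
            return False
        # The last stage's output moves to Register F, the other stages
        # shift along, and the new operation enters the first stage.
        self.register_f = self.stages[-1]
        self.stages = [operation] + self.stages[:-1]
        return True

    def clear_exception(self):
        """Leave the halt and start the pending operation, if there is one."""
        self.halted = False
        if self.pending is not None:
            operation, self.pending = self.pending, None
            self.start(operation)
```

Starting "ADD1" and then two further operations on a two-stage pipe places "ADD1" in `register_f` only when the second of those later operations starts, matching the rule that the pipe must be advanced twice.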

Figure 17. ALU Configuration for Pipeline Mode (a. Multiply-Accumulate; b. Other Operations)
Timing for typical accelerator operations in the pipeline
mode is illustrated in Appendix D.

Because Register F, the flag register, and the status
register are updated at the beginning of an operation,
these registers can be read at any time after an operation begins.

Overlapping Operations
Due to the presence of the R-Temp, S-Temp, and I-Temp registers, it is possible to partially or completely
specify a new operation while the previously specified
operation is propagating through the first ALU pipeline
stage. Execution of the new operation will begin immediately after the previous operation completes the first
pipeline stage. Execution begins with the transfer of the
contents of the R-Temp, S-Temp, and I-Temp registers
to the corresponding working registers; only those
temporary registers that have been written to as part of
operation specification are transferred.

Availability of Operation Results
It is important to note that, once the new operation is
completely specified, any attempt to read a result will be
held off until the new operation begins; this means that it
is not possible to read the result that is placed in the output registers when the first operation begins. If, for
example, result X is placed in Register F when an operation starts and if another operation is completely
specified thereafter, subsequent read result MSBs and
read result LSBs transaction requests will return not X,
but the result placed in the F register when the second
operation begins; the read flags and read status transaction requests will behave in like manner. This delayed
read feature is provided to eliminate ambiguity in the
correspondence between operations and results.

Saving and Restoring State
Due to the presence of ALU pipeline registers, it is not
possible to save the complete state of the Am29027 in
pipeline mode. Pipeline operations may therefore be interrupted only under special circumstances, such as:

• If the interrupting routine does not use the floating-point accelerator

or

• If the current series of pipelined operations has been completed, and any operands needed for future operations have already been transferred to the Am29000

The save state transaction request is disabled in pipeline mode. It is permissible to switch to flow-through
mode and use the save state transaction request, but
doing so does not permit the saving of Register F, the
flag register, or the ALU pipeline registers.

The error halt condition is removed by clearing the
status register exception status (ES) bit and the exception bit or bits responsible for producing the halt.

Error Recovery
As for flow-through mode, the Am29027 provides three
mechanisms with which unmasked exceptions can be
handled.

Reporting Errors Upon Read
If an unmasked status register exception bit is set, the
Am29027 will signal an error by asserting signal DERR
when the Am29000 performs a read result LSBs, read
result MSBs, read flags, or read status transaction request. Error reporting can be suppressed by issuing any
of these transaction requests with signal DREQT0
asserted.

Reporting Errors via EXCP
Same as for the flow-through mode.
Pipeline Invalidation
There are several situations for which the ALU pipeline
stages may contain invalid data. The Am29027 recognizes these situations and invalidates results automatically; results marked as invalid will not update the
status register, register file locations RF7-RF0, or the
precision register. Results are invalidated for the following conditions:

• The Am29027 is switched from flow-through mode to pipeline mode. Any data present in the ALU at the time of the switch is marked as invalid. This invalidation is illustrated in Figure 12a.

• The Am29027 performs a multiply-accumulate operation that is preceded by an operation other than multiply-accumulate. The multiply-accumulate operation result and the result that precedes it will be separated by a spurious result, due to the insertion of an additional pipeline stage for the multiply-accumulate operation. The spurious result is marked invalid. This invalidation is illustrated in Figure 12b.

Halt On Error Mode
Should the application require it, the Am29027 can be
configured to halt operation upon detection of an unmasked exception; this mode is invoked by setting
mode register bit HE (Halt On Error) High. Once configured this way, the Am29027 will respond to an unmasked exception as follows:
• Signal CDA will become inactive when the results of the operation producing the unmasked exception are transferred from the last pipeline stage to Register F, the flag register, and the status register.

• Once CDA is deasserted, the Am29027 will respond to the write operand R, write operand S, write operands R, S, and write instruction transaction requests by asserting signal DERR one cycle after the request is issued; the contents of the target register or registers will remain unchanged.

Through these measures, the Am29027 will retain the
input operands and instructions for the most recently
started operation. The input operands for that operation
will be retained in the R register, S register, or register
file locations, and the instructions will be retained in the
instruction register. Additionally, the R-Temp, S-Temp,
and I-Temp registers may contain the operands and instructions for a partially or fully specified pending operation. Note that the input operands and instruction
words for the operation causing the exception, as well
as for operations currently in the ALU pipeline, will not
be available. At the user's option, this information can
be stored in a circular queue in the Am29000 register
file so that full recovery from a pipelined exception is
possible.
The Am29000 can read the contents of Am29027 operand and instruction registers by invoking flow-through
mode and using the save state transaction request.
Note that the contents of Register F, the flag register,
and the ALU pipeline registers will be lost. This information can then be given to an error-handling routine for
resolution.


The pipeline may also be invalidated manually by issuing a write status transaction request with signal
DREQT0 asserted High; this request invalidates all current pipeline contents. Pipeline invalidation does not apply to operation in flow-through mode.

Writing to the Mode, Status, and Precision
Registers
Unlike the R, S, and instruction registers, the mode,
status, and precision registers are not preceded by temporary registers. Accordingly, writing to these registers
may produce undesirable or unpredictable side effects if
an accelerator operation is pending at the time. To avoid
such side effects, a write to any of these registers should
be preceded by a read transaction request, which will
guarantee that any pending accelerator operation will
have started before the write transaction request is
issued.
The mode register outputs are not pipelined in the ALU,
that is, all pipeline stages receive mode information
directly from the mode register. Accordingly, writing to
the mode register may produce undesirable or unpredictable side effects for operations currently in the ALU
pipeline. To avoid such side effects, a write to the mode
register should be performed only if the contents of the
ALU pipeline are a "don't care," that is, only after the last
operation result of interest has been written to Register
F, the flag register, or a register file location. If, for example, the last in a series of addition operations has
just been started, the mode register should not be written until the pipeline is advanced twice, placing that
operation's results in the F register, flag register, and,
optionally, a register file location.

Writing to the Register File
The numerical result of any operation may be written to
the register file by specifying the desired destination in
instruction field RFS and setting instruction bit RF High.
The result may then be used as an input operand in subsequent operations. Because all ALU operations incur
one or more pipeline delays, the result of an operation
will not be available for use by the very next operation.

It is permissible for an operation result to be placed in a
register file location that previously contained an input
operand for that operation.

Figure 18. Pipeline Invalidation Timing
a. Pipeline invalidation timing for switch from flow-through to pipeline mode. Operations shown incur two pipeline delays in pipeline mode [all base operations except F' = (P' × Q') + T'].
b. Pipeline invalidation timing for multiply-accumulate operations in pipeline mode.
Notes: ADDx = addition operation; MPYx = multiplication operation; MACx = multiply-accumulate operation; (DMAC) = dummy multiply-accumulate operation.


Multiplication-Accumulation Operations
The pipeline structure of the Am29027 permits the
evaluation of sum-of-products expressions in a canonically efficient manner by interleaving the evaluation of
two sum-of-product expressions. Operation sequencing
is described in Figure 13.
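The interleaving idea can be sketched in software. The operand order below follows the Register R sequence shown in the figure (a11, a21, a12, a22, and so on); the function names are illustrative, and the code models only the ordering of multiply-accumulate steps, not pipeline timing or the terminating dummy operation.

```python
def interleaved_mac_schedule(n_rows, n_cols):
    """Yield (row, col) index pairs, interleaving the rows two at a time."""
    for first in range(0, n_rows, 2):
        pair = range(first, min(first + 2, n_rows))
        for col in range(n_cols):
            for row in pair:
                yield row, col


def matrix_vector_product(a, b):
    """Compute c = A x b by walking the interleaved schedule."""
    c = [0.0] * len(a)
    for row, col in interleaved_mac_schedule(len(a), len(b)):
        c[row] += a[row][col] * b[col]   # one multiply-accumulate per step
    return c
```

Alternating between two running sums means that consecutive multiply-accumulate steps never depend on each other's result, which is presumably why two expressions are evaluated together in a pipeline where a result is not available to the very next operation.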

Determining Timer Counts
As for flow-through mode, the timing of operations in
pipeline mode is programmable to accommodate
variations in system timing. A single mode register
field, pipeline timer count (PLTC), specifies the timing
of all pipelined operations; fields MATC and MVTC are
not used.
PLTC specifies the number of cycles allotted for data to
traverse a single pipeline stage. This count can assume
values between 2 and 15, inclusive, and must be given a
value that satisfies the relationship:

[9] ≤ PLTC × [1],

where [9] = Operation time, pipeline mode, all operations
and [1] = CLK period,

as described in the Switching Characteristics table.

Advancing DRDY
Because the Am29027 F register and flag register are
non-transparent in pipeline mode, it is not possible (nor
advantageous) to advance DRDY. Accordingly, mode
register bit M44 has no effect in pipeline mode.

Master/Slave Operation
Two Am29027 accelerators can be tied together in master/slave configuration, with the slave checking the results produced by the master. All input and output signals of the slave, with the exception of SLAVE and
MSERR, are connected directly to the corresponding
signals of the master. The master is selected by asserting signal SLAVE Low, the slave by asserting signal
SLAVE High.
The slave accelerator, by comparing its outputs to the
outputs of the master accelerator, performs a comprehensive check of master accelerator logic. In addition, if
the slave accelerator is connected at the proper position
on the Am29000 buses, it may detect open circuits and
other faults in the electrical path between the master accelerator and the Am29000.
Note that the master accelerator also performs a
comparison between its outputs and its own internally
generated results, and is therefore able to detect faults
in its output drivers, which it reports with its MSERR
signal.

Initialization and Reset
The accelerator is in an unknown state when power is
first applied and must be initialized before processing
can begin. This is accomplished by asserting the RESET
signal, which initializes accelerator state as follows:

• All bits in the status register are cleared

• The accelerator is placed in flow-through mode

• Signal CDA is active; signals DRDY and DERR are inactive

• All internal circuitry controlling operation timing is initialized


The RESET signal does not initialize the operand and instruction registers and may corrupt existing register
contents. It is the responsibility of the user to initialize
these registers, if needed.

Applications
Suggestions for Power and Ground
Pin Connections
The Am29027 operates in an environment of fast signal
rise times and substantial switching currents. Therefore,
care must be exercised during circuit board design and
layout, as with any high-performance component. The
following is a suggested layout, but since systems vary
widely in electrical configuration, an empirical evaluation of the intended layout is recommended.
The VCCO and GNDO pins carry output driver switching
currents and can be electrically noisy. The VCC and GND
pins, which supply the logic core of the device, tend to
produce less noise and the circuits they supply may be
adversely affected by noise spikes on the VCC plane. For
this reason, it is best to provide isolation between the
VCC and VCCO pins as well as independent decoupling for
each. Isolating the GND and GNDO pins is not required.

Printed Circuit-Board Layout Suggestions
1. Use of a multilayer PC board with separate power, ground, and signal planes is highly recommended.

2. All VCC and VCCO pins should be connected to the VCC plane. VCCO pins should be isolated from VCC pins by means of an isolation slot which is cut in the VCC plane (see Figure 14). By physically separating the VCC and VCCO pins, coupled noise will be reduced.

3. All GND and GNDO pins should be connected directly to the ground plane.

4. The VCCO pins should be decoupled to ground with a 0.1-µF ceramic capacitor and a 10-µF electrolytic capacitor, placed as closely to the Am29027 as is practical. VCC pins should be decoupled to ground in a similar manner.

A suggested layout is shown in Figure 14.

Calculate matrix product C = A × B, where A is the 4 × 4 matrix with elements a11 through a44, B is the column vector (b1, b2, b3, b4), and C is the column vector (c1, c2, c3, c4):

c1 = a11 × b1 + a12 × b2 + a13 × b3 + a14 × b4
c2 = a21 × b1 + a22 × b2 + a23 × b3 + a24 × b4
c3 = a31 × b1 + a32 × b2 + a33 × b3 + a34 × b4
c4 = a41 × b1 + a42 × b2 + a43 × b3 + a44 × b4

Register R operand sequence (one MAC operation each): a11, a21, a12, a22, a13, a23, a14, a24, a31, a41, a32, a42, a33, a43, a34, a44
Register S operand sequence: b1, b1, b2, b2, b3, b3, b4, b4, repeated for the second pair of rows

Notes:
1. Register file location RF0 is used as the accumulator.
2. Parentheses are used to indicate partial sums of products.
* Additional MAC operation needed to terminate sequence.

Figure 13. Canonically Efficient Sum-of-Products Evaluation in Pipeline Mode

Legend: C1 = C3 = C5 = C7 = 0.1 µF (ceramic or monolithic capacitor); C2 = C4 = C6 = C8 = 10 µF (electrolytic or tantalum capacitor). The drawing also marks through holes, VCC plane connections, and the VCC isolation cut.

Figure 20. Suggested Printed Circuit-Board Layout
(power and ground connections)

ABSOLUTE MAXIMUM RATINGS

Storage Temperature ............ -65 to +150°C
(Ambient) Temperature Under Bias .. -55 to +125°C
Supply Voltage to
Ground Potential Continuous ... -0.3 V to +7.0 V
DC Voltage Applied to Outputs for
High Output State ......... -0.3 V to +VCC +0.3 V
DC Input Voltage ........... -0.3 V to +VCC +0.3 V
DC Output Current, Into Low Outputs ....... 30 mA
DC Input Current ............. -10 mA to +10 mA

Stresses above those listed under ABSOLUTE MAXIMUM RATINGS may cause permanent device failure.
Functionality at or above these limits is not implied. Exposure to absolute maximum ratings for extended periods may affect device reliability.

OPERATING RANGES

Commercial (C) Devices
Case Temperature (TC) ........... 0 to +85°C
Supply Voltage (VCC) ....... +4.75 V to +5.25 V

Military* (M) Devices
Case Temperature (TC) ........ -55 to +125°C
Supply Voltage (VCC) ......... +4.5 V to +5.5 V

Operating ranges define those limits between which the
functionality of the device is guaranteed.
*Military product 100% tested at TC = +25°C, +125°C, and -55°C.


DC CHARACTERISTICS over COMMERCIAL operating range unless otherwise specified
(for APL Products, Group A, Subgroups 1, 2, and 3 are tested unless otherwise noted)
Parameter
Symbol
VOH

Parameter
Description
Output High Voltage

VOL

Output Low Voltage

VIH

Guaranteed Input Logical
High Voltage (Note 2)

V IL

Guaranteed Input Logical
Low Voltage (Note 2)

VIH(F)

Guaranteed Input Logical
High Voltage (Notes 2, 6)

F Bus, Slave Operation Only

VIL(F)

Guaranteed Input Logical
Low Voltage (Notes 2, 6)

F Bus, Slave Operation Only

IlL

Input Leakage Current

to

Output Leakage Current

Test Conditions (Note 1)

Min.

Max.

Unit

Vee =Min.

V

2.4

IOH = -4.0 mA
Vee =Min.
VIN = VIH or VIL

0.45
2.0

V
V

0.8
Vee -0.5

V
V

0.5

V


Icc Static

Static Power Supply Current

(Note 3)
CMOS VIN = VCC or
GND

240

(Note 3)
TTL VIN =0.5 V or
2.4 V

275

(Note 3)

mA

CMOS VIN =Vee or
GND
Te =-55 to
(Note 3)
+125°C

Iccop

Operating Power Supply
Current

TTL VIN =0.5 V or
2.4 V

Vee =Max.
Outputs floating

9.0

mA/MHz

Notes: 1. VCC conditions shown as Min. or Max. refer to ±5% VCC (commercial) and ±10% VCC (military).
2. These input levels provide zero noise immunity and should only be statically tested in a noise-free environment
(not functionally tested).
3. Use CMOS ICC when the device is driven by CMOS circuits and TTL ICC when the device is driven by TTL circuits.
4. ICC (Total) = ICC (Static) + ICCOP × f, where f is in MHz. This is tested on a sample basis only.
5. Tested on a sample basis only.
6. These levels guaranteed compatible with F bus output levels.
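Note 4 can be applied directly when estimating supply current at a given clock rate. The sketch below only restates that formula; the function name is illustrative, and the static current used in the usage remark is a placeholder rather than a value taken from the table.

```python
def total_supply_current_ma(icc_static_ma, iccop_ma_per_mhz, clock_mhz):
    """ICC(Total) = ICC(Static) + ICCOP * f, with f in MHz and currents in mA."""
    return icc_static_ma + iccop_ma_per_mhz * clock_mhz
```

With a placeholder static current of 100 mA, an operating coefficient of 9.0 mA/MHz and a 25-MHz clock, the function returns 325.0 mA; the actual static figure should be taken from the table for the drive type in use.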

CAPACITANCE
Parameter  Parameter
Symbol     Description           Test Conditions       Min.  Max.  Unit
CIN        Input Capacitance     fC = 1 MHz (Note 5)         12    pF
COUT       Output Capacitance                                20    pF
CI/O       I/O Pin Capacitance                               20    pF

Am29027

SWITCHING CHARACTERISTICS over COMMERCIAL operating range
25 MHz

No.

Parameter Description

1
2
3
4
5

ClK Period

6
7
8
9
10
11
12

Test Conditions

Min.

(Note 1)

40
18
18

ClK Low Time
ClK High Time
ClK Rise Time

(Note 2)

ClK Fall Time

(Note 2)

"",

280~~':::

""290>';

;~;~

.<,:':,:'< l'i~:;~1'5~>::::

Operation Time, Pipeline Mode
All Operations

,; ';;:~:'~120' '<;;.~
(Note 3) i"<,:>\~ ",'(;,:>it!·>'
i(::,>",';,i:'~:"':"i:,: :"<");.1"1
(Note 3)

Transaction Request Setup Time
Transaction Request Hold Time
BIN V Setup Time

<~,;!,,::,\"~ii:::::'";':::;,:~,

13

BIN V Hold Time

14
15
16

Data Setup Time

17

Instruction Hold Time

18

CDA ClK-to-Output-V~li~"Q~lay. ;::

19
20

F31-Fo ClK-to-Output-Val14:'qelay

Instruction Setup TIme

50
20
20

16 MHz

Max.

Min.

DC

60
22
22

5

,c:t26~;)

.,.,

Min.

DC

5
5

Operation Time, low-latency
Mode, F' = (P' x a') + T'
MOVEP
(All Other Base Operation Codes)

Max.
DC

Unit
ns
ns
ns

5
5

ns

5

300
150
250

360
180
300

ns
ns
ns

180

ns

150

ns

24
0
13

26
0
15

ns

2

2

2

ns

18
2
18
2

22
2
22
2

24
2
24
2

ns

ns
ns

.;,,,.

~ ~;::"::t:(~ot;~~1'»';f'
.,(;:;,~;,/",.

Data Hold Time

20 MHz

Max.

"d:~~', ';';:':,"';"~:~,i; ,,""
i:"~~ote 5)
"" . :t'I'""/:~·
"",'

ns
ns
ns

20

24

26

ns

30

35

37

ns

22

25

27

ns

285
135
235
21

340
160
280
23

ns
ns
ns

DRDY ClK-to-Output-Valid Delay

270
110
190
18

25

DERR ClK-to-Output-Valid Delay

18

21

23

ns

26

EXCP ClK-to-Output-Valid Delay

21

MSERR ClK-to-Output-Valid
Delay

23
30

ns

27

18
20

21
22
23
24

""1""

F31-Fo Three-State
ClK-to-Output-lnactive Delay

(Note 6)

Data Operation-Start-to-OutputValid Delay
F'=(p'xa')+ T'
MOVEP
(All Other Base Operation Codes)

25

ns

ns

Notes: 1. CLK switching characteristics are made relative to 1.5 V.
2. CLK rise time/fall time measured between 0.8 V and (VCC - 1.0 V). Tested on a sample basis only.
3. Transaction request signals include R/W, DREQ, DREQT1-DREQT0, and OPT2-OPT0.
4. Data signals include R31-R0 and S31-S0.

5. Instruction signals include b,-Io.
6. Three-State Output Inactive Test Load. Three-State CLK-to-Output-Inactive Delay is measured as the time to a
±500 mV change from prior output level.
Conditions: A. All inputs/outputs are TTL-compatible for VIH, VIL, and VOL unless otherwise noted.
B. All outputs are driving 80 pF unless otherwise noted.
C. All setup, hold, and delay times are measured relative to CLK at 1.5 V unless otherwise noted.


SWITCHING CHARACTERISTICS over MILITARY operating range
20 MHz
No.

Parameter Description

1
2
3
4

ClK Period

ClK Rise Time

(Note 2)

5

CLK Fall Time

(Note 2)

6
8

Operation Time, low-latency
Mode, F'=(P'xQ')+ T'
MOVEP
(All Other Base Operation Codes)

D

Operation Time, Pipeline Mode
All Operations

7

Test Conditions
(Note 1)

ClK low Time
ClK High Time

Max.

Min.

Max.

Unit

50
20
20

DC

60
22
22

DC

ns

'i('\

i!·•··':"':·:,:.

. ,.'i,.

Ii,::';;

"

..

10
11
12

Transaction Request Setup Time

(Note 3)/",,"\ 1(:.~::'.\24i·

Transaction Request Hold Time

(Note 3)':',"<: I.'·,", 0

13

BINV Hold Time

14
15
16
17

Data Setup Time

/':' ....

BINV Setup Time

'\

Instruction Setup Time
Instruction Hold Time

::,\:.:

I""')'

"""'\".'';:,

Data Hold Time
.... ,

ns
ns

5

5

ns

5

5

ns

360
180
300

ns
ns
ns

180

ns

./ ··'·~:'1..:,~:.,\, '.,

3Q01
"'\.150
" .:250

1<> '-',,:,:"'\.,,1.·\

,<

16 MHz

Min.

~,. '":':':"

".":,.,

,':'ii(Note"a)

. ii'' .··.·.';..
.. : ........ I:, . ··'··· (Note 5)

'"

150

ns

2

26
0
16
2

22
2
22
2

24
2
24
2

ns

14

ns
ns
ns

ns
ns
ns

18

COA CLK-to-Output-Valid Delay';\,

24

26

ns

19
20

F31-FO CLK-to-Output-Valid'pelay

35

40

ns

26

30

ns

340
160
280
23

ns
ns
ns

F3,-Fo Three-State CLK-toOutput-Inactive Delay

(Note 6)

Data Operation-Start-to·OutputValid Delay

21
22
23
24

F' = (P'xQ/) + T'
MOVEP
(All Other Base Operation Codes)
DRDY ClK-to-Output-Valid Delay

285
135
235
21

25

DERR CLK-to-Output-Valid Delay

21

23

ns

26

EXCP ClK-to-Output-Valid Delay

21

ns

27

MSERR ClK-to-Output-Valid Delay

25

23
30

ns

ns

Notes: 1. CLK switching characteristics are made relative to 1.5 V.
2. CLK rise time/fall time measured between 0.8 V and (VCC - 1.0 V). Tested on a sample basis only.
3. Transaction request signals include R/W, DREQ, DREQT1-DREQT0, and OPT2-OPT0.
4. Data signals include R31-R0 and S31-S0.
5. Instruction signals include b,-Io.
6. Three-State Output Inactive Test Load. Three-State CLK-to-Output-Inactive Delay is measured as the time to a
±500 mV change from prior output level.
Conditions: A. All inputs/outputs are TTL-compatible for VIH, VIL, and VOL unless otherwise noted.
B. All outputs are driving 80 pF unless otherwise noted.
C. All setup, hold, and delay times are measured relative to CLK at 1.5 V unless otherwise noted.


SWITCHING WAVEFORMS

[Figure: Input Signal Timing; CDA, EXCP Timing. Waveforms shown: CLK, Transaction Request, Data, Instruction, and EXCP.]

SWITCHING WAVEFORMS (continued)
[Figure: Operation Timing for Flow-Through Mode, DRDY, DERR Not Advanced (Mode Register Bit AD = 0). Waveforms shown: CLK and Transaction Request, with the start of operation and timing parameters 6, 7, and 8 marked.]

Notes: 1. Transaction request Write Operand R; Write Operand S; Write Operands R, S; or Write Instruction with signal
DREQT0 asserted.
2. Transaction request Read Result MSBs, Read Result LSBs, Read Flags, Read Status, or Save State. If request Read Result LSBs is issued, the Am29027 produces two data outputs in two consecutive cycles, with
DRDY or DERR active for both cycles.
3. Signal EXCP is asserted in the presence of an unmasked exception.

SWITCHING WAVEFORMS (continued)

[Figure: Operation Timing for Flow-Through Mode, DRDY, DERR Advanced (Mode Register Bit AD = 1). Waveforms shown: CLK and Transaction Request, with the start of operation and timing parameter 26 marked.]

Notes: 1. Transaction request Write Operand R; Write Operand S; Write Operands R, S; or Write Instruction with signal
DREQT0 asserted.
2. Transaction request Read Result MSBs, Read Result LSBs, Read Flags, Read Status, or Save State. If request Read Result LSBs is issued, the Am29027 produces two data outputs in consecutive cycles, with DRDY
or DERR active for both cycles.
3. Signal EXCP is asserted in the presence of an unmasked exception.

SWITCHING WAVEFORMS (continued)

[Figure: Operation Timing for Pipeline Mode. Waveforms shown: Transaction Request, with timing parameters 24, 25, and 26 measured at 1.5 V.]

Notes: 1. Transaction request Write Operand R; Write Operand S; Write Operands R, S; or Write Instruction with signal
DREQT0 asserted.
2. Transaction request Read Result MSBs, Read Result LSBs, Read Flags, Read Status, or Save State. If request Read Result LSBs is issued, the Am29027 produces two data outputs in consecutive cycles, with DRDY
or DERR active for both cycles.
3. Signal EXCP is asserted in the presence of an unmasked exception.

[Figure: Master/Slave Timing. Waveforms shown: CLK and MSERR, with timing parameter 27 measured at 1.5 V relative to the cycle in which a master/slave discrepancy occurs.]

SWITCHING TEST CIRCUIT
[Figure: switching test circuit for the Am29027 pin under test (VOUT). Labeled values: R1 = 300 ohms to VCC (Three-State Output Inactive Test), IOL = 4.0 mA, IOH = 4.0 mA.]

CL is guaranteed to 80 pF.

TEST PHILOSOPHY AND METHODS
The following nine points describe AMD's philosophy for
high-volume, high-speed automatic testing.
1. Ensure that the part is adequately decoupled at the
test head. Large changes in VCC current as the device switches may cause erroneous function failures due to VCC changes.

2. Do not leave inputs floating during any tests, as they
may start to oscillate at high frequency.

3. Do not attempt to perform threshold tests at high
speed. Following an output transition, ground current may change by as much as 400 mA in 5-8 ns.
Inductance in the ground cable may allow the
ground pin at the device to rise by hundreds of millivolts momentarily.

4. Use extreme care in defining point input levels for
AC tests. Many inputs may be changed at once, so
there will be significant noise at the device pins and
they may not actually reach VIL or VIH until the noise
has settled. AMD recommends using VIL ≤ 0 V and
VIH ≥ 3.0 V for AC tests.

5. To simplify failure analysis, programs should be designed to perform DC, Function, and AC tests as
three distinct groups of tests.

6. Capacitive Loading for AC Testing
Automatic testers and their associated hardware
have stray capacitance that varies from one type of
tester to another, but is generally around 50 pF.
This, of course, makes it impossible to make direct
measurements of parameters that call for smaller
capacitive load than the associated stray capacitance. Typical examples of this are the so-called
float delays, which measure the propagation delays
into the high-impedance state and are usually
specified at a load capacitance of 5.0 pF. In these
cases, the test is performed at the higher load capacitance (typically 50 pF), and engineering correlations based on data taken with a bench setup are
used to predict the result at the lower capacitance.
Similarly, a product may be specified at more than
one capacitive load. Since the typical automatic
tester is not capable of switching loads in mid-test, it
is impossible to make measurements at both capacitances even though they may both be greater
than the stray capacitance. In these cases, a measurement is made at one of the two capacitances.
The result at the other capacitance is predicted from
engineering correlations based on data taken with a
bench setup and the knowledge that certain DC
measurements (IOH, IOL, for example) have already
been taken and are within spec. In some cases,
special DC tests are performed in order to facilitate
this correlation.

7. Threshold Testing
The noise associated with automatic testing (due to
the long, inductive cables) and the high gain of the
tested device when in the vicinity of the actual device threshold frequently give rise to oscillations
when testing high-speed circuits. These oscillations
are not indicative of a reject device, but instead of an
overtaxed test system. To minimize this problem,
thresholds are tested at least once for each input
pin. Thereafter, hard high and low levels are used
for other tests. Generally this means that function
and AC testing are performed at hard input levels
rather than at VIL Max. and VIH Min.

8. AC Testing
Occasionally, parameters are specified that cannot
be measured directly on automatic testers because
of tester limitations. Data input hold times often fall
into this category. In these cases, the parameter
in question is guaranteed by correlating these tests
with other AC tests that have been performed.
These correlations are arrived at by the cognizant
engineer by using precise bench measurements in
conjunction with the knowledge that certain DC
parameters have already been measured and are
within spec.
In some cases, certain AC tests are redundant,
since they can be shown to be predicted by some
other tests that have already been performed. In
these cases, the redundant tests are not performed.


Am29027 Thermal Characteristics
Pin-Grid-Array Package

θJA = θJC + θCA

Thermal Resistance - °C/Watt
Airflow - ft./min. (m/sec)
'

9CA

Case-to-Ambient (wiflt'oqtp
\:>~ ,../I-:/f;
Heatsink, ThermalloY,J~4t7 ,

e

Case-to-Ambient (wit~~~nidir~ctional Pin Fin

CA

700
(3.58)

900
(4.61)

4

4

11

9

8

(2,;~5)

Parameter

Heatsink, Wakefield 840-20

10

6

3

2

2

2

6

3

2

2

2

700
(3.58)

900
(4.61)

Am29027 Thermal Characteristics
Ceramic Quad-Flat-Pack Package

Thermal Resistance - °C/Watt
Airflow - ft./min. (m/sec)
Parameter

θJC
θCA

0
(0)

150
(0.76)

300
(1.53)

480
(2.45)

Junction-to-Case
Case-to-Ambient (no Heatsink)

Note: This is for reference only.
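
The relation θJA = θJC + θCA can be combined with the commonly used estimate TJ = TA + P x θJA to approximate junction temperature. The sketch below assumes that estimate, which is not stated in this section, and uses placeholder numbers; the function names are hypothetical.

```python
def theta_ja(theta_jc: float, theta_ca: float) -> float:
    """Junction-to-ambient resistance as the sum of the two tabulated terms (°C/W)."""
    return theta_jc + theta_ca


def junction_temp_c(ambient_c: float, power_w: float,
                    theta_jc: float, theta_ca: float) -> float:
    """Assumed estimate: TJ = TA + P * θJA (not taken from the data sheet)."""
    return ambient_c + power_w * theta_ja(theta_jc, theta_ca)


# Placeholder inputs for illustration only.
print(junction_temp_c(ambient_c=50.0, power_w=2.0, theta_jc=4.0, theta_ca=11.0))
```

Since θCA depends on airflow and heatsink choice, the same power dissipation yields a different junction temperature for each column of the tables.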


APPENDIX A-DATA FORMATS
The following data formats are supported: 32-bit integer, 64-bit integer, IEEE single-precision, IEEE double-precision,
DEC F, DEC D, DEC G, IBM single-precision, and IBM double-precision.
The primary and alternate floating-point formats are selected by mode register fields PFF and AFF. The user may
select between floating-point operations and integer operations by means of instruction bit INs.
The nine supported formats are described below:

Integer Formats
32-Bit Integer
The 32-bit integer word is arranged as follows:

Bit       31      30     29     28     27     26     25    ...   2      1      0
Weight   -2^31    2^30   2^29   2^28   2^27   2^26   2^25  ...   2^2    2^1    2^0

The 32-bit word is interpreted as a two's-complement integer. For integer multiplications, the user has the option of
interpreting integers as unsigned. An unsigned single-precision integer has a format similar to that of the two's-complement integer, but with an MSB weight of 2^31.
64-Bit Integer
The 64-bit integer word is arranged as follows:
Bit       63      62     61     60     59     58     57    ...   2      1      0
Weight   -2^63    2^62   2^61   2^60   2^59   2^58   2^57  ...   2^2    2^1    2^0

The 64-bit word is interpreted as a two's-complement integer. For integer multiplications, the user has the option of
interpreting integers as unsigned. An unsigned double-precision integer has a format similar to that of the two's-complement integer, but with an MSB weight of 2^63.
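
The two interpretations of an integer word differ only in the weight given to the MSB. A minimal sketch, with hypothetical helper names, that applies the bit weights described above to either word width:

```python
def as_signed(word: int, bits: int) -> int:
    """Two's-complement value: the MSB carries weight -2^(bits-1)."""
    word &= (1 << bits) - 1
    msb_weight = 1 << (bits - 1)
    return (word & (msb_weight - 1)) - (word & msb_weight)


def as_unsigned(word: int, bits: int) -> int:
    """Unsigned value: the MSB carries weight +2^(bits-1)."""
    return word & ((1 << bits) - 1)


print(as_signed(0xFFFFFFFF, 32), as_unsigned(0xFFFFFFFF, 32))
```

The same bit pattern therefore yields different values depending on whether the multiplication is requested as signed or unsigned.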

IEEE Formats
IEEE Single Precision
The IEEE single-precision word is 32 bits wide and is arranged in the format shown below:

Bit(s)    Field                  Bit weights
31        sign
30-23     biased exponent (e)    2^7 ... 2^0
22-0      fraction (f)           2^-1 ... 2^-23

The floating-point word is divided into three fields: a single-bit sign, an 8-bit biased exponent, and a 23-bit fraction.
The sign bit is 0 for positive numbers and 1 for negative numbers. 0 may have either sign.
The biased exponent is an 8-bit unsigned integer representing a multiplicative factor of some power of 2. The bias
value is 127. If, for example, the multiplicative value for a floating-point number is to be 2^a, the value of the biased
exponent is a + 127, where "a" is the true exponent.

The fraction is a 23-bit unsigned fractional field containing the 23 least significant bits of the floating-point number's
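24-bit mantissa. The weight of the fraction's most significant bit is 2^-1. The weight of the least significant bit is 2^-23.
An IEEE single-precision floating-point number is evaluated or interpreted as follows:
If e = 255 and f ≠ 0 ....... value = NaN                         Not a Number
If e = 255 and f = 0 ....... value = (-1)^s ∞                    Infinity
If 0 < e < 255 ............. value = (-1)^s 2^(e-127) (1.f)      Normalized number
If e = 0 and f ≠ 0 ......... value = (-1)^s 2^-126 (0.f)         Denormalized number
If e = 0 and f = 0 ......... value = (-1)^s 0                    Zero

IEEE Double Precision
The IEEE double-precision word is 64 bits wide and is divided into three fields: a single-bit sign, an 11-bit biased exponent with a bias value of 1023, and a 52-bit fraction.
An IEEE double-precision floating-point number is evaluated or interpreted as follows:
If e = 2047 and f ≠ 0 ...... value = NaN                         Not a Number
If e = 2047 and f = 0 ...... value = (-1)^s ∞                    Infinity
If 0 < e < 2047 ............ value = (-1)^s 2^(e-1023) (1.f)     Normalized number
If e = 0 and f ≠ 0 ......... value = (-1)^s 2^-1022 (0.f)        Denormalized number
If e = 0 and f = 0 ......... value = (-1)^s 0                    Zero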

Infinity: Infinity can have either a positive or negative sign. The interpretation of infinities is determined by mode register bit AP.
NaN: A NaN is interpreted as a signal or symbol. NaNs are used to indicate invalid operations and as a means of
passing process status through a series of calculations. They arise in two ways: either generated by the Am29027 to
indicate an invalid operation, or provided by the user as an input. A signaling NaN has the MSB of its fraction set to 0
and at least one of the remaining fraction bits set to 1. A quiet NaN has the MSB of its fraction set to 1.
The IEEE format is fully described in ANSI/IEEE Standard 754-1985.
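
The single-precision interpretation rules (NaN, infinity, normalized, denormalized, zero) can be sketched as a decoder for a 32-bit word. The function name is hypothetical, and the standard library `struct` module is used only as an independent cross-check.

```python
import math
import struct


def decode_ieee_single(word: int) -> float:
    """Evaluate a 32-bit IEEE single-precision word from its s, e, f fields."""
    s = (word >> 31) & 1
    e = (word >> 23) & 0xFF
    f = word & 0x7FFFFF
    sign = -1.0 if s else 1.0
    if e == 255:
        return math.nan if f else sign * math.inf
    if e == 0:
        # Denormalized number (or zero when f == 0): (-1)^s 2^-126 (0.f)
        return sign * (f / 2**23) * 2.0**-126
    # Normalized number: (-1)^s 2^(e-127) (1.f)
    return sign * (1 + f / 2**23) * 2.0**(e - 127)


word = 0xC0490FDB
reference = struct.unpack(">f", word.to_bytes(4, "big"))[0]
print(decode_ieee_single(word) == reference)
```

The decoder does not distinguish signaling from quiet NaNs; that distinction would need an extra test on the MSB of the fraction.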


DEC Formats
DEC F
The DEC F word is 32 bits wide and is arranged in the format shown below:

Bit(s)    Field                  Bit weights
31        sign
30-23     biased exponent (e)    2^7 ... 2^0
22-0      fraction (f)           2^-2 ... 2^-24

The floating-point word is divided into three fields: a single-bit sign, an 8-bit biased exponent, and a 23-bit fraction.
The sign bit is 0 for positive numbers and 1 for negative numbers; 0 has a positive sign.
The biased exponent is an 8-bit unsigned integer representing a multiplicative factor of some power of 2. The bias
value is 128. If, for example, the multiplicative value for a floating-point number is to be 2^a, the value of the biased
exponent is a + 128, where "a" is the true exponent.
The fraction is a 23-bit unsigned fractional field containing the 23 least significant bits of the floating-point number's
24-bit mantissa. The weight of the fraction's most significant bit is 2^-2. The weight of the least significant bit is 2^-24.
A DEC F floating-point number is evaluated or interpreted as follows:
If e ≠ 0 ............... value = (-1)^s 2^(e-128) (0.1f)
If s = 0 and e = 0 ..... value = 0
If s = 1 and e = 0 ..... value = DEC-Reserved Operand

DEC-Reserved Operand: A DEC-Reserved Operand is interpreted as a signal or symbol. DEC-Reserved Operands
are used to indicate invalid operations and operations whose results have overflowed the destination format. They
may also be used to pass symbolic information from one calculation to another.
The DEC formats are fully described in the VAX Architecture Manual.
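
The DEC F evaluation rules can be sketched as follows for a word in the Am29027 field arrangement (sign in bit 31, exponent in bits 30-23, fraction in bits 22-0). The function name is hypothetical, and returning `None` for the DEC-Reserved Operand is a choice made for this illustration only.

```python
from typing import Optional


def decode_dec_f(word: int) -> Optional[float]:
    """Evaluate a DEC F word; None stands in for the DEC-Reserved Operand."""
    s = (word >> 31) & 1
    e = (word >> 23) & 0xFF
    f = word & 0x7FFFFF
    if e == 0:
        return 0.0 if s == 0 else None
    # Mantissa 0.1f in binary: hidden bit of weight 2^-1, fraction from 2^-2 to 2^-24.
    mantissa = 0.5 + f / 2**24
    return (-1.0) ** s * mantissa * 2.0 ** (e - 128)


print(decode_dec_f(0x40800000))
```

Note that, unlike the IEEE formats, there are no denormalized numbers, infinities, or NaNs to handle.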
DEC D
The DEC D word is 64 bits wide and is arranged in the format shown below:

Bit(s)    Field                  Bit weights
63        sign
62-55     biased exponent (e)    2^7 ... 2^0
54-0      fraction (f)           2^-2 ... 2^-56

The floating-point word is divided into three fields: a single-bit sign, an 8-bit biased exponent, and a 55-bit fraction.
The sign bit is 0 for positive numbers and 1 for negative numbers; 0 has a positive sign.
The biased exponent is an 8-bit unsigned integer representing a multiplicative factor of some power of 2. The bias
value is 128. If, for example, the multiplicative value for a floating-point number is to be 2^a, the value of the biased
exponent is a + 128, where "a" is the true exponent.
The fraction is a 55-bit unsigned fractional field containing the 55 least significant bits of the floating-point number's
56-bit mantissa. The weight of the fraction's most significant bit is 2^-2. The weight of the least significant bit is 2^-56.
A DEC D floating-point number is evaluated or interpreted as follows:
If e ≠ 0 ............... value = (-1)^s 2^(e-128) (0.1f)
If s = 0 and e = 0 ..... value = 0
If s = 1 and e = 0 ..... value = DEC-Reserved Operand

DEC-Reserved Operand: A DEC-Reserved Operand is interpreted as a signal or symbol. DEC-Reserved Operands
are used to indicate invalid operations and operations whose results have overflowed the destination format. They
may also be used to pass symbolic information from one calculation to another.
The DEC formats are fully described in the VAX Architecture Manual.

DEC G
The DEC G word is 64 bits wide and is arranged in the format shown below:

Bit(s)    Field
63        sign
62-52     biased exponent (e)
51-0      fraction (f)

The floating-point word is divided into three fields: a single-bit sign, an 11-bit biased exponent, and a 52-bit fraction.
The sign bit is 0 for positive numbers and 1 for negative numbers; 0 has a positive sign.
The biased exponent is an 11-bit unsigned integer representing a multiplicative factor of some power of 2. The bias
value is 1024. If, for example, the multiplicative value for a floating-point number is to be 2^a, the value of the biased
exponent is a + 1024, where "a" is the true exponent.
The fraction is a 52-bit unsigned fractional field containing the 52 least significant bits of the floating-point number's
53-bit mantissa. The weight of the fraction's most significant bit is 2^-2. The weight of the least significant bit is 2^-53.
A DEC G floating-point number is evaluated or interpreted as follows:
If e ≠ 0 ............... value = (-1)^s 2^(e-1024) (0.1f)
If s = 0 and e = 0 ..... value = 0
If s = 1 and e = 0 ..... value = DEC-Reserved Operand

DEC-Reserved Operand: A DEC-Reserved Operand is interpreted as a signal or symbol. DEC-Reserved Operands
are used to indicate invalid operations and operations whose results have overflowed the destination format. They
may also be used to pass symbolic information from one calculation to another.
The DEC formats are fully described in the VAX Architecture Manual.

IBM Formats
IBM Single Precision
The IBM single-precision word is 32 bits wide and is arranged in the format shown below:

Bit(s)    Field
31        sign
30-24     biased exponent (e)
23-0      fraction (f)

The floating-point word is divided into three fields: a single-bit sign, a 7-bit biased exponent, and a 24-bit fraction.
The sign bit is 0 for positive numbers and 1 for negative numbers; a true 0 has a positive sign.
The biased exponent is a 7-bit unsigned integer representing a multiplicative factor of some power of 16. The bias
value is 64. If, for example, the multiplicative value for a floating-point number is to be 16^a, the value of the biased
exponent is a + 64, where "a" is the true exponent.
The fraction is a 24-bit unsigned fractional field containing the 24 least significant bits of the floating-point number's
25-bit mantissa. The weight of the fraction's most significant bit is 2^-1. The weight of the least significant bit is 2^-24.
An IBM floating-point number is evaluated or interpreted as follows:
Value = (-1)^s 16^(e-64) (0.f)

Zero: There are two classes of zero. If the sign, biased exponent, and fraction are all zero, the operand is known as a
"True Zero." If the fraction is zero, but the sign and biased exponent are not both zero, the operand is known as a
"Floating-point Zero."
The IBM format is fully described in the IBM System/370 Principles of Operation Manual.
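
The IBM single-precision evaluation, Value = (-1)^s 16^(e-64) (0.f), can be sketched as a short decoder. The function name is hypothetical; the word layout follows the field description above (sign in bit 31, exponent in bits 30-24, fraction in bits 23-0).

```python
def decode_ibm_single(word: int) -> float:
    """Evaluate an IBM single-precision word: (-1)^s * 16^(e-64) * 0.f."""
    s = (word >> 31) & 1
    e = (word >> 24) & 0x7F
    f = word & 0xFFFFFF
    # 0.f: the 24-bit fraction has an MSB weight of 2^-1 and no hidden bit.
    return (-1.0) ** s * (f / 2**24) * 16.0 ** (e - 64)


print(decode_ibm_single(0x41100000))
```

Because the exponent is a power of 16, a normalized fraction may have up to three leading zero bits, which is why the format carries no hidden bit.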
IBM Double Precision
The IBM double-precision word is 64 bits wide and is arranged in the format shown below:

Bit(s)    Field                  Bit weights
63        sign
62-56     biased exponent (e)    2^6 ... 2^0
55-0      fraction (f)           2^-1 ... 2^-56

The floating-point word is divided into three fields: a single-bit sign, a 7-bit biased exponent, and a 56-bit fraction.
The sign bit is 0 for positive numbers and 1 for negative numbers; a true 0 has a positive sign.
The biased exponent is a 7-bit unsigned integer representing a multiplicative factor of some power of 16. The bias
value is 64. If, for example, the multiplicative value for a floating-point number is to be 16^a, the value of the biased
exponent is a + 64, where "a" is the true exponent.
The fraction is a 56-bit unsigned fractional field containing the 56 least significant bits of the floating-point number's
57-bit mantissa. The weight of the fraction's most significant bit is 2^-1. The weight of the least significant bit is 2^-56.
An IBM floating-point number is evaluated or interpreted as follows:
Value = (-1)^s 16^(e-64) (0.f)

Zero: There are two classes of zero. If the sign, biased exponent, and fraction are all zero, the operand is known as a
"True Zero." If the fraction is zero, but the sign and biased exponent are not both zero, the operand is known as a
"Floating-point Zero."
The IBM format is fully described in the IBM System/370 Principles of Operation Manual.


APPENDIX B-ROUNDING MODES
The round mode is selected by mode register field RMS as follows:
RMS    Round Mode
000    Round to Nearest (IEEE)
001    Round to Minus Infinity (IEEE)
010    Round to Plus Infinity (IEEE)
011    Round to Zero (IEEE)
100    Round to Nearest (DEC)
101    Round Away from Zero
11X    Illegal Value

Round to Nearest (IEEE)
The infinitely precise result of an operation is rounded to the closest representable value in the destination format. If
the infinitely precise result is exactly halfway between two representations, it is rounded to the representation having
a least significant bit of 0.

Round to Minus Infinity (IEEE)
The infinitely precise result of an operation is rounded to the closest representable value in the destination format that
is less than or equal to the infinitely precise result.

Round to Plus Infinity (IEEE)
The infinitely precise result of an operation is rounded to the closest representable value in the destination format that
is greater than or equal to the infinitely precise result.

Round to Zero (IEEE)
The infinitely precise result of an operation is rounded to the closest representable value in the destination format
whose magnitude is less than or equal to the infinitely precise result.

Round to Nearest (DEC)
The infinitely precise result of an operation is rounded to the closest representable value in the destination format. If
the infinitely precise result is exactly halfway between two representations, it is rounded to the representation having
the greater magnitude.

Round Away from Zero
The infinitely precise result of an operation is rounded to the closest representable value in the destination format
whose magnitude is greater than or equal to the infinitely precise result.
A graphical representation of these round modes is shown in Figures B1 and B2.
The IEEE standard specifies that all four "IEEE" modes be available so that the user may select the mode most
appropriate for the algorithm being executed. The DEC standard specifies that two rounding modes be available: Round-to-Nearest (DEC) and Round-to-Zero. The IBM standard specifies that all operations be performed using the
Round-to-Zero mode.
It should be noted, however, that the Am29027 permits any of the supported rounding modes to be selected, regardless of the format of the operation. It is permissible to use one of the IEEE rounding modes with an IBM operation, or
DEC rounding with an IEEE operation, or any other possible combination. For those integer operations where rounding is performed, any rounding mode may be chosen. This flexibility allows the user to select the mode most appropriate for the arithmetic environment in which the processor is operating.
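
The six round modes can be illustrated on a simplified model in which the representable values are the integers, so that one grid step plays the role of the destination format's least significant bit. The function name and this integer-grid simplification are assumptions made for the sketch; the device itself rounds to the LSB of the selected format.

```python
import math
from fractions import Fraction


def round_to_grid(x: Fraction, rms: int) -> int:
    """Round an exact value to an integer according to mode register field RMS."""
    lo, hi = math.floor(x), math.ceil(x)
    if lo == hi:                      # already representable
        return lo
    is_tie = (x - lo) == (hi - x)
    nearest = lo if (x - lo) < (hi - x) else hi
    if rms == 0b000:                  # Round to Nearest (IEEE): ties to even LSB
        return (lo if lo % 2 == 0 else hi) if is_tie else nearest
    if rms == 0b001:                  # Round to Minus Infinity
        return lo
    if rms == 0b010:                  # Round to Plus Infinity
        return hi
    if rms == 0b011:                  # Round to Zero
        return lo if x > 0 else hi
    if rms == 0b100:                  # Round to Nearest (DEC): ties to greater magnitude
        return (hi if x > 0 else lo) if is_tie else nearest
    if rms == 0b101:                  # Round Away from Zero
        return hi if x > 0 else lo
    raise ValueError("RMS = 11X is an illegal value")


for mode in range(6):
    print(f"{mode:03b}", round_to_grid(Fraction(5, 2), mode), round_to_grid(Fraction(-5, 2), mode))
```

Using `Fraction` keeps the input exact, so the halfway case is detected without floating-point error.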

[Figure B1 shows, for each mode, the mapping from the infinitely precise result to the rounded result on an axis marked -(P+1q), -P, -(P-1q), 0, P-1q, P, P+1q. Panels: Round to Nearest (Unbiased), Round to Minus Infinity, Round to Plus Infinity.]

Figure B1. Graphical Interpretation of Round-to-Nearest (Unbiased), Round-to-Minus-Infinity,
and Round-to-Plus-Infinity Rounding Modes

[Figure B2 shows, for each mode, the mapping from the infinitely precise result to the rounded result on an axis marked -(P+1q), -P, -(P-1q), 0, P-1q, P, P+1q. Panels: Round to Zero, Round to Nearest (DEC), Round Away from Zero.]

Figure B2. Graphical Interpretation of Round-to-Zero, Round-to-Nearest (DEC),
and Round-Away-from-Zero Rounding Modes

APPENDIX C-ADDITIONAL OPERATION DETAILS
There are several cases in which the implementation of the IEEE, DEC, and IBM floating-point standards in the
Am29027 differs from the formal definitions of those standards. This appendix describes these differences.

Differences Between IEEE Floating-Point Arithmetic and Am29027 IEEE Operation
Section 7.3 of the IEEE-754 standard specifies that "Trapped overflow on conversion from a binary floating-point format shall deliver to the trap handler a result in that or a wider format, possibly with the exponent bias adjusted, but
rounded to the destination's precision."
According to the IEEE standard, then, if a double-to-single IEEE operation overflows while traps are enabled, the
result is a double-precision operand, rounded to single-precision width (23-bit fraction), together with a correctly adjusted (double-precision) exponent and the appropriate flags for a trapped overflow.
In the case of an overflow in any IEEE operation, the Am29027 returns a result in the destination format specified by
the user, rounded to that destination format.
In the case of the double-to-single overflow described above, the result from the Am29027 is a single-precision operand, together with a correctly adjusted (single-precision) exponent and the appropriate flags for a trapped overflow.
A simple example serves to illustrate the discrepancy by describing the conversion of the double-precision IEEE number 52B123456789ABCD to single-precision, with traps enabled, and the round-to-nearest rounding mode selected.
This number is too large to be represented in single-precision format.
According to the IEEE standard, the result of this operation is the double-precision number 52B1234560000000, comprising the double-precision exponent of the input and a fraction truncated to 23 bits, together with flags V and X.
When the operation is performed in the Am29027, however, using the F' = P' operation with appropriate precision
controls, the result is the single-precision number 75891A2B, comprising the single-precision (overflowed) exponent
reduced by 192 (decimal) and a single-precision fraction, together with flags V and X.
It should be noted that trapped operation is an optional part of the IEEE standard. Full adherence to the IEEE specification of trapped operation is therefore not necessary to ensure compliance with IEEE-754.
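
The worked example above can be reproduced with integer arithmetic. The sketch below is an illustration of the two behaviors as described in this appendix, not a model of the device: the function names are hypothetical, only normalized inputs are handled, and a carry out of the rounded fraction is deliberately left unmodelled.

```python
def round_fraction_to_23_bits(frac52: int):
    """Round a 52-bit fraction to 23 bits, round-to-nearest (ties to even LSB)."""
    dropped = 52 - 23
    kept = frac52 >> dropped
    rem = frac52 & ((1 << dropped) - 1)
    half = 1 << (dropped - 1)
    if rem > half or (rem == half and kept & 1):
        kept += 1
    return kept, rem != 0             # (rounded fraction, inexact indication)


def trapped_overflow_results(double_word: int):
    """Return (IEEE-standard result, Am29027 result, inexact) for a double-to-single overflow."""
    sign = double_word >> 63
    e_double = (double_word >> 52) & 0x7FF
    frac23, inexact = round_fraction_to_23_bits(double_word & ((1 << 52) - 1))
    if frac23 >> 23:
        raise NotImplementedError("fraction carry-out is not modelled in this sketch")
    # IEEE 754 Section 7.3: double-precision word, fraction rounded to 23 bits.
    ieee = (sign << 63) | (e_double << 52) | (frac23 << 29)
    # Am29027: single-precision word, overflowed exponent reduced by 192.
    e_single = (e_double - 1023) + 127 - 192
    am29027 = (sign << 31) | (e_single << 23) | frac23
    return ieee, am29027, inexact


ieee, am, inexact = trapped_overflow_results(0x52B123456789ABCD)
print(f"{ieee:016X} {am:08X} {inexact}")
```

For the input 52B123456789ABCD, the true exponent is 300, so the single-precision biased exponent 427 overflows the 8-bit field and is delivered as 427 - 192 = 235.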

Differences Between DEC Floating-Point Arithmetic and Am29027 DEC Operation
The DEC F, DEC D, and DEC G standards, as implemented in the Am29027, differ from the implementations in a VAX
only in the way in which the subfields of the floating-point word are arranged. The differences are listed in Table C1.

Table C1. Differences in Am29027 and DEC Floating-Point Formats

Format   Am29027 Arrangement        VAX Arrangement
DEC F    sign:     bit 31           sign:     bit 15
         exponent: bits 30-23       exponent: bits 14-7
         fraction: bits 22-0        fraction: bits 6-0, bits 31-16

DEC D    sign:     bit 63           sign:     bit 15
         exponent: bits 62-55       exponent: bits 14-7
         fraction: bits 54-0        fraction: bits 6-0, bits 31-16, bits 47-32, bits 63-48

DEC G    sign:     bit 63           sign:     bit 15
         exponent: bits 62-52       exponent: bits 14-4
         fraction: bits 51-0        fraction: bits 3-0, bits 31-16, bits 47-32, bits 63-48
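
Read together, the two columns of Table C1 amount to reversing the order of the 16-bit groups of the word. A minimal sketch of that rearrangement, assuming the VAX operand is held as an integer numbered the VAX way (bit 0 is the LSB); the function name is hypothetical.

```python
def vax_to_am29027(word: int, bits: int) -> int:
    """Reverse the order of the 16-bit groups of a 32-bit (F) or 64-bit (D, G) word."""
    if bits not in (32, 64):
        raise ValueError("bits must be 32 (DEC F) or 64 (DEC D, DEC G)")
    result = 0
    for shift in range(0, bits, 16):
        group = (word >> shift) & 0xFFFF
        # The lowest VAX group (sign, exponent, high fraction) ends up on top.
        result = (result << 16) | group
    return result


print(f"{vax_to_am29027(0x00004080, 32):08X}")
```

The operation is its own inverse, so the same function converts an Am29027 result back to the VAX arrangement.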


Differences Between IBM 370 Floating-Point Arithmetic and Am29027 IBM Operation
The Am29027's deviations from the IBM standard may be summarized as follows, assuming that the user has selected the round-to-nearest rounding mode:
1. The Am29027 provides more guard bits in its internal format than specified by the IBM standard. With certain
combinations of input operands, the Am29027 produces more accurate results than a standard IBM processor for
instructions based on addition operations and comparisons.
2. The discrepancies are much larger for single-precision operations than double-precision operations, because the
difference in the number of guard bits is much greater (33 more for single, one more for double).
3. There is no universal rule for determining whether a given set of input operands will result in a discrepancy.
Provided the conditions in (1) above are met, the user must examine each operation on a case-by-case basis, taking
into account the input operands and the internal formats discussed in this section.
4. The Am29027 does not produce unnormalized results from additions. The results of all addition operations are
renormalized. Am29027 internal formats are compared with IBM internal formats in Figure C1.

[Figure C1 compares the internal mantissa formats:
a. Am29027 Internal Format, IBM Single-Precision: overflow bit, 24 fraction bits, 37 guard bits, sticky bit
b. Am29027 Internal Format, IBM Double-Precision: overflow bit, 56 fraction bits, 5 guard bits, sticky bit
c. IBM Internal Format, Single-Precision: overflow bit, 24 fraction bits, 4 guard bits
d. IBM Internal Format, Double-Precision: overflow bit, 56 fraction bits, 4 guard bits]

Figure C1. Differences in Internal Mantissa Formats of an IBM CPU and the Am29027

APPENDIX D-TRANSACTION REQUEST/OPERATION TIMING

[Figure D1: timing diagram. Traces: CLK, Transaction Request. Data is accepted on the marked clock edge.
 a. Normal Operation, Data Accepted
 b. Halt On Error Mode, Unmasked Exception Present]

Note: Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D1. Timing for the Write Operand R, Write Operand S, Write Operands R, S, and Write Instruction Transaction Requests

[Figure D2: timing diagram. Traces: CLK, Transaction Request, A31-A0, D31-D0, CDA, DRDY, DERR. Data is accepted on the marked clock edge.
 a. CDA Low
 b. CDA High Initially]

Note: Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D2. Timing for the Write Mode, Write Status, and Write Register File Precisions Transaction Requests

[Figure D3: timing diagram. Traces: CLK, Transaction Request. Registers are advanced on the marked clock edge.
 a. CDA Low
 b. CDA High Initially]

Note: Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D3. Timing for the Advance Temp. Registers Transaction Request

[Figure D4: timing diagram. Traces: CLK, Transaction Request; the data bus carries the result LSBs followed by the result MSBs.
 a. Read Result MSBs Request Issued in Cycle after Read Result LSBs Request
 b. Read Result MSBs Request Issued Two or More Cycles after Read Result LSBs Request]

Figure D4. Timing for the Read Result LSBs Transaction Request, No Unmasked Exceptions

[Figure D5: timing diagram. Traces: CLK, Transaction Request.]

Figure D5. Timing for Read Result LSBs Transaction Request, Unmasked Exception Present

[Figure D6: timing diagram. Traces: CLK, Transaction Request. The request is held for 1 or more cycles.
 a. No Unmasked Exceptions Present
 b. Unmasked Exceptions Present]

Figure D6. Timing for Read Result MSBs, Read Flags, and Read Status Transaction Requests

[Figure D7: timing diagram. Traces: CLK, Transaction Request (Save State); the data bus carries the LSBs followed by the MSBs. The request is held for 1 or more cycles.
 a. Second Save State Request Issued in Cycle Following First Request
 b. Second Save State Request Issued Two or More Cycles after First Request]

Figure D7. Timing for the Save State Transaction Request, 64-Bit Resources (Registers R, R-Temp, S,
S-Temp; Register File Locations RF7-RF0; Mode Register)

[Figure D8: timing diagram. Traces: CLK, Transaction Request. The request is held for 1 or more cycles.]

Figure D8. Timing for the Save State Transaction Request, 32-Bit Resources (Instruction Register,
Register I-Temp, Status Register, Precision Register)

[Figure D9: timing diagram. Traces: CLK, Transaction Request, A31-A0/D31-D0, DREQT0, CDA, DRDY, DERR. Operation in progress: 6 cycles.]

Notes: WRS = Write Operands R, S; WI = Write Instruction; RM = Read MSBs; A, B = Operands A, B; INST = Addition Instruction; RES = Result.
Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D9. Typical Timing for Single-Precision Operation in Flow-Through Mode: Perform the Operation
A PLUS B, Read the Result; Mode Register Field PLTC = 6
[Figure D10: timing diagram. Traces: CLK, Transaction Request, DREQT0. Operation in progress: 6 cycles.]

Notes: WR = Write Operand R; WS = Write Operand S; WI = Write Instruction; RL = Read LSBs; RM = Read MSBs; A = Operand A; B = Operand B; INST = Addition Instruction; LSB = Result LSBs; MSB = Result MSBs.
Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D10. Typical Timing for Double-Precision Operation in Flow-Through Mode: Perform the
Operation A PLUS B, Read the Result; Mode Register Field PLTC = 6

[Figure D11: timing diagram. Traces: CLK, Transaction Request, A31-A0/D31-D0, DREQT0, CDA, DRDY, DERR.]

Notes: WRS = Write Operands R, S; WI = Write Instruction; RM = Read MSBs; A, B = Operands A, B; INST = Addition Instruction; RES = Result.
Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D11. Typical Timing for Single-Precision Operation in Flow-Through Mode, with Unmasked
Exception Present: Perform the Operation A PLUS B, Read the Result; Mode Register Field PLTC = 6
[Figure D12: timing diagram. Traces: CLK, Transaction Request, DREQT0, CDA, DRDY. Operation in progress: 6 cycles.]

Notes: WR = Write Operand R; WS = Write Operand S; WI = Write Instruction; RL = Read LSBs; A = Operand A; B = Operand B; INST = Addition Instruction; LSB = Result LSBs; MSB = Result MSBs.
Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D12. Typical Timing for Double-Precision Operation in Flow-Through Mode, with Unmasked
Exception Present: Perform the Operation A PLUS B, Read the Result; Mode Register Field PLTC = 6

[Figure D13: timing diagram. Traces: CLK, Transaction Request, DREQT0.]

Notes: WRS = Write Operands R, S; WI = Write Instruction; RM = Read MSBs; A, B = Operands A, B; INST = Addition Instruction; RES = Result.
Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D13. Typical Timing for Single-Precision Operation in Flow-Through Mode, with DRDY
Advanced: Perform the Operation A PLUS B, Read the Result; Mode Register Field PLTC = 6
[Figure D14: timing diagram. Traces: CLK, Transaction Request, DREQT0. Operation in progress: 6 cycles.]

Notes: WR = Write Operand R; WS = Write Operand S; WI = Write Instruction; RL = Read LSBs; RM = Read MSBs; A = Operand A; B = Operand B; INST = Addition Instruction; LSB = Result LSBs; MSB = Result MSBs.
Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D14. Typical Timing for Double-Precision Operation in Flow-Through Mode, with DRDY
Advanced: Perform the Operation A PLUS B, Read the Result; Mode Register Field PLTC = 6

[Figure D15: timing diagram. Traces: CLK, Transaction Request, DREQT0.]

Notes: WRS = Write Operands R, S; WI = Write Instruction; RM = Read MSBs; A, B = Operands A, B; INST = Addition Instruction; RES = Result.
Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D15. Typical Timing for Single-Precision Operation in Flow-Through Mode, with DRDY Advanced
and Unmasked Exception Present: Perform the Operation A PLUS B, Read the Result;
Mode Register Field PLTC = 6
[Figure D16: timing diagram. Traces: CLK, Transaction Request, DREQT0, CDA, DRDY. Operation in progress: 6 cycles.]

Notes: WR = Write Operand R; WS = Write Operand S; WI = Write Instruction; RL = Read LSBs; A = Operand A; B = Operand B; INST = Addition Instruction; LSB = Result LSBs; MSB = Result MSBs.
Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D16. Typical Timing for Double-Precision Operation in Flow-Through Mode, with DRDY Advanced
and Unmasked Exception Present: Perform the Operation A PLUS B, Read the Result;
Mode Register Field PLTC = 6

[Figure D17: timing diagram. Traces: CLK, Transaction Request, DREQT0, DERR. Operation 2: 6 cycles.]

Notes: WRS = Write Operands R, S; WR = Write Operand R; WI = Write Instruction; RM = Read MSBs; A, B = Operands A, B; C = Operand C; I1 = Addition Instruction; I2 = Multiplication Instruction; RES = Result.
Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D17. Typical Timing for Overlapped Single-Precision Operations in Flow-Through Mode; Perform
the Compound Operation (A PLUS B) x C by Performing Operations: (1) RF0 <- A PLUS B, (2) RF0 x C;
Mode Register Field PLTC = 6
[Figure D18: timing diagram. Traces: CLK, Transaction Request, DREQT0, DRDY, DERR. Operation 2: 6 cycles.]

Notes: WR = Write Operand R; WS = Write Operand S; WI = Write Instruction; RL = Read LSBs; RM = Read MSBs; A = Operand A; B = Operand B; C = Operand C; I1 = Addition Instruction; I2 = Multiplication Instruction; LSB = Result LSBs; MSB = Result MSBs.
Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D18. Typical Timing for Overlapped Double-Precision Operations in Flow-Through Mode;
Perform the Compound Operation (A PLUS B) x C by Performing Operations:
(1) RF0 <- A PLUS B, (2) RF0 x C; Mode Register Field PLTC = 6

[Figure D19: timing diagram. Traces: CLK, Transaction Request, A31-A0/D31-D0, DREQT0, CDA, DRDY, DERR, PL Stage 1, PL Stage 2.
 PL Stage 1 holds A PLUS B, C PLUS D, E PLUS F, G PLUS H, I PLUS J in turn; PL Stage 2 holds the same operations one step later.]

Notes: WRS = Write Operands R, S; WI = Write Instruction; RM = Read MSBs; A, B, ... = Operands; I = Addition Instruction; RES = Result.
Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D19. Typical Timing for Single-Precision Operations in Pipeline Mode;
Perform a Series of Addition Operations A PLUS B, C PLUS D,
E PLUS F, ...; Mode Register Field PLTC = 3

[Figure D20: timing diagram. Traces: CLK, Transaction Request, A31-A0/D31-D0, DREQT0, CDA, DRDY, DERR, PL Stage 1, PL Stage 2.
 PL Stage 1 holds A PLUS B, C PLUS D, E PLUS F, G PLUS H, I PLUS J in turn; PL Stage 2 holds the same operations one step later.]

Notes: WR = Write Operand R; WS = Write Operand S; WI = Write Instruction; RL = Read LSBs; RM = Read MSBs; A, B, ... = Operands; I = Addition Instruction; LSB = Result LSBs; MSB = Result MSBs.
Signals A31-A0 and D31-D0 are the Am29000 address and data buses, respectively.

Figure D20. Typical Timing for Double-Precision Operations in Pipeline Mode;
Perform a Series of Addition Operations A PLUS B, C PLUS D,
E PLUS F, ...; Mode Register Field PLTC = 3

Table of Contents

CHAPTER 2
29K Family Support Tools

ASM29K Data Sheet
HighC29K Data Sheet
MON29K Data Sheet
XRAY29K Data Sheet

ASM29K
Cross-Development Toolkit, Release 2

DISTINCTIVE CHARACTERISTICS
• Relocatable Macro Assembler supports complete Am29000™ microprocessor instruction set.
• Linker/Loader combines separately assembled modules by resolving external references and by searching libraries.
• Librarian provides management facility for organizing modules into logical collections of functions.
• IEEE Software Floating-Point Emulation routines.
• Available for the PC-AT™ and Sun-3™ development environments.

GENERAL DESCRIPTION
Processor performance depends on the processor's hardware and software environment. The key to
maximizing performance lies in the realization that the processor is part of a system: a collection of
components that must be integrated properly. To take advantage of the advanced RISC architecture of the
Am29000 microprocessor, equally sophisticated software tools must be available.
The ASM29K™ cross-development toolkit offers such a development environment for creating efficient and
portable Am29000 microprocessor software. The package consists of the assembler, the linker, the
floating-point emulation routines, and the object module librarian. These tools allow users to design more
efficient systems and applications than ever before.

Cross-development is the design of an application program on one computer (the host system) and the
execution of that same application program on a different computer (the target system). The operating
system on the host, such as UNIX™ or DOS, provides the tools needed to create the application program.
These tools include editors for writing the source code, compilers and assemblers for translating the
modules into executable code, and utilities for preparing the application for execution. The Am29000
microprocessor-based target computer generally does not provide the tools required to develop the
application program. Figure 1 shows the path that an application follows from development on the host
system to execution on the target system.

[Figure 1: block diagram. The host computer passes the application to the Am29000 microprocessor-based
target computer via an on-board monitor or the ADAPT29K debugger.]

Figure 1. Cross Software Development

Publication # 10292  Rev. B  Amendment /0
Issue Date: September 1989
The ASM29K cross-development toolkit transforms a PC or Sun-3 workstation host into a powerful software
development environment. ASM29K software assembles user source and produces a relocatable object
module. This module can be combined with other relocatable object modules (derived from the assembler
or high-level language cross-compilers) using the ASM29K linker. Library modules prepared by the
librarian can be linked in at this point as well. The resulting absolute object module can then be
downloaded to a target system.
AMD has established and published the Am29000 microprocessor Common Object File Format (COFF) to
which all Am29000 development tools conform. The AMD COFF format extends the already standard AT&T
COFF format to support source-level debugging and other Am29000 microprocessor-specific features.
Similarly, AMD has established a common calling convention that maximizes performance on the Am29000
microprocessor and defines another standard for software vendors. This has led to a variety of
compilers, assemblers, debuggers, and associated tools that may be mixed freely by developers of Am29000
microprocessor software.
The contents of the ASM29K cross-development toolkit include:

• ASM29K macro assembler
• ASM29K linker
• ASM29K librarian
• Hex utilities
• IEEE floating-point emulation routines
• Documentation

ORDERING INFORMATION

Licensing
The ASM29K cross-development toolkit is licensed through AMD's Standard End-User Software License
Agreement (Boxtop). This license does not require a signature; breaking the seal on the software envelope
indicates acceptance of the license terms. If changes are required to the license agreement, they can be
arranged through your AMD sales representative. Many software products require the customer to provide a
CPU ID number when ordering the product. Contact your sales representative if this information is not
available at the time of purchase. In addition, terms of the license require the customer to complete a
Software Warranty card with the serial number and site of the host computer on which the software will
reside. This card must be returned to AMD within 30 days of receipt for the warranty to be valid.

Order Numbers
The ASM29K cross-development toolkit is available for several different environments. Documentation can
be ordered separately. The order number (valid combination) is formed as a combination of:

• Product Family
• Product Category
• Product Identifier
• License Type
• Host / OS Type
• Media Type

ORDER INFORMATION (continued)

Product Family
  Am29000 Microprocessor
Product Category
  SW/ = Software Product
  DC/ = Documentation Product
  MN = Maintenance Agreement
Product Identifier
  ASM = ASM29K Cross-Development Toolkit
License Type
  B = Boxtop
  S = Signed
  -- = Not Applicable
Host / OS Type
  07 = Sun-3
  10 = PC-AT
Media Type
  08 = 0.25" Sun cartridge tape, TAR format
  14 = 3.5" DSHD floppies
  21 = 9-track, 1600 BPI mag tape, TAR format
  24 = 5.25" DSHD floppies

Valid Combinations
Valid Combinations list configurations planned to be supported in volume for this device. Consult the local AMD sales
office to confirm availability of specific valid combinations and to check on newly released combinations.

Order Number          Product                 Host     Media
AM29000SW/ASMB0708    ASM29K Toolkit          Sun-3    0.25" cartridge tape, TAR format
AM29000SW/ASMS0708    ASM29K Toolkit          Sun-3    0.25" cartridge tape, TAR format
AM29000SW/ASMB0721    ASM29K Toolkit          Sun-3    9-track, 1600 BPI tape, TAR format
AM29000SW/ASMS0721    ASM29K Toolkit          Sun-3    9-track, 1600 BPI tape, TAR format
AM29000SW/ASMB1014    ASM29K Toolkit          PC-AT    3.5" DSHD floppies
AM29000SW/ASMS1014    ASM29K Toolkit          PC-AT    3.5" DSHD floppies
AM29000SW/ASMB1024    ASM29K Toolkit          PC-AT    5.25" DSHD floppies
AM29000SW/ASMS1024    ASM29K Toolkit          PC-AT    5.25" DSHD floppies
AM29000DC/ASM-99      ASM29K Documentation    UNIX     Not Media Specific
AM29000MNASM-07       ASM29K Maintenance      Sun-3    Not Media Specific
AM29000MNASM-10       ASM29K Maintenance      PC-AT    Not Media Specific
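The field layout above can be checked mechanically. The sketch below decodes only the software toolkit order numbers (those beginning `AM29000SW/ASM`), using the license, host, and media codes listed in the ordering information. The function name `decode_toolkit_order` and the returned dictionary keys are illustrative assumptions; documentation and maintenance order numbers use a different tail and are deliberately not handled.

```python
import re

# Code tables copied from the ordering information in this data sheet.
LICENSE = {"B": "Boxtop", "S": "Signed"}
HOST = {"07": "Sun-3", "10": "PC-AT"}
MEDIA = {
    "08": '0.25" Sun cartridge tape, TAR format',
    "14": '3.5" DSHD floppies',
    "21": "9-track, 1600 BPI mag tape, TAR format",
    "24": '5.25" DSHD floppies',
}

# Family "AM29000", category "SW/", identifier "ASM", then license, host, media.
ORDER_PATTERN = re.compile(r"AM29000SW/ASM([BS])([0-9]{2})([0-9]{2})")


def decode_toolkit_order(order_number: str) -> dict:
    """Split a software toolkit order number into its license, host, and media."""
    match = ORDER_PATTERN.fullmatch(order_number.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a software toolkit order number: {order_number!r}")
    license_code, host_code, media_code = match.groups()
    try:
        return {
            "license": LICENSE[license_code],
            "host": HOST[host_code],
            "media": MEDIA[media_code],
        }
    except KeyError as exc:
        raise ValueError(f"unknown code {exc.args[0]!r} in {order_number!r}") from None


if __name__ == "__main__":
    print(decode_toolkit_order("AM29000SW/ASMB0708"))
```

A well-formed number is not necessarily a valid combination; the table above remains the authority on what can actually be ordered.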


FUNCTIONAL INFORMATION
Assembler
The ASM29K assembler converts user-written Am29000 assembly code into relocatable object modules. It
produces standard COFF object modules that can be linked with other assembled or compiled modules. Its
advanced features permit the design of well-structured modules that are easily maintained.
The assembler processes Am29000 microprocessor instructions as defined in Chapter 8 of the Am29000
User's Manual. Each instruction mnemonic and register identifier is recognized in both upper and lower
case. Identifiers (that is, user-named variables) can have up to 63 characters, all of which are
significant. Integer, character, string, and floating-point constants are supported, as well as complex
expression analysis.
In addition to the Am29000 microprocessor instructions, the assembler supports a powerful macro
facility. Programmers can define macros with multiple parameters and direct macros to be repeated a
specified number of times. Macro code is inserted into the source code at the position of the macro
call. Macros may use local labels (labels that are visible only within the macro itself) to label an
instruction that can be copied several times throughout the program. Local labels are distinguished
from regular labels by using the format "$n", where n can be from one to six digits.
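As a small illustration of the local-label rule just stated, the sketch below tests whether a name has the "$n" form with one to six digits. It is a stand-alone check written for this description only; the names `LOCAL_LABEL` and `is_local_label` are hypothetical and say nothing about how the ASM29K assembler is implemented.

```python
import re

# "$" followed by one to six decimal digits, per the description above.
LOCAL_LABEL = re.compile(r"\$[0-9]{1,6}")


def is_local_label(name: str) -> bool:
    """Return True if `name` has the local-label form "$n" (n = 1 to 6 digits)."""
    return LOCAL_LABEL.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("$1", "$123456", "$1234567", "loop"):
        print(candidate, is_local_label(candidate))
```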
The assembler also provides a number of directives for organizing the code into efficient sections or
modules. Use of the include directive merges separate files during assembly. The section directive
assigns areas of code to named text, data, uninitialized memory, or initialized memory sections.
Conditional assembly is also supported. This feature allows the programmer to assemble code
conditionally, for example for debugging. The assembler directives are listed in Table 1.
The ASM29K software also produces a cross-reference table for symbols. Flags allow the programmer to
print listings that contain expanded macros, instructions not assembled due to conditional statements,
and symbol tables, and to insert user-specified headers into the listing.
The assembler optionally emits debug information for use with the XRAY29K™ source-level debugger. This
information allows the programmer to specify the symbolic names of variables and labels during
debugging sessions.
The wide selection of features available in the ASM29K assembler gives the user the latest tools to
produce well-structured and maintainable code.

Linker
The ASM29K linker integrates a group of separately compiled or assembled modules into a composite
module in which all references between modules are resolved. It processes and produces COFF modules,
including any module produced by a compiler in any language and any assembler that adheres to the
AMD-defined COFF and calling-convention standards. Incremental linking is also supported. The ASM29K
linker produces an extensive load map with an optional symbol cross-reference table.
Object module libraries are searched, with required modules automatically included. All code and data
sections are given absolute addresses as specified by the programmer. The linker provides options that
create ROMable programs, generate warnings for possible undefined external references, produce a global
cross-reference, and list defined symbols. Directives to the linker may be included in a file (batch
mode), on the command line, or in combination. Programmers can use the ASM29K linker to:
- Resolve external references between separately compiled or assembled modules.
- Assign absolute addresses.
- Direct section ordering.
- Perform incremental linking.
- Load only those library modules referenced, for efficient code space use.
- Optionally generate ROMable programs.
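Two of the tasks listed above, resolving external references and loading only the library modules that are referenced, can be pictured with a toy model. The sketch below treats each module as a pair of symbol sets (defined, referenced) and pulls library modules in until no further references can be satisfied. It is an assumption-laden illustration of the general technique, not the ASM29K linker's algorithm; the function `link` and the data layout are hypothetical.

```python
def link(modules: dict, library: dict):
    """Return (names of included modules, symbols left undefined).

    `modules` and `library` map a module name to a (defined, referenced)
    pair of symbol sets. Library modules are included only when they define
    a symbol that is still undefined, including symbols needed by library
    modules that were themselves pulled in.
    """
    included = dict(modules)
    defined = {sym for defs, _ in included.values() for sym in defs}

    def undefined_symbols():
        referenced = {sym for _, refs in included.values() for sym in refs}
        return referenced - defined

    changed = True
    while changed:
        changed = False
        missing = undefined_symbols()
        for name, (defs, refs) in library.items():
            if name not in included and defs & missing:
                included[name] = (defs, refs)   # pull in the library module
                defined |= defs
                changed = True
    return sorted(included), sorted(undefined_symbols())


if __name__ == "__main__":
    user_modules = {
        "main": ({"main"}, {"printf", "helper"}),
        "util": ({"helper"}, set()),
    }
    lib = {
        "stdio": ({"printf"}, {"write"}),
        "sys": ({"write"}, set()),
        "math": ({"sin"}, set()),
    }
    print(link(user_modules, lib))
```

In this example the unreferenced `math` module is never included, which is the code-space saving the list describes.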

Librarian
The ASM29K librarian is a management facility for organizing independently developed pieces of software
into logical units. It permits the addition, deletion, and replacement of object modules in one or more
libraries. The ASM29K librarian:
- Organizes and initializes modules into a library file.
- Lists library contents and information.
- Lists a library directory.

Table 1. Assembler Directives

Group                      Directive    Meaning
File Processing            .end         End of Assembly
                           .err         Generate Assembly Error
                           .ident       Specify Module Name
                           .include     Include Text File
Conditional Assembly       .else        Alternate Condition
                           .endif       End of Conditional Assembly Block
                           .if          Assemble if Value is Not Zero
                           .ifdef       Assemble if Identifier is Defined
                           .ifeqs       Assemble if Strings are Equal
                           .ifnes       Assemble if Strings are Not Equal
                           .ifnotdef    Assemble if Identifier is Not Defined
Listing Control            .eject       Advance to Top of Page
                           .lflags      Set Listing Flags
                           .list        Enable Listing
                           .nolist      Disable Listing
                           .print       Print to Standard Output
                           .sbttl       Set the Listing Subtitle
                           .space       Space N Lines
                           .title       Set the Listing Title
Symbol Declaration         .equ         Equate a Symbol to a Value (Unlimited Scope)
                           .extern      Declare Symbols as External to This Module
                           .global      Make Symbols Visible to Other Modules
                           .reg         Declare a Symbol as a Synonym for a Register
                           .set         Set a Symbol to a Value (Limited Scope)
Section Declaration        .comm        Declare a Common Symbol
                           .data        Use the .data Section
                           .dsect       Declare a Dummy Section
                           .lcomm       Declare a Local bss Symbol
                           .sect        Declare a New Section
                           .text        Use the .text Section
                           .use         Use a Declared Section
Data Storage Declaration   .align       Specify Byte Alignment
                           .ascii       Store the String
                           .block       Reserve Bytes
                           .byte        Initialize Bytes
                           .double      Initialize Double-Precision Values
                           .extend      Initialize Extended-Precision Values
                           .float       Initialize Single-Precision Values
                           .hword       Initialize Half-Words
                           .word        Initialize Words
Repeat Block               .endr        End of Repeat Block
                           .irep        Repeat for Each Item in the List
                           .irepc       Repeat for Each Character in the String
                           .rep         Repeat N Times
Macro Definition           .endm        End Macro Definition
                           .exitm       Terminate Macro Expansion
                           .macro       Macro Heading
                           .purgem      Purge All Macros Listed
High-Level Language        .def         Define Symbol Table Entry Directive
(HLL) Debugging            .dim         Dimensions of an Array Attribute
                           .endef       End of Symbol Definition Block Directive
                           .file        Source Filename Directive
                           .line        Source-File Line-Number Directive
                           .ln          HLL Source-File Line-Number Directive
                           .scl         Storage Class of a Symbol Attribute
                           .size        Size of a Symbol Attribute
                           .tag         Structure, Union, or Enumeration Identifier Attribute
                           .type        Basic and Derived Type of a Symbol Attribute
                           .val         Value of a Symbol Attribute

Floating-Point Emulation
The Am29000 microprocessor instruction set includes floating-point and integer math operations. In the
current processor implementation, these instructions cause traps to routines that perform the
operations. The user is provided with source to two complete sets of routines that emulate IEEE
Floating-Point Standard 754 for each of the instructions listed in Table 2.
The first set of routines is provided for users who have integrated an Am29027™ arithmetic accelerator
into their systems. The Am29000 microprocessor math instructions are emulated using the Am29027
co-processor.
The second set of routines implements emulation of the floating-point operations entirely in software.
No special hardware is required.
Documentation instructs users how to integrate the package into their target system. Both packages are
designed to ensure upward compatibility with next-generation processors.

Table 2. Arithmetic Instructions

Type                                         Mnemonic     Operation
Integer Arithmetic                           MULTIPLY     Signed Multiply
                                             MULTIPLYU    Unsigned Multiply
                                             DIVIDE       Signed Divide
                                             DIVIDEU      Unsigned Divide
Single-Precision Floating-Point Arithmetic   FADD         Single-Precision Add
                                             FSUB         Single-Precision Subtract
                                             FMUL         Single-Precision Multiply
                                             FDIV         Single-Precision Divide
Double-Precision Floating-Point Arithmetic   DADD         Double-Precision Add
                                             DSUB         Double-Precision Subtract
                                             DMUL         Double-Precision Multiply
                                             DDIV         Double-Precision Divide
Floating-Point Compare                       FEQ          Single Compare Equal To
                                             DEQ          Double Compare Equal To
                                             FGT          Single Compare Greater Than
                                             DGT          Double Compare Greater Than
                                             FGE          Single Compare Greater Than Or Equal To
                                             DGE          Double Compare Greater Than Or Equal To
Data Format Conversion                       CONVERT      Convert Data Format
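The trap-and-emulate arrangement described in this section can be pictured as a dispatch table keyed by mnemonic. The sketch below models a few of the single-precision entries of Table 2 on 32-bit IEEE 754 bit patterns, using the host's arithmetic to stand in for the emulation routines. Everything here is an illustrative assumption: the `trap` function, the handler table, and the use of Python booleans for compare results are not taken from the AMD routines, and overflow, NaN, and divide handling are omitted.

```python
import struct


def bits_to_float(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def float_to_bits(value: float) -> int:
    """Round to single precision and return the 32-bit pattern.

    struct.pack raises OverflowError if the value is too large for single
    precision; this sketch does not model overflow behavior.
    """
    return struct.unpack(">I", struct.pack(">f", value))[0]


# Hypothetical handler table: one routine per trapped mnemonic.
HANDLERS = {
    "FADD": lambda a, b: float_to_bits(bits_to_float(a) + bits_to_float(b)),
    "FSUB": lambda a, b: float_to_bits(bits_to_float(a) - bits_to_float(b)),
    "FMUL": lambda a, b: float_to_bits(bits_to_float(a) * bits_to_float(b)),
    "FEQ": lambda a, b: bits_to_float(a) == bits_to_float(b),
    "FGT": lambda a, b: bits_to_float(a) > bits_to_float(b),
    "FGE": lambda a, b: bits_to_float(a) >= bits_to_float(b),
}


def trap(mnemonic: str, a: int, b: int):
    """Dispatch a trapped instruction to its emulation routine."""
    handler = HANDLERS.get(mnemonic)
    if handler is None:
        raise NotImplementedError(f"no emulation routine for {mnemonic}")
    return handler(a, b)


if __name__ == "__main__":
    # 1.5 + 2.25 = 3.75, all exactly representable in single precision.
    print(hex(trap("FADD", 0x3FC00000, 0x40100000)))
```

Swapping the handler table is the point of the design: one table could call into a co-processor and another could compute in software, without changing the code that raises the trap.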

Hex Utilities
A set of hex utilities is provided to create hex files for downloading into target systems and for
creating ROM images. These tools convert AMD standard COFF files into Motorola® S-Record or Tektronix®
Extended Hex files. These hex utilities and a brief description of each are listed below.

• btoa       Converts a binary file into an ASCII file.
• coff2hex   Converts a COFF file into a hex file.
• sim29      ASM29K software architectural simulator.
• nm29       Prints name list of a COFF file.
• romcoff    Generates COFF file for ROM.
• cvcoff     Translates Am29000 microprocessor COFF files between big endian/little endian hosts.
• strpcoff   Strips symbolic information from a COFF file.
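For readers unfamiliar with the Motorola S-Record output mentioned above, the sketch below builds a single S1 data record (16-bit address) from raw bytes. It follows the publicly documented S-Record layout: a byte count, the address, the data, and a checksum that is the one's complement of the low byte of their sum. It is not the coff2hex utility and does no COFF parsing; the function name `s1_record` is hypothetical.

```python
def s1_record(address: int, data: bytes) -> str:
    """Format one Motorola S1 record for `data` loaded at a 16-bit `address`."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("S1 records carry a 16-bit address")
    if len(data) > 252:
        raise ValueError("at most 252 data bytes fit in one S1 record")
    count = 2 + len(data) + 1          # address bytes + data bytes + checksum byte
    body = bytes([count, address >> 8, address & 0xFF]) + bytes(data)
    checksum = ~sum(body) & 0xFF       # one's complement of the low byte of the sum
    return "S1" + body.hex().upper() + f"{checksum:02X}"


if __name__ == "__main__":
    print(s1_record(0x7AF0, bytes([0x0A, 0x0A, 0x0D]) + bytes(13)))
```

A complete S-Record file would normally also carry a header record and a termination record around the data records; those are left out of this sketch.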


WARRANTY and SUPPORT
Software Warranty
Software programs licensed by AMD are covered by the warranty and patent indemnity provisions appearing
in AMD's standard software license forms. AMD makes no warranty, express, statutory, implied or by
description, regarding the information set forth herein or regarding the freedom of the described
software program from patent infringement. AMD reserves the right to modify, change or discontinue the
availability of this software program at any time and without notice.

Customer Support
Maintenance
All orderable software products include one year of free Maintenance Support, which starts from the date
of original purchase. Maintenance Support allows customers to receive technical assistance from highly
trained field and factory personnel, to use a call-in on-line information system, and to receive product
and documentation updates at no additional charge. Customers may extend Maintenance Support in one-year
increments. Customers can access support services by calling the 24-hour, toll-free 29K™ Family hotline
at (800) 2929-AMD (292-9263).
On-Line Call-In Bulletin Board
In addition to the support engineering staff, AMD offers a 24-hour on-line technical support center. The
customer can call (800) 2929-AMD at any time to query the system for the latest information on a
particular product: bug fixes, work-arounds, information on upcoming releases, etc. Messages may be left
for the support engineering staff during "after hours."
Training Classes
AMD offers training classes for the 29K Family products. These classes focus on 29K Family system design
and implementation using the broad range of AMD software development tools. Customers can shorten the
development process through extensive hands-on training covering a variety of topics. Contact your local
AMD field office for more information on training classes.
Fusion29K Program
AMD encourages broad-based development and support for the Am29000 microprocessor with the Fusion29K™
program, a joint-effort program between AMD and third-party developers. Published twice a year, the
Fusion29K program catalog reveals the breadth of development and system solutions for the 29K Family,
including software generation and debug tools; hardware development tools; executive, kernel and
multi-user operating systems; board-level products; silicon products; and more. For a copy of the
Fusion29K program catalog, call your local AMD field sales office or the literature center at
(800) 222-9323.

HighC29K
Cross-Development Toolkit, Release 2

DISTINCTIVE CHARACTERISTICS
• Efficient, globally optimizing C compiler technology developed by MetaWare™, Inc. ANSI Standard C
  support and conformance verification (ANSI document X3J11/88-159, December 7, 1988) and compile-time
  error checking.
• Compiler supports load scheduling and delayed branch optimizations to promote fast Am29000™
  microprocessor code execution.
• Compiler supports AMD's Am29027™ Arithmetic Accelerator.
• Full ANSI standard run-time library of over 100 functions includes all standard I/O routines (stdio).
• Special library of high-performance transcendental functions.
• HighC29K™ toolkit includes the entire ASM29K™ Cross-Development Toolkit. The ASM29K package contains:
  - Relocatable macro assembler supports complete Am29000 microprocessor instruction set.
  - Linker/loader combines separately compiled or assembled modules by resolving external references
    and by searching libraries.
  - Librarian provides management facility for organizing modules into logical collections of functions.
  - Full architectural simulator of the Am29000 microprocessor with user-defined memory access times.
    Allows designers to obtain price/performance statistics for their particular Am29000 microprocessor
    design.
  - IEEE software floating-point emulation functions accessible from C and assembly language modules.
• Available for the PC-AT™ and Sun-3™ development environments.

GENERAL DESCRIPTION
Processor performance depends on the processor's
hardware and software environment. The key to maximizing performance lies in recognizing that the processor is part of a system: a collection of components that must be properly integrated. To take advantage of the advanced RISC architecture of the
Am29000 microprocessor, equally sophisticated software tools must be available to achieve this integration.
The HighC29K™ Cross-Development Toolkit offers
such a development environment for creating efficient
and portable software for the 29K™ Family. The package consists of a full ANSI standard optimizing C
compiler, run-time libraries, assembler, linking loader,
floating-point emulation, and object module librarian.
These tools allow users to design more efficient systems and applications.
Cross-development is the design of an application program on one computer (the host system) and the execution of that same application program on a different
computer (the target system). The operating system on
the host, such as UNIX or DOS, provides the tools
needed to create the application program. These tools
include editors for writing the source code, compilers
and assemblers for translating the modules into executable code, and utilities for preparing the application for
execution. The Am29000-based target computer generally does not provide the tools required to develop the
application program. Figure 1 shows the path that an
application follows from development on the host system to execution on the target system.
The HighC29K Cross-Development Toolkit transforms
a PC or Sun workstation host into a powerful software
development environment. The HighC29K cross-compiler generates 29K Family relocatable object modules
which can be combined with other relocatable object
modules derived from the assembler or HighC29K compiler using the 29K Family linker/loader. Library modules prepared by the librarian can be linked in at this
point as well. The resulting absolute object module can
then be downloaded to a target system.
AMD has established and published the 29K Family
Common Object File Format (COFF) to which all 29K
Family development tools conform. The AMD COFF
format extends the already standard AT&T COFF format to support source-level debugging and other 29K
Family-specific features. Similarly, AMD has established a common calling convention that maximizes
performance on the 29K Family of microprocessors and
defines standards for software vendors. This
has led to a variety of compilers, assemblers, debuggers, and associated tools that may be mixed freely by
developers of 29K Family software.

Publication # 10957    Rev. B    Amendment /0
Issue Date: September 1989
The contents of the HighC29K Cross-Development
Toolkit include:

HighC29K:
• Optimizing C Compiler
• Documentation
• Function Libraries

ASM29K (included in HighC29K Development Package):
• Relocatable Macro Assembler
• Documentation
• Architectural Simulator
• Linker/Loader
• Librarian
• IEEE Floating Point Emulation Routines
• Utilities

[Figure 1. Cross Software Development. Diagram labels: Host Computer; Target Computer (Am29000 Microprocessor); transfer via On-Board Monitor or ADAPT29K Debugger.]


ORDERING INFORMATION
Licensing

The HighC29K Cross-Development Toolkit is licensed
through AMD's Standard End-User Software License
Agreement (Boxtop). This license does not require a
signature; breaking the seal on the software package indicates acceptance of the license terms. If changes are
required to the license agreement, they can be arranged through your AMD sales representative. Many
software products require the customer to provide a
CPU ID number when ordering the product. Contact
your sales representative if this information is not available at time of purchase. In addition, terms of the license require the customer to complete a Software
Warranty card with the serial number and site of the
host computer on which the development package will
reside. This card must be returned to AMD within 30
days of receipt for the warranty to be valid.

Order Numbers

The HighC29K Cross-Development Toolkit is available
for several different environments. Documentation can
be ordered separately. The order number (Valid Combination) is formed as a combination of:
• Product Family
• Product Category
• Product Identifier
• License Type
• Host/OS Type
• Media Type

Order number format: AM29000 SW/ HCC B ## ##

Field               Codes
Product Family      AM29000 = Am29000 Microprocessor
Product Category    SW/ = Software Product
                    DC/ = Documentation Product
                    MA/ = Maintenance Agreement
Product Identifier  HCC = HighC29K Cross-Development Toolkit
License Type        B = Boxtop
                    S = Signed
                    "-" = Not Applicable
Host/OS Type        07 = Sun-3
                    10 = PC-AT
                    99 = Not Host Specific
Media Type          08 = 0.25" Sun cartridge tape, TAR format
                    14 = 3.5" DSHD floppies
                    21 = 9-track, 1600 BPI mag tape, TAR format
                    24 = 5.25" DSHD floppies


Valid Combinations
Valid Combinations list configurations planned to be supported in volume for this device. Consult the local AMD sales
office to confirm availability of specific valid combinations and to check on newly released combinations.

Order Number         Product                  Host               Media
AM29000SW/HCCB0708   HighC29K Toolkit         Sun-3              0.25" cartridge tape, TAR format
AM29000SW/HCCS0708   HighC29K Toolkit         Sun-3              0.25" cartridge tape, TAR format
AM29000SW/HCCB0721   HighC29K Toolkit         Sun-3              9-track, 1600 BPI tape, TAR format
AM29000SW/HCCS0721   HighC29K Toolkit         Sun-3              9-track, 1600 BPI tape, TAR format
AM29000SW/HCCB1014   HighC29K Toolkit         PC-AT              3.5" DSHD floppies
AM29000SW/HCCS1014   HighC29K Toolkit         PC-AT              3.5" DSHD floppies
AM29000SW/HCCB1024   HighC29K Toolkit         PC-AT              5.25" DSHD floppies
AM29000SW/HCCS1024   HighC29K Toolkit         PC-AT              5.25" DSHD floppies
AM29000DC/HCC-99     HighC29K Documentation   Not Host Specific  Not Media Specific
AM29000MA/HCC-07     HighC29K Maintenance     Sun-3              Not Media Specific
AM29000MA/HCC-10     HighC29K Maintenance     PC-AT              Not Media Specific
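
The order-number scheme is a plain concatenation of fixed-width fields, which the following C sketch illustrates. The function name `format_order_number` and the convention of passing a negative code for an absent field are assumptions made for this example, and the category codes are assumed to be written with a slash (SW/, DC/, MA/). It is not an AMD tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build an order number from its fields (illustrative helper, not an AMD API).
   A negative host or media code means the field is absent, as in the
   documentation and maintenance entries.
   Returns the length written, or -1 if the buffer is too small. */
int format_order_number(char *buf, size_t size,
                        const char *category,  /* "SW/", "DC/" or "MA/" */
                        const char *product,   /* e.g. "HCC" */
                        char license,          /* 'B', 'S' or '-' */
                        int host, int media)
{
    int used, n;

    /* Category and product identifier are three characters each. */
    assert(strlen(category) == 3 && strlen(product) == 3);

    used = snprintf(buf, size, "AM29000%s%s%c", category, product, license);
    if (used < 0 || (size_t)used >= size)
        return -1;

    if (host >= 0) {                     /* two-digit Host/OS code */
        n = snprintf(buf + used, size - (size_t)used, "%02d", host);
        if (n < 0 || (size_t)n >= size - (size_t)used)
            return -1;
        used += n;
    }
    if (media >= 0) {                    /* two-digit media code */
        n = snprintf(buf + used, size - (size_t)used, "%02d", media);
        if (n < 0 || (size_t)n >= size - (size_t)used)
            return -1;
        used += n;
    }
    return used;
}
```

Calling it with category "SW/", product "HCC", license 'B', host 7 and media 8 yields the Sun-3 cartridge-tape Boxtop entry from the table.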

FUNCTIONAL INFORMATION

Compiler
The HighC29K cross-compiler supports an extended
version of the C language designed for professional
programmers. It includes a full ANSI implementation for
portable applications, yet also allows user access to the
best features of other languages such as nested functions from Pascal and named parameter association
from Ada. Extensions to the C language also are supported, such as range notation in case statements and
enumerated data types. The compiler allows users to
create re-entrant procedures and to generate efficient
code in terms of space and execution speed.
The HighC29K cross-compiler facilitates program development for dedicated or stand-alone Am29000 designs. The compiler generates optimized, sharable
code that takes full advantage of the Am29000 instruction set. The language contains a variety of control
statements, data types, and predeclared procedures
and functions that promote the development of well-structured programs. For example, the user may specify
the parameter types for external functions so that the
compiler can check that arguments are passed correctly.
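
As a small illustration of this argument checking, the sketch below declares a function with an ANSI prototype before it is called. The names `scale_sample` and `scaled_three` are hypothetical and exist only for this example. With the prototype in scope, any ANSI C compiler converts or rejects mismatched arguments at compile time.

```c
#include <assert.h>

/* Prototype: states the number and types of the parameters, so a call
   such as scale_sample("x", 2) is diagnosed at compile time. */
long scale_sample(long sample, int shift);

/* Hypothetical helper, defined here only so the example links. */
long scale_sample(long sample, int shift)
{
    assert(sample >= 0 && shift >= 0 && shift < 16);
    return sample << shift;
}

/* The int constant 3 is converted to long because the prototype is visible. */
long scaled_three(void)
{
    return scale_sample(3, 4);
}
```

Without the prototype, the same call would pass 3 as an int with no conversion and no diagnostic.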
The HighC29K cross-compiler generates 29K Family
object modules directly. The HighC29K compiler optionally generates information necessary for symbolic debugging at the C or assembly level with XRAY29K™,
AMD's source-level debugger for the 29K Family. The
compiler preprocessor allows the user to define macros,
merge files into source and conditionally include or exclude code.

Optimization
As a highly optimizing cross-compiler, HighC29K software ensures the generation of fast, compact code by
using advanced optimization techniques including common subexpression elimination, loop invariant analysis,
global register allocation and automatic allocation of
variables to registers. Many of the optimizations are
particularly effective when using the unique features of
the Am29000 microprocessor architecture. For example, its large register set means passing parameters
in registers is more effective on the Am29000 microprocessor than on any other microprocessor. Optimizations
specifically developed for the Am29000 RISC microprocessor architecture are also performed, such as load
scheduling for maximum instruction throughput. Additionally, the compiler makes extensive use of the Am29000
microprocessor's large register file as a stack cache to
store frequently accessed values. The list of optimizations performed includes:
• Common subexpression elimination
• Retention/reuse of register contents
• Automatic allocation of variables to registers
• Dead code elimination and cascaded jumps
• Cross jumping (tail merging)
• Constant folding
• Switch statements optimally encoded using in-line branch table, binary search or linear search
• Global flow analysis leading to removal of loop invariant values
• Load scheduling
• Delayed branch
Several of these optimizations are explained below:
Loop Invariant Analysis: Computations made inside
of loops that do not change value in the loop can be
moved outside the loop. The value is stored in a register
for optimum access. Since an application may spend as
much as 90% of its time executing loops, this optimization produces a significant gain in performance.
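
The effect of loop invariant analysis can be pictured at the source level. The two functions below are a hand-written sketch with hypothetical names. The first is the code as a programmer might write it, and the second shows the form an optimizer could derive from it. The actual transformation is performed by the compiler on its internal representation, not on the source text.

```c
#include <assert.h>
#include <stddef.h>

/* As written: scale * offset is recomputed on every iteration even though
   neither operand changes inside the loop. */
void fill_naive(long *out, size_t n, long scale, long offset)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = (long)i + scale * offset;
}

/* After hoisting: the invariant product is computed once and held in a
   local variable that the compiler can keep in a register. */
void fill_hoisted(long *out, size_t n, long scale, long offset)
{
    const long bias = scale * offset;   /* loop-invariant value */
    size_t i;

    assert(out != NULL || n == 0);
    for (i = 0; i < n; i++)
        out[i] = (long)i + bias;
}
```

Both functions store the same values; only the number of multiplications executed differs.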

Fold Constants: Operands that are constant can often
be folded into a single constant, or into a temporary
value. If constants are defined at compile time, the
compiler can reduce them to a single value.

Load Scheduling: The Am29000 microprocessor supports overlapped load and store capabilities to decrease
delays incurred while waiting for data. The compiler
recognizes when certain instructions can be advanced
in the pipeline for efficient operation.

Delayed Branch: The Am29000 microprocessor
branch instruction is delayed by one cycle to allow the
processor pipeline to achieve maximum throughput.
The instruction following the branch instruction, called
the delay instruction, is executed whether the branch
is successful or not. In most cases, the compiler can
easily place a useful instruction, i.e. an instruction other
than NO-OP, as the delay instruction by reorganizing
the code.

Data Types
The single addressing mode of the Am29000 microprocessor combines with high-level language implementations to provide efficient access to all data types.

Data Type        Size (Bits)
int              32
long int         32
pointer          32
short int        16
char             8
float            32
double           64
unsigned         32
unsigned char    8
unsigned short   16
enum (default)   32
enum (option)    8, 16, 32

Am29027 Arithmetic Accelerator Support
Target systems that include the Am29027 Arithmetic
Accelerator for high-speed computations are directly
supported through the compiler. Users may direct the
compiler to generate in-line code to access the control
and instruction registers of the accelerator. Versions of
the libraries that assume direct use of the Am29027
microprocessor are included.
Alternatively, the user can signal the compiler to generate Am29000 microprocessor floating-point instructions
that are used in conjunction with the IEEE Floating-Point Emulation Routines to access the accelerator.
The HighC29K Cross-Development Toolkit includes
AMD's entire ASM29K Cross-Development Toolkit. Details of this package are contained in the ASM29K
Cross-Development Toolkit data sheet (order #10292).

Function Libraries
The HighC29K toolkit includes three different sets of
function libraries that enhance the functionality of the
compiler. The library sets comprise:
• the ANSI standard library, which provides the full set of functions specified by the ANSI C language standard
• a library of hand-coded transcendental functions optimized for use with the Am29000/Am29027 microprocessor combination
• a library of routines implementing the floating-point environment functions specified in the IEEE-754 standard

Each library set contains several versions of the library
which reflect the different possible target environments.
The compiler driver is able to select the proper version
of the library to use based on the compile-time options
specified.

ANSI Standard Library
This library contains the full functionality specified by
the ANSI standard for the C language (X3J11/88-159,
December, 1988). At the lowest level, the library functions interface with HIF (Host Interface), a small kernel
system defined by AMD. HIF is supported in all AMD
products, and is defined in the HighC29K toolkit manual
for the customer who needs to adapt to a different environment.
The functions included in the ANSI Standard Library
are:
Mathematical Routines
atan2  abs  ceil  fabs  floor  log  log10  sinh  frexp  exp  ldexp  pow
sin  tanh  modf  tan  atan  sqrt  asin  cosh  acos  cos  fmod

Memory Allocation
calloc  free  malloc  realloc

Standard Formatted I/O
fprintf  printf  sprintf  vfprintf  vsprintf  fscanf  sscanf  vprintf
_setmode  scanf

Standard File I/O
fopen  fclose  fflush  freopen  remove  rename  setbuf  setvbuf
tmpfile  tmpnam

Character Routines
isalnum  iscntrl  isgraph  isxdigit  toupper  isalpha  ispunct  isupper
tolower  isprint  isdigit  isspace  islower

Character I/O Routines
fgetc  fputc  getc  ungetc  fgets  fputs  gets  getchar  putchar  putc  puts

String Routines
memchr  strcat  _strncat  memcmp  strxfrm  memcpy  _rmemcpy  memmove
_rstrcpy  memset  _strcats  strncpy  strerror  strlen  strncat  strncmp
strcspn  strchr  strcmp  strcoll  strcpy  strtok  strpbrk  strrchr
strspn  strstr

Direct I/O Routines
fgetpos  fread  fseek  rewind  fsetpos  ftell  fwrite

General Routines
abort  atol  strtoul  atexit  srand  system  exit  strtod  mblen  qsort
getenv  bsearch  atoi  wcstombs  strtol  mbstowcs  labs  div  atof
wctomb  rand  mbtowc  ldiv  onexit

Date and Time Routines
asctime  ctime  gmtime  strftime  time  clock  localtime  mktime  difftime

Miscellaneous Routines
assert  ferror  localeconv  perror  setjmp  signal  va_end  clearerr
kill  longjmp  raise  setlocale  va_arg  va_start  feof

Floating-Point Environment Library
The functions included in the Floating-Point Environment Library are:
class  rclass  copysign  rcopysign  finite  rfinite  isnan  risnan
logb  rlogb  nextafter  rnextafter  remainder  rremainder  scalb  rscalb
unordered  runordered
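
To give a feel for what such environment functions compute, here is a minimal, portable C sketch of four of them (isnan, finite, copysign, unordered) written directly against the IEEE-754 double-precision bit layout. The `sketch_` names are hypothetical. This is not the HighC29K library source, and it assumes the host represents `double` as IEEE-754 binary64.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as its 64-bit pattern (assumes IEEE-754 binary64). */
static uint64_t bits_of(double x)
{
    uint64_t u;
    assert(sizeof u == sizeof x);
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Inverse of bits_of. */
static double double_of(uint64_t u)
{
    double x;
    memcpy(&x, &u, sizeof x);
    return x;
}

/* isnan: exponent field all ones and a nonzero fraction. */
int sketch_isnan(double x)
{
    uint64_t u = bits_of(x);
    return ((u >> 52) & 0x7FF) == 0x7FF &&
           (u & 0x000FFFFFFFFFFFFFULL) != 0;
}

/* finite: exponent field not all ones (neither infinity nor NaN). */
int sketch_finite(double x)
{
    return ((bits_of(x) >> 52) & 0x7FF) != 0x7FF;
}

/* copysign: magnitude of x combined with the sign bit of y. */
double sketch_copysign(double x, double y)
{
    const uint64_t sign = (uint64_t)1 << 63;
    return double_of((bits_of(x) & ~sign) | (bits_of(y) & sign));
}

/* unordered: true when either operand is a NaN. */
int sketch_unordered(double x, double y)
{
    return sketch_isnan(x) || sketch_isnan(y);
}
```

Working on the bit pattern keeps the sketch free of any floating-point arithmetic, which is also how a software emulation layer typically classifies operands.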
Fast Transcendental Library
This library provides special hand-coded versions of the
standard transcendental functions. These functions are
optimized for performance with the Am29000/Am29027
microprocessor combination.
The functions included are:
atan  cos  exp  sin  sqrt  tan  log  pow

Floating-Point Emulation
The Am29000 microprocessor's instruction set includes
floating-point and integer math operations. In the simplest processor implementation, these instructions
cause traps to routines that perform the operations. The
user is provided with source to two complete sets of
routines that emulate IEEE Floating-Point Standard 754
for each of the instructions listed below.
The first set of trap handlers is provided for users who
have integrated the Am29027 arithmetic accelerator
into their systems. The Am29000 microprocessor math
instructions are performed using the Am29027 microprocessor.
The second set of trap handlers implements emulation
of the floating-point operations entirely in software. No
special hardware is required.
Documentation instructs users how to integrate the
package into their target system. Both packages are
designed to ensure upward compatibility with future
generation processors. The floating-point routines are
accessible from both the assembler and compiler.
To eliminate the overhead incurred by using the trap
handlers, direct code generation (in-line coding) of
Am29027 microprocessor floating-point operations is
an included option of the HighC29K Cross-Development Toolkit.
Am29000 Microprocessor Floating-Point Instructions

Mnemonic   Operation
CONVERT    Convert values between types Integer, Float, and Double
FEQ        Compare Floats Equal
DEQ        Compare Doubles Equal
FGT        Compare Floats Greater Than
DGT        Compare Double Greater Than
FGE        Compare Floats Less Than
DGE        Compare Double Less Than
FADD       Float Add
DADD       Double Add
FSUB       Float Subtract
DSUB       Double Subtract
FMUL       Float Multiply
DMUL       Double Multiply
FDIV       Float Divide
DDIV       Double Divide

Utilities
A set of utilities is provided to work with the output files
produced by the development tools. They allow the user
to prepare output files for downloading into target systems and to create ROM images. The utilities include:
• coff2hex: Converts Am29000 microprocessor COFF
files to Motorola® S-record or Extended Tektronix®
Hex Files.
• romcoff: Allows creation of ROM images from
Am29000 microprocessor COFF files.
• cvcoff: Translates Am29000 microprocessor COFF
files between big endian/little endian hosts.
• strpcoff: "Strips" symbolic information from an executable COFF file.
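
For readers unfamiliar with the S-record format that coff2hex can produce, the sketch below formats a single Motorola S3 record (the variant with a 32-bit address). The function name `srec_format_s3` is hypothetical and the code is not taken from coff2hex; it only follows the publicly documented record layout.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Format one Motorola S3 record into out.
   Layout: "S3", byte count, 4 address bytes, data bytes, checksum, all in
   hex. The count covers address + data + checksum. The checksum is the
   ones' complement of the low byte of the sum of the count, address and
   data bytes.
   Returns the number of characters written, or -1 on bad arguments. */
int srec_format_s3(char *out, size_t size, unsigned long addr,
                   const unsigned char *data, size_t len)
{
    unsigned long sum;
    unsigned count;
    size_t i;
    int pos;

    assert(out != NULL);
    /* The count must fit in one byte; 15 = fixed characters plus NUL. */
    if (len > 250 || size < 15 + 2 * len)
        return -1;

    addr &= 0xFFFFFFFFUL;
    count = (unsigned)len + 5;              /* 4 address bytes + checksum */
    sum = count + (addr >> 24) + ((addr >> 16) & 0xFF)
                + ((addr >> 8) & 0xFF) + (addr & 0xFF);

    pos = sprintf(out, "S3%02X%08lX", count, addr);
    for (i = 0; i < len; i++) {
        sum += data[i];
        pos += sprintf(out + pos, "%02X", (unsigned)data[i]);
    }
    pos += sprintf(out + pos, "%02X", (unsigned)(~sum & 0xFF));
    return pos;
}
```

Because every record carries its own load address and byte count, a loader needs no further parameters to place the data in memory.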


MAINTENANCE AND SUPPORT

Software Warranty
Software programs licensed by AMD are covered by the
warranty and patent indemnity provisions appearing in
AMD's standard Software License Forms. AMD makes
no warranty, express, statutory, implied or by description regarding the information set forth herein or regarding the freedom of the described software program from
patent infringement. AMD reserves the right to modify,
change or discontinue the availability of this software
program at any time and without notice.

Support
Customer Support
All orderable software products include one year of free
maintenance support, which starts from the date of
original purchase. Maintenance support allows customers to receive technical assistance from highly trained
field and factory personnel, to use a call-in on-line
information system and to receive product and documentation updates at no additional charge. Customers
may extend maintenance support in one-year
increments. Customers can access support services
by calling the 24-hour, toll-free 29K Family hotline at
(800) 2929-AMD (292-9263).

On-Line Call-In Bulletin Board
In addition to the support engineering staff, AMD offers
a 24-hour on-line technical support center. The customer can call (800) 2929-AMD at any time to query the
system for the latest information on a particular product:
bug fixes, work-arounds and information on upcoming
releases. Messages may be left for the support engineering staff during "after hours."

Training Classes
AMD offers training classes for the 29K Family products. These classes focus on 29K Family system design
and implementation using the broad range of AMD
software development tools. Customers can shorten
the development process through extensive hands-on
training covering a variety of topics. Contact your local
AMD field sales office for more information on training
classes.

Fusion29K Program
AMD encourages broad-based development and support for the Am29000 with the Fusion29K™ program, a
joint-effort program between AMD and third-party
developers. A bi-annual Fusion29K program catalog
reveals the breadth of development and system
solutions for the 29K Family, including software
generation and debug tools; hardware development
tools; executive, kernel and multi-user operating
systems; board-level products; silicon products; and
more. For a copy of the Fusion29K program catalog, call
your local AMD field sales office or the literature center
at (800) 222-9323.

MON29K


Advanced
Micro
Devices

MON29K
Target Resident Debug Monitor
DISTINCTIVE CHARACTERISTICS
• Provides local control of an Am29000™ microprocessor-based system
• Interfaces to the XRAY29K™ Source-Level Debugger
• Allows modification and display of memory, registers and I/O ports
• Supports modification and display of special-purpose registers by group
• Allows access to both user- and system-level code
• Supports the AMD Am29027™ Arithmetic Accelerator
• Allows modification and display of Am29027 microprocessor registers
• Provides eight breakpoints plus single- and multiple-instruction stepping
• Allows selection of user-defined displays after each breakpoint or single step
• Provides in-line assembler and disassembler
• Supports downloading of COFF and hex files from remote systems
• Provided in source form (C and Am29000 microprocessor assembly) to simplify installation of I/O devices
• Offers familiar user interface, similar to DEBUG on IBM® PC

GENERAL DESCRIPTION
The Target Resident Debug Monitor (MON29K™)
resides on Am29000 microprocessor-based hardware.
It provides all the control a designer needs to load,
execute and debug Am29000 microprocessor
programs. MON29K software is provided in source form
so its I/O drivers and service routines can be modified
easily, which allows MON29K software to be
customized for various hardware configurations.
MON29K software provides the ability to set
breakpoints, to set and display memory and registers, to
read and write I/O ports, to trace execution in single or
multiple steps, and to download files from a remote
host. MON29K software is controlled by either an ASCII
terminal or a host computer connected to a serial port
on the target system.
MON29K software supports high-level language
debugging through XRAY29K, the Am29000
microprocessor source-level debugger. In addition to its
own standard command set, the XRAY29K debugger
supports all the MON29K software commands.
The MON29K product includes:
MON29K source code
Documentation

Publication #    Rev. B    Amendment /0
Issue Date: September 1989


ORDERING INFORMATION

Licensing

The MON29K Resident Monitor is licensed through
AMD's Standard End-User Software License
Agreement (Boxtop). This license does not require a
signature; breaking the seal on the product package
indicates acceptance of the license terms. If changes
are required to the license agreement, they can be
arranged through your AMD sales representative. Many
software products require the customer to provide a
CPU ID number when ordering the product. Contact
your sales representative if this information is not
available at the time of purchase. In addition, terms of
the license require the customer to complete a Software
Warranty card with the serial number and site of the
host computer on which the resident monitor source will
reside. This card must be returned to AMD within 30
days of receipt for the warranty to be valid.

Order Numbers

MON29K software executes on Am29000
microprocessor-based systems but is distributed in
machine readable source form for several hosts. Thus,
media type is the only distinguishing characteristic
when ordering MON29K software. Documentation can
be ordered separately. The order number (Valid
Combination) is formed as a combination of:

• Product Family
• Product Category
• Product Identifier
• License Type
• Host/OS Type
• Media Type

Order number format: AM29000 SW/ MON B ## ##

Field               Codes
Product Family      AM29000 = Am29000 Microprocessor
Product Category    SW/ = Software Product
                    DC/ = Documentation Product
                    MA/ = Maintenance Agreement
Product Identifier  MON = MON29K Target Resident Debug Monitor
License Type        B = Boxtop
                    S = Signed
                    "-" = Not Applicable
Host/OS Type        99 = Not Host Specific
Media Type          08 = 0.25" cartridge tape, TAR format
                    14 = 3.5" DSHD floppies
                    21 = 9-track, 1600 BPI mag tape, TAR format
                    24 = 5.25" DSHD floppies


Valid Combinations
Valid Combinations lists configurations planned to be supported in volume for this device. Consult the local AMD
sales office to confirm availability of specific valid combinations and to check on newly released combinations.

Part Number          Product                   Host               Media
AM29000SW/MONB9908   MON29K Resident Monitor   Not Host Specific  0.25" cartridge tape, TAR format
AM29000SW/MONS9908   MON29K Resident Monitor   Not Host Specific  0.25" cartridge tape, TAR format
AM29000SW/MONB9914   MON29K Resident Monitor   Not Host Specific  3.5" DSHD floppies
AM29000SW/MONS9914   MON29K Resident Monitor   Not Host Specific  3.5" DSHD floppies
AM29000SW/MONB9921   MON29K Resident Monitor   Not Host Specific  9-track, 1600 BPI tape, TAR format
AM29000SW/MONS9921   MON29K Resident Monitor   Not Host Specific  9-track, 1600 BPI tape, TAR format
AM29000SW/MONB9924   MON29K Resident Monitor   Not Host Specific  5.25" DSHD floppies
AM29000SW/MONS9924   MON29K Resident Monitor   Not Host Specific  5.25" DSHD floppies
AM29000DC/MON-99     MON29K Documentation      UNIX               Not Media Specific
AM29000MA/MON-99     MON29K Maintenance        Not Host Specific  Not Media Specific

FUNCTIONAL DESCRIPTION
MON29K software resides on the target system and
interfaces to the user through an ASCII terminal
connected to a serial port on the target system. All
commands and formatted displays are communicated
through this serial link. MON29K software supports
simple display formats so that compatibility can be
maintained with any CRT.
MON29K software provides program development
support at the assembler source level. High-level
source code development is provided by the XRAY29K
debugger when it is connected to MON29K monitor.
MON29K serves as the target resident monitor that
interrogates memory and registers for the host-resident
source-level debugger.

Memory, Register and I/O Addresses
MON29K software supports three address spaces:
register, memory, and I/O. Data values are always
represented in hex, as are memory and I/O addresses.
Register addresses are represented by decimal
numbers and grouped as general, local, global, special-purpose, and TLB. Special-purpose and TLB registers
can be accessed by register number or by their
abbreviated mnemonic. The Special-Purpose Registers
section that follows discusses other commands for
accessing these registers.
Memory and I/O addresses are assumed to be real
because MON29K software has no mechanism for
calculating or interpreting virtual addresses. MON29K
software allows specification of user and supervisor
modes and specification of OPT lines with all memory
and I/O addresses.

Displaying Memory and Registers
The Display command shows data for a specified range
of addresses, beginning at a specified address or from
the currently active address. Each line in the display
contains 16 bytes of data. The 16 bytes are displayed
as either bytes, half-words, words, single-precision, or
double-precision floating points, depending on the
command entered.
Floating-point numbers are displayed in decimal format
if the value can be represented accurately within the
digits available. Otherwise, scientific notation, E format,
is used.
Following the numeric data is a string of ASCII
characters in which each character corresponds to one
byte of data. When no ASCII equivalent exists for the
byte of data, a period is displayed. Figure 1 shows
examples of memory and register displays.
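
The byte-display format described above (an address, 16 bytes in hex, then an ASCII column with a period for non-printable bytes) can be sketched in C as follows. The function name `format_dump_line` and the exact spacing are assumptions made for this example; the code is not taken from the MON29K source.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stddef.h>

/* Format one display line: 8-digit hex address, up to 16 data bytes in
   hex, then the ASCII column with '.' for bytes that have no printable
   equivalent. out must hold at least 80 characters.
   Returns the length of the line. */
int format_dump_line(char *out, unsigned long addr,
                     const unsigned char *data, size_t len)
{
    size_t i;
    int pos;

    assert(out != NULL && len <= 16);
    pos = sprintf(out, "%08lX ", addr & 0xFFFFFFFFUL);
    for (i = 0; i < 16; i++) {
        if (i < len)
            pos += sprintf(out + pos, " %02x", (unsigned)data[i]);
        else
            pos += sprintf(out + pos, "   ");   /* pad a short last line */
    }
    pos += sprintf(out + pos, "  ");
    for (i = 0; i < len; i++)
        out[pos++] = isprint(data[i]) ? (char)data[i] : '.';
    out[pos] = '\0';
    return pos;
}
```

Feeding it the 16 bytes 61 00 62 00 ... 68 00 at address 0x10000 produces a line whose ASCII column reads a.b.c.d.e.f.g.h., matching the pattern of the memory display in Figure 1.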

Altering Memory and Registers
Memory and register contents can be set, filled, or
moved. The set command allows the contents of
registers and memory to be examined and optionally
changed. One or more values can be set without
examining the previous contents. The fill command sets
a range of register or memory addresses to a specific
value. The move command copies blocks of data from
one range of addresses to another. Blocks in the
destination address range may overlap blocks in the
source address range.
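
Allowing the destination range to overlap the source range means the copy direction matters. The sketch below shows the idea in portable C with hypothetical function names. The first version relies on the standard `memmove`, and the second spells out the direction choice by hand. Neither is the MON29K implementation.

```c
#include <assert.h>
#include <string.h>

/* Copy len bytes from offset src to offset dst within mem; the ranges may
   overlap. memmove is defined to handle overlapping ranges correctly. */
void move_block(unsigned char *mem, size_t dst, size_t src, size_t len)
{
    assert(mem != NULL);
    memmove(mem + dst, mem + src, len);
}

/* Same result written out by hand, to show the direction choice. */
void move_block_manual(unsigned char *mem, size_t dst, size_t src, size_t len)
{
    size_t i;

    if (dst < src) {
        for (i = 0; i < len; i++)        /* forward copy */
            mem[dst + i] = mem[src + i];
    } else {
        for (i = len; i > 0; i--)        /* backward copy */
            mem[dst + i - 1] = mem[src + i - 1];
    }
}
```

A naive forward copy with the destination above the source would overwrite source bytes before they are read, which is why the backward branch is needed.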

Special-Purpose Registers
The special-purpose register commands provide
another method for accessing the Am29000
microprocessor special-purpose and TLB registers.
These registers are organized into groups:
Unprotected, Protected, TLB Entries, and Coprocessor.
Specific commands are used for examining the
contents of registers in each group. Within a group,
each register's contents can be examined or changed
explicitly.
The large number of registers necessitates special
register display screens that clearly present each
group's registers. To enhance display efficiency, the
single command X is available. It displays the registers
most likely to be in use: all the global registers, half the
local registers, and all the unprotected registers.
Figures 2 and 3 show examples of special-purpose
register display screens.

In-Line Assembler/Disassembler
An in-line assembler/disassembler allows the user to
examine and change memory using instruction
mnemonics rather than hex values. This improves
readability and minimizes user efforts while entering
changes to instruction memory. The lexical conventions
and statement syntax used are identical to the standard
AMD assembler, ASM29K™.

I/O Commands
I/O commands provide simple forms of input and output.
They are intended to allow quick examination and
simple control of devices. These commands read or
write a full word of data to or from a real I/O address.

#dw LR4, LR11
LR004 61006200 63006400 65006600 67006800
LR008 69006a00 6b006c00 6d006e00 6f007000

#
#
# DB 100001, 1001FI
000100001 61 00 62 00 63 00 64 00 65 00 66 00 67 00 68 00 a.b.c.d.e.f.g.h.
000100101 69 00 6a 00 6b 00 6c 00 6d 00 6e 00 6f 00 70 00 i.j.k.l.m.n.o.p.

Figure 1. Register and Memory Display

[Figure 2. Protected Register Group Display. Screen output of the xp command, showing the individual fields of the VAB, OPS, CPS, CFG, CHA, CHD, CHC, RBP, TCV, TR, PC0, PC1, PC2 and MMU registers.]

Downloading

Program execution can be stepped one instruction at a
time or a group of instructions at a time. User-defined
displays and the address and contents of the next
executable instruction are displayed after each
instruction step. When stepping by group, these
displays can be delayed either until after the last
instruction in the group is executed, or until after each
instruction is executed. An option allows only register
data that was changed to be displayed. This
automatically informs the user of register changes, thus
eliminating the need to visually monitor register
contents.

Downloading controls the transmission of data from a
remote system to the local memory on the target
system. MON29K software can read COFF binary,
Motorola S3 hex records, and TEK extended hex files.
Each of these formats contains the address and byte
count in-formation for loading memory, so no other
parameters need to be specified.
An optional downloading parameter, ,
can be specified by the user. The  is a
character string that is uploaded by MON29K to the
remote host system. This command can be used to
initiate the host download procedure remotely from the
MON29K monitor terminal.

Remote Mode
MON29K software supports two serial ports: one to a
terminal and one to a host computer. In normal mode,
either port can be used for initiating commands or for
downloading programs. In remote mode, the two serial
ports are linked together, allowing the terminal to
communicate directly with the host computer.

Execution Control
Execution control commands allow the user to start
program execution, setup through instruction singly or
in groups, breakpoint execution, and specify monitor
commands to be performed when termination occurs.
Following each break in program execution, the
MON29K monitor displays the address and
disassembled contents of the next executable
instruction. In addition, the user can identify registers
and memory he wishes to view after the termination of
each breakpoint or step command. This reduces the
amount of information displayed to the data that is
pertinent to the current debugging session.

Miscellaneous Commands
An on-screen help facility, as seen in Figure 4, lists all
MON29K monitor commands. Information about a
specific command is obtained by specifying the
command name as a parameter to the help command.

Am29027 Arithmetic Accelerator Support
MON29K software is fully integrated with the AMD
Am29027 Arithmetic Accelerator. In the same manner
that the Am29000 microprocessor registers can be
accessed, the Am29027 microprocessor registers can
be both displayed and modified using MON29K
software. An example of an Am29027 microprocessor
register display is shown in Figure 5.

MON29K software provides eight "sticky" and two "non-sticky" breakpoints. Sticky breakpoints remain set until
expressly removed by the user. These are useful when
debugging code within an instruction loop. Non-sticky
breakpoints occur once and are removed automatically.
Non-sticky breakpoints are optional parameters of the
go command. Users can easily display, set, and reset
breakpoint addresses.
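
The difference between the two breakpoint kinds can be sketched in a few lines of Python. This is only a model of the behavior described above, not MON29K code: the class and method names are hypothetical, and discarding all non-sticky breakpoints whenever execution stops is an assumption.

```python
class BreakpointTable:
    """Toy model: eight sticky and two non-sticky breakpoints."""

    MAX_STICKY = 8
    MAX_NON_STICKY = 2

    def __init__(self):
        self.sticky = set()
        self.non_sticky = set()

    def set_sticky(self, address):
        if address not in self.sticky and len(self.sticky) >= self.MAX_STICKY:
            raise RuntimeError("all sticky breakpoints are in use")
        self.sticky.add(address)

    def clear_sticky(self, address):
        self.sticky.discard(address)

    def go(self, *temporary):
        """Start execution with optional non-sticky breakpoints."""
        if len(temporary) > self.MAX_NON_STICKY:
            raise ValueError("too many non-sticky breakpoints")
        self.non_sticky = set(temporary)

    def hit(self, pc):
        """Return True if execution should stop at address pc."""
        stop = pc in self.sticky or pc in self.non_sticky
        if stop:
            # Assumption: non-sticky breakpoints are dropped once execution stops.
            self.non_sticky.clear()
        return stop


table = BreakpointTable()
table.set_sticky(0x100)   # stays set until removed, useful inside a loop
table.go(0x200)           # one-shot breakpoint passed with the go command
print(table.hit(0x200), table.hit(0x200), table.hit(0x100), table.hit(0x100))
```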

#XT
LINE SET 1ST REG  0: VTAG  VE SR SW SE UR UW UE TID  1: RPN    PGM U F
00   0   TR000       00000 0  0  0  0  0  0  0  00      000000 0   0 0
00   1   TR064       00000 0  0  0  0  0  0  0  00      000000 0   0 0
01   0   TR002       00000 0  0  0  0  0  0  0  00      000000 0   0 0
01   1   TR066       00000 0  0  0  0  0  0  0  00      000000 0   0 0
02   0   TR004       00000 0  0  0  0  0  0  0  00      000000 0   0 0
02   1   TR068       00000 0  0  0  0  0  0  0  00      000000 0   0 0
03   0   TR006       00000 0  0  0  0  0  0  0  00      000000 0   0 0
03   1   TR070       00000 0  0  0  0  0  0  0  00      000000 0   0 0
#

Figure 3. TLB Entries Group Display
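
The "1ST REG" column in Figure 3 follows a simple pattern: each entry occupies two consecutive TLB registers, and the second set starts at TR064. The Python sketch below reproduces that numbering. The function name is hypothetical, and the limit of 32 lines per set is an assumption inferred from the set 1 entries starting at register 64.

```python
def tlb_first_register(line, tlb_set):
    """Return the number of the first TLB register for a line/set pair."""
    if not 0 <= line < 32:          # assumed: 32 lines per set
        raise ValueError("line out of range")
    if tlb_set not in (0, 1):
        raise ValueError("set must be 0 or 1")
    # Two registers per entry; set 1 begins at register 64.
    return 64 * tlb_set + 2 * line


for line in range(4):
    for tlb_set in (0, 1):
        print(f"{line:02d} {tlb_set} TR{tlb_first_register(line, tlb_set):03d}")
```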


29K Family Support Tools

Target System Requirements

The Am29000 microprocessor supports separate code
and data spaces and provides no instructions for
moving information between data and instruction
spaces. Because of this, the target system must
provide a mechanism for writing to code space in order
for the MON29K monitor to set breakpoints and load
instruction memory.

MON29K software is designed to support a memory-mapped
Z8530 SCC serial device. However, source
code is provided so the user can change the MON29K
monitor to support other devices on a particular target
system.

Other Tools

MON29K is a stand-alone product that does not depend
on other software to function. However, MON29K
software is delivered in source form and will need to
be compiled with the AMD HighC29K Cross-Development
Toolkit; modification may be necessary if
compiled with other Am29000 microprocessor C
compilers.
#H
Help:
H or ?  to see this display
H       help with a named command
?       help with a named command

Target Resource Access:
D  - Display registers/memory
S  - Set registers/memory
F  - Fill registers/memory
M  - Move registers/memory
A  - Assemble in memory
L  - List disassembly from mem
I  - Input from port
O  - Output to port
XU - Display/set unprotected reg
XP - Display/set protected reg
XT - Display/set TLB entries
XC - Display/set Am29027 reg
X  - Display key registers
Y  - Load a file to memory
V  - Save memory to a file

Execution Control:
E - End execution command list
B - Display/Set/Clear breaks
G - Go (start execution)
T - Trace (single/multiple step)

Miscellaneous:
R - Remote mode (talk to host)
N - Normal (change 'normal' char)
Q - Re-initialize monitor

Figure 4. On-Screen Help Facility
#XC
[Screen display: Am29027 registers as shown by the XC command. The display lists register file entries RF0 through RF7 (PR, MSW, and LSW fields); the R, R TEMP, S, S TEMP, and F registers; the INSTR and I TEMP instruction fields; the STATUS and FLAGS bits; and the remaining control fields. Nearly all values are zero in this example.]

Figure 5. Am29027 Register Display

MAINTENANCE AND SUPPORT
Software Warranty
Software programs licensed by AMD are covered by the
warranty and patent indemnity provisions appearing in
AMD's standard software license forms. AMD makes no
warranty, express, statutory, implied, or by description,
regarding the information set forth herein or regarding
the freedom of the described software program from
patent infringement. AMD reserves the right to modify,
change, or discontinue the availability of this software
program at any time and without notice.

Training Classes
AMD offers training classes for the 29K Family
products. These classes focus on 29K Family system
design and implementation using the broad range of
AMD software development tools. Customers can
shorten the development process through extensive
hands-on training covering a variety of topics. Contact
your local AMD field office for more information on
training classes.

Fusion29K Program
AMD encourages broad-based development and
support for the Am29000 microprocessor with the
Fusion29K program, a joint-effort program between
AMD and third-party developers. Published twice a
year, the Fusion29K program catalog reveals the
breadth of development and system solutions for the
29K Family, including software generation and debug
tools; hardware development tools; executive, kernel,
and multi-user operating systems; board-level products;
silicon products; and more. For a copy of the Fusion29K
program catalog, call your local AMD field sales office or
the literature center at (800) 222-9323.

Customer Support
Maintenance
All orderable software products include one year of free
Maintenance Support, which starts from the date of
original purchase. Maintenance Support allows
customers to receive technical assistance from highly
trained field and factory personnel, to use a call-in on-line
information system, and to receive product and
documentation updates at no additional charge.
Customers may extend Maintenance Support in one-year
increments. Customers can access support
services by calling the 24-hour, toll-free 29K Family
hotline at (800) 2929-AMD (292-9263).

On-Line Call-In Bulletin Board
In addition to the support engineering staff, AMD offers
a 24-hour on-line technical support center. The
customer can call (800) 2929-AMD at any time to query
the system for the latest information on a particular
product: bug fixes, work-arounds, information on upcoming releases, etc. Messages may be left for the
support engineering staff during "after hours."


Advanced
Micro
Devices

XRAY29K
Source-Level Debugger
DISTINCTIVE CHARACTERISTICS
• Supports symbolic debugging with C expressions and statements for Am29000
microprocessor development environments
• Controls and examines program execution in
high-level and assembly-level modes
• Provides interface and start-up code for the
Am29000 microprocessor, which allows use of
the MON29K Target-Resident Monitor,
ADAPT29K Advanced Development and
Prototyping Tool, and PCEB29K PC Execution
Board
• Uses window-oriented display to segregate
debug information in meaningful regions
• Allows single-step execution and placement of
simple and complex breakpoints
• Supports custom screens and viewports, and
one-key command functions
• Provides command, breakpoint, and viewport
macros
• Supports automatic test sequences by processing command files and logging output to a
file
• Includes on-line help, comprehensive documentation, and a sample debug session

GENERAL DESCRIPTION
AMD's XRAY29K source-level debugger provides
engineers with a multiwindow interactive environment
for debugging high-level and assembly-level software
programs for Am29000-based systems. XRAY29K software resides on IBM ATs and compatibles, and Sun
Workstations. Program execution is monitored and
controlled in high-level source or assembly language,
from the host system through the PCEB29K execution
board, MON29K monitor, or ADAPT29K debugger on
the target system. Control is extensive, including debugger commands for setting breakpoints, single stepping through the program, and examining or altering
register and memory contents.

XRAY29K software allows examination and modification of a variable's contents and computation of high-level and assembly language expression values. Symbols can be added, displayed, and deleted in the symbol table.
The XRAY29K product includes:
• XRAY29K Software
• Documentation
• Install testing program
• Start-up code for ADAPT29K or targets using
MON29K

Publication # 10626  Rev. C  Amendment /0  Issue Date: September 1989

ORDERING INFORMATION
Licensing

The XRAY29K Source-Level Debugger is licensed
through AMD's Standard End-User Software License
Agreement (Boxtop). This license does not require a
signature; breaking the seal on the product package indicates acceptance of the license terms. If changes are
required to the license agreement, they can be arranged through your AMD sales representative. Many
software products require the customer to provide a
CPU ID number when ordering the product. Contact
your sales representative if this information is not available at the time of purchase.

Order Numbers

The XRAY29K Source-Level Debugger is available for
several different environments. Documentation can be
ordered separately. The order number (Valid Combination) is formed as a combination of:
• Product Family
• Product Category
• Product Identifier
• License Type
• Host/OS Type
• Media Type

Order number format (fields from left to right):

AM29000  SWI  XRY  B  ##  ##

Product Family
AM29000 = Am29000 Microprocessor

Product Category
SWI = Software Product
DCI = Documentation Product
MAI = Maintenance Agreement

Product Identifier
XRY = XRAY29K Source-Level Debugger

License Type
B = Boxtop
S = Signed
"-" = Not Applicable

Host/OS Type
07 = Sun-3
10 = PC-AT

Media Type
08 = 0.25" Sun cartridge tape, TAR format
14 = 3.5" DSHD floppies
21 = 9-track, 1600 BPI mag tape, TAR format
24 = 5.25" DSHD floppies


Valid Combinations
Valid Combinations list configurations planned to be supported in volume for this device. Consult the local AMD
sales office to confirm availability of specific valid combinations and to check on newly released combinations.
Order Number        Product                        Host   Media
AM29000SWIXRYB0708  XRAY29K Source-Level Debugger  Sun-3  0.25" cartridge tape, TAR format
AM29000SWIXRYS0708  XRAY29K Source-Level Debugger  Sun-3  0.25" cartridge tape, TAR format
AM29000SWIXRYB0721  XRAY29K Source-Level Debugger  Sun-3  9-track, 1600 BPI tape, TAR format
AM29000SWIXRYS0721  XRAY29K Source-Level Debugger  Sun-3  9-track, 1600 BPI tape, TAR format
AM29000SWIXRYB1014  XRAY29K Source-Level Debugger  PC-AT  3.5" DSHD floppies
AM29000SWIXRYS1014  XRAY29K Source-Level Debugger  PC-AT  3.5" DSHD floppies
AM29000SWIXRYB1024  XRAY29K Source-Level Debugger  PC-AT  5.25" DSHD floppies
AM29000SWIXRYS1024  XRAY29K Source-Level Debugger  PC-AT  5.25" DSHD floppies
AM29000DCIXRY-99    XRAY29K Documentation          UNIX   Not Media Specific
AM29000MAIXRY-07    XRAY29K Maintenance            Sun-3  Not Media Specific
AM29000MAIXRY-10    XRAY29K Maintenance            PC-AT  Not Media Specific
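
To show how the fields of a software order number fit together, here is a small Python sketch that decodes the license, host, and media codes. It is an illustration only: the function name and dictionary names are hypothetical, the code tables are copied from the ordering information above, and documentation and maintenance order numbers (which use a shorter form) are deliberately not handled.

```python
LICENSES = {"B": "Boxtop", "S": "Signed"}
HOSTS = {"07": "Sun-3", "10": "PC-AT"}
MEDIA = {
    "08": '0.25" Sun cartridge tape, TAR format',
    "14": '3.5" DSHD floppies',
    "21": "9-track, 1600 BPI mag tape, TAR format",
    "24": '5.25" DSHD floppies',
}

PREFIX = "AM29000" + "SWI" + "XRY"  # family + category + identifier


def decode_order_number(order):
    """Split an XRAY29K software order number into its coded fields."""
    if not order.startswith(PREFIX) or len(order) != len(PREFIX) + 5:
        raise ValueError("not an XRAY29K software order number")
    rest = order[len(PREFIX):]
    license_code, host_code, media_code = rest[0], rest[1:3], rest[3:5]
    try:
        return {
            "license": LICENSES[license_code],
            "host": HOSTS[host_code],
            "media": MEDIA[media_code],
        }
    except KeyError as exc:
        raise ValueError(f"unknown code {exc}") from None


print(decode_order_number("AM29000SWIXRYB1014"))
```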

FUNCTIONAL DESCRIPTION
XRAY29K software aids the control and examination of
program execution, and can set and examine memory
and register contents, set and remove breakpoints in
either high-level source or assembly language code,
and display and alter the microprocessor state. In addition to symbolic debugging, the XRAY29K debugger's
special features include help screens, macro capabilities, command files, conditional commands, and
debugging through ports. For example, in batch mode,
command files can issue directives to XRAY29K software to implement automated test sequences.
XRAY29K software functions in either high-level or assembly-level mode. In high-level mode, an application
is debugged using C language source lines to control
and monitor execution. C variables and expressions
replace numeric addresses for memory access. Code
can be viewed by line number or procedure name. In
assembly-level mode, an application is debugged using
assembly language statements. In addition to all the capabilities available in high-level mode, assembly-level
mode includes machine-level register and status bit manipulation. For each mode, the monitor's screen is partitioned in areas called viewports, where information is
displayed in meaningful regions and is easy to identify.


Viewport Commands
When the XRAY29K debugger executes, the screen is
divided in areas called viewports. The number of viewports and the information shown in each depends on
whether the object module was written in a high-level
language (high-level mode) or assembly language (assembly-level mode).
The standard screen for high-level mode has four viewports: data, trace, code, and command. This screen is
displayed when an object module generated by a high-level source program is executed. The standard screen
for assembly-level mode has five viewports: data, stack,
disassembled code, Am29000 microprocessor registers, and command. This screen is displayed when an
object module generated by an assembly language
program is executed. Figures 1 and 2 show examples of
these screens.
Viewport commands control the way information is displayed on the screen. Changing a viewport's size, color,
and cursor position as well as adding and deleting a
custom viewport are viewport commands. In addition,
viewports can be cleared of data, and macros can be
associated with them. Frequently used viewport commands are associated with function keys for easy
access.


vactive  Activate a viewport
vclear   Clear data from a viewport
vclose   Remove a user-defined viewport or screen
vmacro   Attach a macro to a viewport
vopen    Create a screen/create or resize a viewport
vscreen  Activate a screen
vsetc    Set a viewport's cursor position
zoom     Increase or decrease viewport size

Macro Commands
XRAY29K software supports macros to create and execute complex command procedures, such as testing
program variables, and to conditionally execute other
sets of commands. Macros can be defined and used
any time during a debugging session and can include
comments to explain their function. The macro definition
may contain parameters that can be changed for each
macro call.
Used as commands or in expressions, macros can be
attached to a breakpoint to create complex breakpoint
condition testing, or to a custom viewport to control data
display. Complex initialization conditions can be represented as a sequence of macro commands in a command file. Statements to increment variables, perform
loops and conditions, and control target program flow
can be part of a macro.
XRAY29K software provides a set of macro flow control
statements. These statements are similar to C conditional statements (e.g., IF, ELSE, WHILE, DO, FOR,
RETURN and CONTINUE). To create a macro, the define command is used. After macro creation, the show
command allows the macro's source to be viewed.
Commands to attach a macro to a viewport are part of
the viewport command set. Commands that attach a
macro to a breakpoint are part of the execution and
breakpoint command set.

define  Create a macro
show    Display a macro source

Debugger Commands
Commands, whether in high-level source or assembly
language mode, can be entered interactively from the
keyboard in the command viewport or placed in a command file and accessed as include or batch files. Some
commands take qualifiers that provide additional information on how to execute the command and parameters that describe an object and communicate addresses or file specifications.

Breakpoints and Execution Commands
A breakpoint causes program execution to halt or
causes the XRAY29K debugger to take some action,
such as incrementing a counter each time the target
program attempts to execute an instruction at a specified memory location. A macro can be associated with
the breakpoint to control execution. A special breakpoint viewport shows breakpoint information during the
debugging session, including the breakpoint identification number. Automatically assigned by XRAY29K software, the breakpoint number can reference or clear a
breakpoint.
Execution commands start program execution or
resume execution after explicit suspension. The program can be instructed to continue, single step, or set
temporary instruction breakpoints. Single stepping is
performed by C source line in high-level mode and
microprocessor instruction in assembly-level mode.
In addition, for each step, a macro can be invoked.

[Figure: screen layout showing a Data viewport (Monitored Data), a Trace viewport (Routine Traceback Information), a Code viewport (Source Code), a Status Line, and a Command viewport (Debugger Commands)]

Figure 1. Standard High-Level Screen
[Figure: screen layout showing a Stack viewport (Stack Contents), a Data viewport (Monitored Data), a Registers viewport (Am29000 Microprocessor Registers), a Code viewport (Disassembled Code), a Status Line, and a Command viewport (Debugger Commands)]

Figure 2. Standard Assembly-Level Screen

breakinstruction  Set an instruction breakpoint
clear             Clear a breakpoint
go                Start or continue program execution
gostep            Execute a macro after each instruction step
step              Execute a number of instructions or lines
stepover          Single step, but execute through procedures

Display Commands
Display commands write program information to a viewport or file about memory, expressions, or procedures.
C source code, for example, can be listed starting at a
particular line number or for a named procedure. Any
active procedure (a procedure on the stack) can have
its values displayed.
Memory contents can be dumped in both hexadecimal
and ASCII text format, and, when in assembly-level
mode, memory can be disassembled and displayed in
the code viewport. Variables can be monitored and
examined in the data viewport as the target program
executes. An expression or expression range can be
displayed in the command viewport according to type.
For type conversions, scaling, and output positioning,
display commands can open a file or device and then
write formatted output to it. Several format options are
provided, similar in function to those provided to C in
standard runtime libraries.

disassemble  Display disassembled memory (assembly mode)
dump         Display memory contents
expand       Display a procedure's local variables
find         Search for a string
fopen        Open a file or device for writing
fprintf      Print formatted output to a viewport
list         Display C source code
monitor      Monitor expressions
next         Find a string's next occurrence
nomonitor    Discontinue monitoring an expression
printf       Print formatted output to command viewport
printvalue   Print a variable's value

Memory and Register Commands
To help track down problems and test fixes, memory
and registers can be examined and altered. Two blocks
of memory, for example, can be compared for similarities or differences to check for a corrupt RAM image.
Memory and registers can be modified temporarily to
patch programs and continue testing during a debugging session. Expression evaluation is supported during searching and modification.

compare  Compare two blocks of memory
copy     Copy a memory block
fill     Fill a memory block with values
nomem    Prevent access to a memory location
search   Search a memory block for a value
setmem   Change a memory address
setreg   Change a register's contents
test     Examine memory area for invalid values


Symbol Commands
A symbol is a sequence of characters used to represent
arithmetic values, memory addresses, and C variables.
XRAY29K software knows about two types of symbols:
program and debugger. Program symbols are symbolic
data names or program labels that were defined during
the source program's creation. Debugger symbols manipulate and direct the flow of the debugger and are
specified by the user during a debugging session.
Symbol commands encompass both types of symbols.
Debugger symbols can be added to the debugger symbol table, and then displayed or removed. Information
about program symbols, such as name, data type, storage class, and memory location, can be displayed.
add           Create a symbol
context       Show the current context
delete        Delete a symbol from the symbol table
printsymbols  Display symbol information
scope         Specify current module and procedure scope

Utility Commands
Command files are commonly used to read macro definitions from a file or to change viewports. After a command file has been created, it may be included in a
startup file and executed as if entered at the keyboard.
When an include file error is encountered, XRAY29K
software can be directed to quit, abort, or continue. A log
of commands entered at the keyboard can be retained
and then subsequently used as a command file. If
XRAY29K software display and execution defaults are
changed, they can be saved in a new startup file.
All these operations are accessed through utility
commands.
Other utility commands control the microprocessor's
state. Reset simulates a microprocessor reset. Restart
restores the microprocessor to its initial state without
initializing memory or restarting the program, and it sets
the program counter to the original starting address
from the absolute file but maintains breakpoint declarations. In addition, the user can temporarily change the
default values for debugger startup options, such as
enabling procedure-level tracing in the trace viewport
and intermixing C source code with assembly code in
the code viewport.
XRAY29K software automatically selects the correct debugging mode based on whether the object module was
created by the high-level compiler or the assembler.
When a program has both kinds of object modules, a
utility command toggles between the two modes.
XRAY29K software includes a search facility that can
find information in a source file and display the value of
an expression in decimal, hexadecimal or ASCII format.

On-line help is provided for all debugger commands,
command arguments, and function keys, and includes a
selection menu.

alias        Replace the name of the command
cexpression  Calculate an expression's value
error        Set include file error handling
help         Display on-line help screen
history      Recall a specific command
include      Read in and process a command file
journal      Save all viewport commands and data to a file
log          Record debugger commands and errors in a file
mode         Select debugging mode (high or assembly)
option       Set debugger options for this session
pause        Pause simulation
reset        Simulate microprocessor reset
restart      Reset the program starting address
startup      Save the default startup options

Session Control
The debugger session can be ended at any time or can
be paused while the host operating system environment
is used and then entered again. This area also controls
which object modules are loaded for debugging.

host Temporarily enter the host environment
load Load an object module for debugging
quit End a debugging session

System Requirements
The XRAY29K software resides on the host system and
presents the user with a friendly, high-level interface to
the Am29000 microprocessor-based system. The software communicates with the host system through a serial interface to the ADAPT29K unit or a target board
running the MON29K target-resident debug monitor, or
a bus interface to the PCEB29K personal computer
execution board. The MON29K software and the
ADAPT29K unit actually perform all the Am29000
microprocessor memory and register reads and writes
requested by the user through XRAY29K debugger
commands.
Before the XRAY29K debugger can be used, an absolute object module must be created and downloaded
into the target system RAM memory. The object module
is created using AMD's HighC29K compiler or ASM29K
assembler. Once generated, the object module is
loaded into target system RAM memory by invoking the
XRAY29K software Load command. Figure 3 illustrates
the AMD development tool chain.


Software Warranty
Software programs licensed by AMD are covered by the
warranty and patent indemnity provisions appearing in
AMD's standard software license forms. AMD makes no
warranty, express, statutory, implied, or by description
regarding the information set forth herein or regarding
the freedom of the described software program from
patent infringement. AMD reserves the right to modify,
change or discontinue the availability of this software
program at any time and without notice.

Customer Support
Maintenance
All orderable software products include one year of free
Maintenance Support, which starts from the date of
original purchase. Maintenance Support allows customers to receive technical assistance from highly trained
field and factory personnel, to use a call-in on-line information system and to receive product and documentation updates at no additional charge. Customers may
extend Maintenance Support in one-year increments.
Customers can access support services by calling the
24-hour, toll-free 29K Family hotline at (800) 2929-AMD (292-9263).

On-Line Call-In Bulletin Board
In addition to the support engineering staff, AMD offers
a 24-hour on-line technical support center. A customer
can call (800) 2929-AMD at any time to query the
system for the latest information on a particular product:
bug fixes, work-arounds, information on upcoming releases, etc. Messages may be left for the support engineering staff during "after hours."

Training Classes
AMD offers training classes for the 29K Family products. These classes focus on 29K Family system design
and implementation using the broad range of AMD
software development tools. Customers can shorten
the development process through extensive hands-on
training covering a variety of topics. Contact your local
AMD field sales office for more information on training
classes.
Fusion29K Program
AMD encourages broad-based development and support for the Am29000 microprocessor with the
Fusion29K program, a joint-effort program between
AMD and third-party developers. Published twice a
year, the Fusion29K program catalog reveals the
breadth of development and system solutions for
the 29K Family, including software generation and
debug tools; hardware development tools; executive,
kernel and multi-user operating systems; board-level
products; silicon products; and more. For a copy of
the Fusion29K program catalog, call your local
AMD field sales office or the AMD literature center at
(800) 222-9323.

[Figure: tool chain diagram; surviving labels: source, HighC29K, PCEB29K]

Figure 3. AMD Development Tool Chain


Table of Contents

CHAPTER 3
29K Family Application Notes

Am29000 SYSCLK Driving
Connected Am29000 Instruction/Data Buses
Byte-Writable Memories for the Am29000
Am29027 Hardware Interface
When is Interleaved Memory with the Am29000 Unnecessary?
Implementation of an Am29000 Stack Cache
Introduction to the Am29000 Development Tools
Preparing PROMs Using the Am29000 Development Tools
Programming Standalone Am29000 Systems
Host Interface (HIF) v1.0 Specification

Am29000 SYSCLK Driving
Application Note
by Tom Crawford

INTRODUCTION
The purpose of this note is to describe the options of
connecting the SYSCLK pin in an Am29000™ system.

GENERAL CONSIDERATIONS
SYSCLK in any Am29000 system is going to be a high-frequency, heavily loaded signal with strict duty factor
requirements. The most important considerations are
DC levels, capacitive loading, rise/fall times, high/low
times, and transmission line effects.

There are basically two options. One may make
SYSCLK a source or one may make SYSCLK a destination.

SYSCLK AS A SOURCE

The easiest (and the recommended) way to connect the
clocks in the system is to have the Am29000 generate
and drive SYSCLK. Figure 1 shows the connections.
In this configuration, PWRCLK (pin P3) is connected
directly to Vcc. This is a power pin; it must not be just
pulled up through a resistor.
Two times the desired operating frequency is injected
into INCLK. This is a TTL signal and the duty factor is
unimportant so long as it meets the minimum High time
and Low time parameters (see the Am29000 data
sheet, order # 09075).
SYSCLK is an output with CMOS levels (it swings from
nearly ground to nearly Vcc). All the SYSCLK relative timing parameters are measured with respect to
SYSCLK at 1.5 volts, the normal TTL "trip point."
Since SYSCLK must have fairly fast rise and fall times
and may be physically long, it may behave as a transmission line (i.e., exhibit reflections). These effects can
be minimized using a few precautions.
If SYSCLK goes to more than one or, at most, two
places on the board, separate traces to each destination should be used. This minimizes the length of each
line and minimizes the capacitive loading on each line.
Series resistors at the source (at the Am29000) for each
line will reduce the edge rates. Using Schottky or Fast
logic is often preferable to CMOS logic, which lacks
input diodes to ground.
Before resorting to parallel termination, one should consider carefully the effects of relatively high DC loading
on the buffer VOH and VOL.
The prudent engineer will analyze his SYSCLK signal
with SPICE or a similar CAD package. This permits a
prediction of the actual behavior of the circuit, which is
essentially impossible to obtain without modeling.
At this time, there is no guaranteed relationship between the input on INCLK and the output on SYSCLK.
Information on this relationship will be included in the
Am29000 Data Sheet (order # 09075).

Publication # 11024  Rev. A  Amendment /0

SYSCLK AS A DESTINATION

SYSCLK can be driven externally. This is typically done
to provide an external signal with a known phase relationship to SYSCLK, perhaps at twice the frequency.
Figure 2 shows the connections.
PWRCLK and INCLK must both be connected directly
to ground.
SYSCLK is an input and must be driven with a CMOS-level clock at the operating frequency. The fact that signals are generated from both edges of SYSCLK dictates that it be very nearly a perfect square wave (from
1.5 V to 1.5 V). Perhaps the best way to generate such
a signal is to begin with one at 2X frequency and divide
it by two with a flip-flop. The result is buffered with one
or more pieces of a CMOS buffer. A typical clock generator is shown in Figure 3.
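
The reason a divide-by-two stage yields a square wave can be seen in a short simulation. The Python sketch below is an idealized model (no propagation delays), and the function names and the 30 percent input duty factor are assumptions chosen for illustration: the output toggles only on Low-to-High input transitions, so its high and low times both equal one input period regardless of the input duty factor.

```python
def clock(high, low, cycles):
    """Sampled clock waveform: `high` samples at 1, then `low` samples at 0."""
    return ([1] * high + [0] * low) * cycles


def divide_by_two(samples):
    """Model an edge-triggered flip-flop wired to toggle on each rising edge."""
    q = 0
    previous = samples[0]
    output = []
    for sample in samples:
        if previous == 0 and sample == 1:  # Low-to-High transition
            q ^= 1
        output.append(q)
        previous = sample
    return output


two_x = clock(high=3, low=7, cycles=8)   # 2X clock with a 30% duty factor
sysclk = divide_by_two(two_x)

# Measure over whole output cycles (samples 10 through 69).
window = sysclk[10:70]
print("input duty factor: ", sum(two_x) / len(two_x))
print("output duty factor:", sum(window) / len(window))
```

In a real circuit the output is only nominally square, because the flip-flop and buffer delays for rising and falling edges are not identical.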

[Figure: Am29000 with PWRCLK, a 2X Clock driving INCLK, and the SYSCLK output going to external logic]

Figure 1. Source

Issue Date: 11/89  © 1989 Advanced Micro Devices, Inc.

[Figure: Am29000 with PWRCLK and INCLK connected to GND and SYSCLK driven by a CMOS Clock]

Figure 2. Destination

The TTL oscillator operates at twice the required
frequency. Since the 74AC74 is edge triggered, it
responds only to the Low-to-High transition of the
oscillator. Its output is nominally a square wave
(nominally because the tPHL may not be the same as
tPLH).
The buffer is more interesting. Clearly, it has to be
CMOS since SYSCLK is a CMOS input. It has to be
characterized to drive substantial capacitance since the
Am29000 has an input capacitance of 90 pF. One can
put multiple elements in parallel as long as they are in
the same package. In addition, one can drive different
portions of the load with different sections of the device.
As long as they are in the same package and are similarly loaded, they will exhibit similar delays. In some
design groups, putting buffers in parallel is a prohibited
activity, since it is sometimes difficult to determine when
one of the buffers has failed. Local design rules should
always prevail.

Take, for example, the IDT 74FCT240A. With light DC
loading, the output swings within 0.2 V of the power
supply. At 50-pF loading, the propagation delay is
1.5 ns minimum and 4.8 ns maximum. Putting two
elements in parallel will solve the capacitive-loading
situation, if it really needs to be solved. The actual
waveforms should be examined before adding another
buffer. The IDT data book does not distinguish between
tPHL and tPLH. The device should be characterized at
the actual expected loading, temperature, and voltage
ranges to determine the actual switching characteristics.
Take, for a second example, the 74AC04. With light DC
loading, the output swings within 0.1 V of the power
supply. The guaranteed propagation delays for the
74AC00 are 1.0 ns to 7.0 ns; we expect an AC04 to be
the same. In fact, a device actually driving an Am29000
has measured propagation delays of tPLH = 4 ns and
tPHL = 5 ns. Two elements in parallel appear to provide a
somewhat cleaner waveform.

Figure 3. Clock Generator (labels: OSC, 74AC74 with D and Q, buffered outputs to Am29000, to Am29027, to one half of board, to other half of board, and early SYSCLK)


Connected Am29000 Instruction/Data Buses
Application Note
by Tom Crawford

The use of the Am29000™ has been proposed in a system where the instruction and data buses are connected directly to each other and to a single memory.

Figure 1. Block Diagram (labels: Am29000, ADRS, Data, Instruction, Static RAM)

If the memory is very fast (single cycle), then pipelined
or burst accesses never need to take place. Every access is
a simple one-cycle access. Data writes would have to be two
cycles (because BINV is valid so late). Presumably this would
be either a fairly high-end system with lots of very fast
memory or a cache system with a modest amount of SRAM backed
up by lots of DRAM.

This depends on the availability of very fast static
RAMs. The equation below shows how to calculate the
required access time of the RAMs.

tMAX = tCLK - (para6 + para9A)

For a 25-MHz device running at various clock rates:

FREQ (MHz)   tCLK (ns)   para6 (ns)   para9A (ns)   tMAX (ns)
25.00        40          14           6             20
22.22        45          14           6             25
20.00        50          14           6             30
18.18        55          14           6             35

An attempt to actually mechanize a system like this
uncovered a problem. When the Am29000 follows an
instruction read with a data write, there is a guaranteed
"bus crash."

Parameter 10 requires that the data remain on the bus
for 2 ns after the rising edge of SYSCLK; in fact, RAM
disable times are typically 15 ns. This means there is no
known method to get the instruction off the instruction
bus until as long as 15 ns after the clock rises. Additionally, in the best possible case, a PAL

Figure 1. Typical Memory (labels: Instruction Bus, Initial Address)

Publication #11656, Rev. A, Amendment /0
Issue Date: 11/89
© 1989 Advanced Micro Devices, Inc.

When Is Interleaved Memory with the Am29000 Unnecessary?
Figure 2. Single-Cycle Burst (timing diagram, not to scale; traces: SYSCLK, Counter, Column Address, Address at DRAM, Data; intervals: tCLOCK, tPD_COUNT, tPD_MUX, tPD_WIRE, tMAX, tSU)

In order to guarantee positive margins, the following
inequality must be satisfied:
Equation 1: n · tCLOCK - (tMAX + tPD_COUNT +
tPD_MUX + tSU + tPD_WIRE) > 0

The value n is the number of clock cycles available for
memory. If there is no interleaving or wait states, n = 1.
For two-way interleaving, n = 2, and so on.
The maximum column address delay (static column
decode DRAM) that can be allowed is tMAX. The clock-to-output
delay of the counter is tPD_COUNT. The
value of tPD_MUX is the input-to-output delay of the
multiplexer. The value of tSU is the setup time for
Am29000 instructions or data.
The value of tPD_WIRE is the propagation delay from
the multiplexer output to the furthest memory chip input.
This is the propagation delay per unit length of wire
times the length of the wire. The propagation delay per
unit length can be estimated from the equation (see footnote 1):

$$t_{pd}' = t_{pd}\,\sqrt{1 + C_d / C_o}$$

The unloaded propagation delay (tpd) is determined
only by the board material dielectric constant. It is equal
to approximately 1.77 ns/ft. The trace capacitance (Co)
is a function of the trace impedance and propagation
delay and is usually taken to be approximately 18.5
pF/ft. The distributed capacitance (Cd) resulting from
the memory chips is calculated from the per-device input
capacitance and the device spacing; assuming 5 pF
per device and two devices per inch gives 120 pF/ft.
Using these numbers in the above equation yields:

$$t_{pd}' = 1.77\,\sqrt{1 + 120/18.5} = 4.84\ \text{ns/ft}$$

Finally, assuming that 32 devices at 24 devices per foot
span 1.33 ft, the value for tPD_WIRE is 6.45 ns.
These numbers are summarized in Table 1.

Table 1. Initial Numbers
Name                 Value     Obtained From
tPD_COUNT            6.5 ns    PAL16R8-7
tPD_MUX              8.0 ns    74F253 In to Zn
tPD_WIRE             6.5 ns    See discussion above
tSU (25 MHz)         6.0 ns    Am29000 25-MHz tSU
tSU (20/16 MHz)      8.0 ns    Am29000 20/16-MHz tSU

Figure 3 shows the results of using these values in Equation 1.
The x-axis is tCLOCK and the y-axis is the allowable
access time. The solid line shows the allowable access
time for n = 1 (single-cycle operation, no interleaving).
The dotted line shows the allowable access time for
n = 2.
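The numbers behind Figure 3 can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not part of the original note: the function names are hypothetical, the inputs are the Table 1 values, and tCLOCK is taken as 1000/f ns for a clock of f MHz.

```python
import math

def loaded_delay_ns_per_ft(tpd_unloaded=1.77, c_trace=18.5, c_load=120.0):
    """Loaded trace delay tpd' = tpd * sqrt(1 + Cd/Co), in ns/ft."""
    return tpd_unloaded * math.sqrt(1.0 + c_load / c_trace)

def allowable_access_ns(t_clock, n, t_count, t_mux, t_su, t_wire):
    """Value of tMAX at which Equation 1 has exactly zero margin."""
    return n * t_clock - (t_count + t_mux + t_su + t_wire)

tpd_loaded = loaded_delay_ns_per_ft()
t_wire = tpd_loaded * (32 / 24)   # 32 devices at 24 devices per foot

# Table 1 values; tSU is 6.0 ns at 25 MHz and 8.0 ns at 20/16 MHz.
for mhz, t_su in ((25.0, 6.0), (20.0, 8.0), (16.0, 8.0)):
    t_clock = 1000.0 / mhz
    for n in (1, 2):
        t_max = allowable_access_ns(t_clock, n, 6.5, 8.0, t_su, 6.5)
        print(f"{mhz:4.0f} MHz  n={n}  tMAX must be below {t_max:5.1f} ns")
```

Under these assumptions the 25-MHz, n = 1 case leaves only 13 ns for the DRAM column access, while n = 2 leaves 53 ns.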

1 See Appendix A of the Am29000 Memory Design Handbook (order #10623) for additional information on this equation.


The discontinuity in the n = 1 line reflects the difference
in tSU between 25 MHz and 20/16 MHz. The horizontal
lines show the access times for -70, -80, and -100
Toshiba 1M-by-1 DRAMs. The vertical lines show the
minimum tCLOCK times for 25-, 20-, and 16-MHz
Am29000s. The hatched area indicates where operation is possible without interleaving.

INITIAL RESULTS
From inspection of Figure 3, it might be concluded that it
is almost possible to build a single-cycle burst memory
for a 16-MHz Am29000 from "fast" DRAMs with no interleaving.
However, one cannot build a single-cycle burst
memory for a 20- or 25-MHz system without interleaving
with any available DRAM.
Finally, using two-way interleaving, it is possible to build
a memory that supports single-cycle bursts at a clock
rate of 25 MHz or below, from memories with a column
address access time of less than 50 ns.

Figure 3. Initial Results (x-axis: tCLOCK, 35 to 70 ns; y-axis: access time, 15 to 55 ns; curves for n = 1 and n = 2 (two-way interleave); horizontal DRAM access-time lines, including the -100 RAM)


ARE IMPROVEMENTS POSSIBLE?

Could a system be built with single-cycle bursts without
interleaving to run at 20 MHz? To answer this question
graphically, move the heavy line in Figure 3 upwards
(extending the hatched area to the left). This is done by
reducing or eliminating the numbers, other than tMAX,
in the inequality. These are examined below, one at a
time.

tPD_COUNT
The 6.5 ns value is based on using a -7 PAL device. This is
already faster than any 74F, 74AS, or 74ACT counter
(or flip-flop, for that matter) in any data book this author
has examined.

It is certainly possible to "play games" with the clock
scheme. SYSCLK on the Am29000 could be driven a
little later than the clock to the counter. Data hold time is
unlikely to ever be a problem. But the uncertainties in
propagation delay through a CMOS clock driver are
likely to cancel a lot of what could be gained. Furthermore,
delaying the clock to the Am29000 delays the
address on the initial cycle.

tPD_MUX
The 8.0 ns value is based on using a 74F253. A 0.5-ns
reduction could be realized by building a multiplexer
with a 16L8-7 (7.5 ns). A better way is to completely
eliminate the multiplexer delay by building a three-state
bus. Figure 4 shows one way to do this.
The counter is implemented with a 16R8-7 (actually,
more than one is probably required). An 8-bit counter is
required and 2 additional bits of address must be maintained.
Since the clock is not gated, some additional
inputs are required to indicate whether the counter
should load, hold, or count.

Just before RAS falls, the three-state buffer is enabled.
When the Column Address is required, the three-state
buffers of the PAL device are enabled and the counter is
driven into the array.

In this configuration, a worst-case design requires that
the extraordinary loading on the PAL device be considered.
The total capacitance connected to the outputs
of the PAL devices is greater than the standard load.
However, the capacitances are distributed rather than
lumped. The driver never sees the entire load, so the
wire delay allowance is sufficient.

tPD_WIRE
The wire delay can be reduced only by reducing the
wiring length. Instead of connecting all the memory
chips in series, the board can be designed so that there
are two sets of chips connected in parallel. This halves
the 1.33-foot length previously calculated and reduces
the wire delay to 3.22 ns.

tSU
To reduce tSU, a fast Am29000 at a reduced clock rate
can be used. For example, a 30-MHz Am29000 has a
tSU of only 5 ns; this is 3 ns better than a 16-MHz part,
but it is expensive.

Another approach is to insert a pipeline register with a
very low setup time. For example, the data setup time of
a 74F374 is only 2 ns. Of course, including a pipeline
register has adverse consequences. The first access of
a burst-mode access will then be one SYSCLK cycle
longer than would otherwise be required. In addition, the
control logic is made slightly more complicated. A positive
side effect is that three-state buffers are included in
the register packages. Figure 5 shows registers in the
instruction path.

Figure 4. Multiplexer Avoidance (block diagram; labels: Buffer, Counter, Array)

Now, assuming the implementation of all the changes
described above, the fixed numbers become the values
shown in Table 2.

Table 2. The Improved Numbers
Name          Value     Obtained From
tPD_COUNT     6.5 ns    PAL16R8-7
tPD_MUX       0.0 ns    Three-state multiplexer
tPD_WIRE      3.2 ns    Length
tSU           2.0 ns    74F374 data sheet

If this is plotted as a function of cycle time, the line has
moved up a considerable amount as compared to
Figure 3. This indicates that it is possible to build a
20-MHz system with the fastest available DRAMs. It
also indicates that it is possible to build a 16-MHz
system with 100-ns DRAMs.
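To see how far the n = 1 line moves, the two sets of fixed delays can be compared directly. This is an illustrative sketch only: the dictionary and function names are hypothetical, the initial set uses the 20/16-MHz tSU from Table 1, and the extra first-access cycle introduced by the pipeline register is not modeled.

```python
# Fixed delays in ns: Table 1 (with the 20/16-MHz tSU) and Table 2.
INITIAL = {"tPD_COUNT": 6.5, "tPD_MUX": 8.0, "tPD_WIRE": 6.5, "tSU": 8.0}
IMPROVED = {"tPD_COUNT": 6.5, "tPD_MUX": 0.0, "tPD_WIRE": 3.2, "tSU": 2.0}

def allowable_access_ns(t_clock, fixed_delays, n=1):
    """tMAX at zero margin in Equation 1 for the given fixed delays."""
    return n * t_clock - sum(fixed_delays.values())

# The improvement is the same at every clock rate: the line shifts up.
gain = sum(INITIAL.values()) - sum(IMPROVED.values())
print(f"allowable access time increases by {gain:.1f} ns")

for mhz in (20.0, 16.0):
    t_clock = 1000.0 / mhz
    before = allowable_access_ns(t_clock, INITIAL)
    after = allowable_access_ns(t_clock, IMPROVED)
    print(f"{mhz:.0f} MHz: {before:.1f} ns before, {after:.1f} ns after")
```

With these inputs the non-interleaved budget at 20 MHz grows from 21 ns to about 38 ns, and at 16 MHz from 33.5 ns to about 51 ns.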

Figure 5. Pipeline Access (block diagram; labels: Am29000, Address Bus, Instruction Bus, SYSCLK, Pipeline Register, Counter, Instruction Memory)

CONCLUSION
By substituting the values for proposed memory architectures
into Equation 1, two to four specific values of tMAX can
be determined for appropriate values of tCLOCK. With
this information it is easy to draw graphs like those of
Figures 3 and 6. Such graphs provide a simple display of
the available trade-offs between system clock rate,
memory architecture, and the memory device access
speed. Multiplying the memory device count for each
configuration by the access-speed-driven memory
device costs of the configuration yields an approximate
cost for each memory system approach.

Such an analysis may point out significant cost
reductions by quickly identifying those situations in
which a non-interleaved memory architecture and
reduced clock rate can support the required system
performance.

Figure 6. Final Results (x-axis: tCLOCK, 35 to 70 ns; y-axis: access time, 15 to 55 ns)

Implementation of an Am29000 Stack Cache
Application Note
by Phil Bunce and Erin Farquhar

INTRODUCTION
This application note will describe the basic mechanisms of the AMD Am29000's cache of the run-time
stack. The stack cache is an important performance feature, because it permits a procedure's entire context to
be resident in on-chip registers, thus eliminating, or at
least reducing, the need for memory accesses.
Our discussion is centered around a single example
program, which is shown in its entirety in Appendix B.
Before discussing this example, we provide a brief overview of the basic operation of the stack cache.

OVERVIEW
Procedures executing on the Am29000 make use of a
run-time stack, which consists of consecutive, overlapping structures called activation records. An activation
record contains the dynamically allocated information
specific to a particular activation of a procedure. Each
time a procedure is called, a new activation record is
allocated on the stack; when the procedure has finished
executing, its activation record is deallocated from the
stack.
Compilers and assemblers for the Am29000 use two
run-time stacks for activation records: the register stack
and the memory stack. A procedure's activation record
may be divided between these stacks. Both stacks grow
toward lower addresses in memory, and items on the
stacks are referenced as positive offsets from RSP
(Register Stack Pointer) and MSP (Memory Stack
Pointer). Both pointers are realized using internal
Am29000 global registers. The global and local registers are both subsets of the general-purpose registers.
The register stack contains parameters passed to the
procedure, the local scalar variables used by the procedure, return linkage information, and the arguments that
the procedure will pass to procedures that it in turn calls.

Publication #13053, Rev. A, Amendment /0

The register stack is cached in the local registers, lr0-lr127, as explained below.
The memory stack is used for local structured data, for
example, arrays and records. It also is used for additional scalar data when needed. When the scalar portion
of the activation record for a particular procedure
requires more than 128 words of local-register storage,
the excess may be kept in the procedure's activation
record in the memory stack.
Both stacks are aligned on a double-word (64-bit)
boundary. Procedures are required to maintain this
alignment by adjusting the size of the register stack
frame allocated at procedure entry to be a multiple of
eight bytes.
STACK CACHE

The 128 local registers are used to cache locations in
the register stack, such that when a procedure is active,
its entire register-stack activation record is mapped to
the local registers.
Each word location in the register stack is mapped to a
single local register. The register number corresponding
to a location in the register stack is given by bits 8-2 of
the 32-bit memory address of that location in the register
stack. Because there are 128 local registers, quantities
whose addresses differ by 512 (all addresses are byte
addresses) are mapped to the same local register and
cannot be in the cache at the same time.
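The mapping described above amounts to extracting a 7-bit field from the byte address. The sketch below illustrates it; the function name and the sample addresses are hypothetical and chosen only to show the 512-byte aliasing.

```python
NUM_LOCAL_REGS = 128  # local registers available to the stack cache

def stack_cache_slot(byte_address):
    """Local-register slot (0..127) selected by bits 8-2 of a stack address."""
    return (byte_address >> 2) & (NUM_LOCAL_REGS - 1)

# Consecutive words use consecutive slots.
print(stack_cache_slot(0x4000), stack_cache_slot(0x4004))

# Addresses 512 bytes apart select the same slot, so they cannot both
# be cached at the same time.
print(stack_cache_slot(0x4000) == stack_cache_slot(0x4000 + 512))
```

Because only bits 8-2 take part, the cache behaves as a 512-byte window that slides over the register stack in memory.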
Figure 1 shows a snapshot of the register stack in memory after some calls have been made, and the mapping
of the register stack to the local registers. As shown in
the figure, Global Register 1, called the Register Stack
Pointer (RSP), contains the 32-bit virtual address of the
top of the register stack in memory. This virtual address
on the Am29000 is the lowest-addressed valid stack
location in the current activation record.

Issue Date: 11/89
© 1989 Advanced Micro Devices, Inc.

Figure 1. Mapping of Register Stack to Stack Cache (left: local registers by absolute register number, with GR1 (RSP) selecting LR0; right: register stack in memory, showing Start of Stack, Spilled Activation Records, Used Locations, the Current Activation Record, the 512-byte cached region, RAB, and Free Locations)

Local registers are addressed as positive word offsets
from RSP, as in Figure 2. Specifically, when a local register operand is specified in an instruction (that is, the
most significant bit of the register number is set), the
seven least significant bits are added to bits 8-2 of RSP
and the result is truncated to seven bits. For example, if
RSP has the value 0, as shown in Figure 2, then lr0 is
absolute register 128 (the first local register), and lr1 is
absolute register 129 (the second local register); if RSP
has the value four, then lr0 is absolute register 129 and
lr1 is absolute register 130.
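The address calculation in the preceding paragraph can be written out as follows. This is a sketch of the arithmetic as described in the text, not a model of the hardware; the function name is hypothetical.

```python
def absolute_register(local_index, rsp):
    """Absolute register number for local register lr<local_index>.

    The 7-bit local index is added to bits 8-2 of RSP, the sum is
    truncated to seven bits, and the result selects one of absolute
    registers 128..255.
    """
    base = (rsp >> 2) & 0x7F
    return 0x80 | ((local_index + base) & 0x7F)

print(absolute_register(0, rsp=0), absolute_register(1, rsp=0))
print(absolute_register(0, rsp=4), absolute_register(1, rsp=4))
# The sum wraps around: with RSP selecting slot 127, lr1 is register 128.
print(absolute_register(1, rsp=127 * 4))
```

The truncation to seven bits is what makes the local register file behave as a circular buffer.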
Referring again to Figure 1, the current activation record
is delimited by the Frame Pointer (FP), which by software
convention uses Local Register 1, and RSP. FP
points to the "top" of the previous activation record, that
is, to the lowest-addressed word location above the current activation record. When a procedure is active, this
entire area must be cached in local registers.
The register stack between FP and RFB (Register Free
Bound) contains the saved activation records of previously called procedures, which are also currently
mapped to the local-register cache. RFB, by convention
Global Register 127, is set to point to the lowest-addressed
word in the register stack that is not mapped
to the local registers.

Figure 2. Local Register Addressing (bits 8-2 of GR1 (RSP) are added to the local register number and combined with 0x80 to form the absolute register number)

The register stack between RSP and RAB (Register
Allocate Bound) represents stack locations (and corresponding local registers) that are currently "unused" and
thus available for allocation when another procedure is
called. RAB (by convention Global Register 126) is set
to point to the lowest-addressed word in the register
stack that is currently mapped to a local register.
When a procedure is called, RSP is decremented by the
number of words required to accommodate the called
procedure's activation record. When RSP is decremented beyond the location pointed to by RAB and thus
beyond the available local registers, more local registers will be required for the activation record, and some
locations in the stack cache must be written to memory
(or "spilled") before the new activation record is created.
This condition is called overflow. Note that in Figure 1,
locations between RFB and the Start of Stack are saved
activation records that have been previously spilled to
memory.
On return from a procedure, the activation record is
de-allocated by incrementing RSP by the same amount
it was decremented when the procedure was called. If
the caller's FP (which points to the highest location in the
caller's activation record) is greater than RFB (which
points to the first unmapped register stack location
above the activation record), the contents of that portion
of the register stack will have to be loaded into the local
registers to accommodate the caller's activation record.
This condition is called underflow.
Overflow and underflow conditions are detected by
instruction sequences in the prologue and epilogue,
which are the instruction sequences that execute as a
result of a procedure call and procedure return, respectively, and cause a transfer of control to the appropriate
trap handler routine. In the case of an overflow, the trap
handler moves the contents of the required number of
local registers to the register stack in memory and
adjusts the value in RAB and RFB. In the case of an
underflow, the trap handler loads the required number of
register stack locations into the local registers and
adjusts the value in RAB and RFB.
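The two conditions reduce to simple pointer comparisons. The following sketch states them as predicates; the function names and the sample pointer values are hypothetical, and the byte counts are an inference from the description above rather than something the text states explicitly.

```python
def overflow_bytes(rsp_after_allocate, rab):
    """Bytes that must be spilled, or 0 if the new RSP is not below RAB."""
    return max(0, rab - rsp_after_allocate)

def underflow_bytes(caller_fp, rfb):
    """Bytes that must be filled, or 0 if the caller's FP is not above RFB."""
    return max(0, caller_fp - rfb)

# Hypothetical pointer values, for illustration only.
print(overflow_bytes(rsp_after_allocate=0x4E10, rab=0x4E00))   # no overflow
print(overflow_bytes(rsp_after_allocate=0x4DF0, rab=0x4E00))   # spill needed
print(underflow_bytes(caller_fp=0x4FF8, rfb=0x4FF0))           # fill needed
```

A result of zero means the call or return proceeds without a trap.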

OVERVIEW OF EXAMPLE PROGRAM
Our example program consists of the four text files listed
below.

regdcl.h:   Register name declarations
macros.h:   Macro definitions for prologue and epilogue
start.s:    CPU initialization;
            overflow and underflow trap handler routines
example.s:  Two procedures, main and recurse

Appendix A contains partial listings from the example
program that are described individually in the sub-sections below.
Appendix B contains the source for the entire example
program, which includes all of the above files.

INCLUDE FILES
There are two include files, regdcl.h and macros.h.
Note that regdcl.h must be included before macros.h,
because macros.h uses definitions from regdcl.h.
In regdcl.h (see Appendix A-1, Register Declarations),
we assign the value 80 as the base of registers to be
used as temporaries by system software. Additional
temporaries will be addressed as offsets from it. These
registers will be used for work space in the start code
and the two trap handler routines.

    .equ   SYS_TMP, 80           ;system temp registers

We also assign symbolic names to global and local registers, in accordance with the software calling conventions of the Am29000.

    .reg   rsp,gr1               ;local reg stack pointer
    .reg   msp,gr125             ;memory stack pointer
    .reg   rab,gr126             ;register allocate bound
    .reg   rfb,gr127             ;register free bound
    .reg   fp,lr1                ;frame pointer
    .reg   raddr,lr0             ;return address

The overflow and underflow trap vectors, V_SPILL and
V_FILL, are set to the constant values 64 and 65. These
are the vector numbers for the trap handlers chosen for
this example.

    .equ   V_SPILL, 64
    .equ   V_FILL, 65

The second include file in our example program,
macros.h, contains the macro definitions for PROLOGUE
and EPILOGUE. These macros are discussed
in the Prologue and Epilogue sections.

START CODE
The module start.s contains code that sets up the execution environment for our example program. The initial
portion of the start code is shown in Appendix A-2, Start
Code. The overflow and underflow trap handlers, also in
start.s, will be discussed later.
We set the beginning of the stack (its highest address in
memory) at 0x5000. The "& ~7" in the expression ensures that the value is a multiple of eight, with rounding
downward if necessary.

    .equ   TOP_STK, (0x5000 & ~7)  ;create double word alignment

The two temporary registers, tmp1 and tmp2, are
assigned values that are offsets of SYS_TMP, which
means that tmp1 is Global Register 80, and tmp2 is
Global Register 81.

    .reg   tmp1, %%(SYS_TMP + 0)
    .reg   tmp2, %%(SYS_TMP + 1)

Then we initialize the four pointers that define the stack
environment.

    const  rsp,(TOP_STK-8)       ;set stack pointer
    add    rsp,rsp,0             ;update rsp
    const  rab,(TOP_STK-512)     ;set register alloc bound
    const  fp,TOP_STK            ;set frame ptr
    const  rfb,TOP_STK           ;set reg free bound

Figure 3 shows the initialized stack. Because there has
been no spilling of local registers to the stack in memory,
RFB points to the top of the stack. RAB is, by definition,
512 bytes less than RFB. In the initial activation record,
defined by FP and RSP, FP points to the top of the stack
(because there has been no prior context) and RSP is
set to a value eight bytes less than FP to allow for the
current FP and raddr when a new activation record
is created. Note that the setting of RSP must precede the setting of FP by at least two instructions
because of the delayed effect of modifying RSP, and
that an explicit arithmetic or logical instruction must
be used to update RSP.
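The initial pointer values can be checked with a short calculation. This is a sketch that mirrors the start code's arithmetic in Python; the helper name is hypothetical. Rounding down to a multiple of eight clears the three low-order bits, which is an AND with the complement of 7.

```python
def align_down_8(value):
    """Round down to a multiple of eight by clearing the three low bits."""
    return value & ~7

TOP_STK = align_down_8(0x5000)

rsp = TOP_STK - 8      # room for the current FP and return address
rab = TOP_STK - 512    # 128 local registers * 4 bytes below RFB
fp = TOP_STK           # no prior context
rfb = TOP_STK          # nothing has been spilled yet

for name, value in (("rsp", rsp), ("rab", rab), ("fp", fp), ("rfb", rfb)):
    print(f"{name:3s} = 0x{value:04X}")
```

An unaligned starting value such as 0x5005 would be rounded down to 0x5000 by the same mask.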
The CPS (Current Processor Status Register) is initialized
with the value 0x0072. Assuming the prior state of
this register was Reset mode (shown in Figure 4), we
have in effect cleared FZ, DA, and RE, and left the other
bits unchanged. The FZ (Freeze) bit is cleared because
the processor is unfrozen for normal operation. (For a
description of the Freeze bit, refer to the section called
"Special-Purpose Registers," in the Am29000 User's
Manual.) We clear the DA (Disable All Interrupts and
Traps) bit to enable all traps. The RE (ROM Enable) bit
is cleared because this example assumes we are executing from RAM.

PD, PI, SM, and DI remain set, meaning that address
translation is disabled (PD and PI), supervisor mode is
selected (SM), and external interrupts are disabled
(DI). Supervisor mode is selected because some of the
instructions in our example program are privileged.
Address translation is disabled because this example is
designed for systems not using the TLB. External interrupts are disabled because we have no interrupting
devices and want to eliminate any spurious interrupt
requests.

    mtsrim cps,0x72              ; PD, PI, SM, DI

We set the Vector Fetch bit in the Configuration Register
to select a vector table configuration for the Vector Area.

    mtsrim cfg,0x10              ; VF

The VAB (Vector Area Base Address) register, which
specifies the beginning address of the vector table in
memory, is set to zero.

    mtsrim vab,0

Next we initialize the vector table with the address of
the Overflow trap handler routine, called SpillHandler.
First we load the address of SpillHandler into a temporary register, using two CONST instructions for the
case when SpillHandler is not in the first 64K bytes of
memory.

    const  tmp1,SpillHandler
    consth tmp1,SpillHandler

Figure 3. Initialized Stack (fp and rfb point to the top of the stack; rsp points to raddr, eight bytes below fp; rab is 512 bytes below rfb)

Figure 4. Current Processor Status Register in Reset Mode (bits 31-16 reserved; fields CA, IP, TE, TP, TU, FZ, LK, RE, WM, PD, PI, SM, IM, DI, DA)
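The CPS value written by the start code can be decoded bit by bit. In the sketch below the bit positions are an assumption: they follow the usual Am29000 CPS layout and are consistent with the text, in which 0x72 leaves exactly PD, PI, SM, and DI set. The names `CPS_FLAG_BITS`, `cps_flags`, and `cps_value` are hypothetical, and only the flags discussed in the text are listed.

```python
# Assumed bit positions for the CPS flags discussed in the text.
CPS_FLAG_BITS = {"FZ": 10, "RE": 8, "PD": 6, "PI": 5, "SM": 4, "DI": 1, "DA": 0}

def cps_flags(value):
    """Names of the listed flags that are set in a CPS value."""
    return [name for name, bit in CPS_FLAG_BITS.items() if value & (1 << bit)]

def cps_value(*names):
    """CPS value with exactly the named flags set."""
    return sum(1 << CPS_FLAG_BITS[name] for name in names)

print(cps_flags(0x72))

# Clearing FZ, DA, and RE from a state with all seven flags set leaves 0x72.
reset_like = cps_value(*CPS_FLAG_BITS)
print(hex(reset_like & ~cps_value("FZ", "DA", "RE")))
```

Setting DA in addition gives 0x73 under the same assumed layout.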
Because each entry in the vector table is four bytes, we
compute the address in the vector table by multiplying
the vector number V_SPILL (64) by four (a shift left by
two).

    const  tmp2,V_SPILL
    sll    tmp2,tmp2,2           ;compute vector address

Then we store the address of SpillHandler (in tmp1) into
the vector table address we just computed.

    store  0,0,tmp1,tmp2         ;write spill vector

Initializing the vector table with the address of the underflow trap handler routine (vector number V_FILL) is
done the same way:

    const  tmp1,FillHandler
    consth tmp1,FillHandler
    const  tmp2,V_FILL
    sll    tmp2,tmp2,2           ;compute vector address
    store  0,0,tmp1,tmp2         ;write fill vector
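The vector table addresses used by the two store instructions follow directly from the vector numbers. A minimal sketch, with a hypothetical function name and the vector area base taken as zero as in the start code:

```python
V_SPILL = 64
V_FILL = 65

def vector_entry_address(vector_number, vab=0):
    """Byte address of a four-byte vector table entry."""
    return vab + (vector_number << 2)   # shift left by two = multiply by four

print(hex(vector_entry_address(V_SPILL)))
print(hex(vector_entry_address(V_FILL)))
```

The two entries are adjacent words in the table, four bytes apart.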

The procedure start then calls main, passing it the return
address (lr0). A NOP follows the call because the
Am29000 always executes one instruction beyond a call
instruction before the call is taken.

    call   raddr,main
    nop
    halt                         ;halt after successful completion

EXAMPLE FUNCTIONS MAIN() AND RECURSE()

After the start code has executed, control is passed to
the procedure main(). The purpose of main() is to call
the procedure recurse(), providing it with an initial set of
values. Recurse() calls itself a total of 86 times, then
returns to itself 86 times before returning to main(). An
overflow condition occurs with the 21st call, and each
subsequent call causes an additional spill of local registers to memory. When the program returns, the 22nd
return causes an underflow condition, and each subsequent return causes an additional fill from memory to the
local registers.
The basic operation of main() and recurse() is summarized by the following C program:
    main()
    {
        recurse(1,42);
    }

    recurse(n,m)
    int n,m;
    {
        int i,j;

        if (n > 85) return;
        i = n + 1;
        recurse(i,m);
    }

The code for main() and recurse() is shown in Appendix
A-3 and A-4, Code for Main() and Code for Recurse(),
respectively.

PROLOGUE
As with all Am29000 procedures, main() begins with a
prologue. The macro definition of PROLOGUE and
the expansion of PROLOGUE for main() are shown in
Appendix A-5 and A-6, Prologue Macro and Prologue
Expansion for Main(), respectively.

The purpose of PROLOGUE is to allocate an activation
record and check for overflow before the body of the procedure is executed. It is invoked with three parameters:
the number of arguments passed (INCNT), the number
of registers required for the procedure's local variables
(LOCCNT), and the maximum number of arguments
that the procedure may pass to any one function it in turn
calls (OUTCNT).

    .macro PROLOGUE,INCNT,LOCCNT,OUTCNT

The values of ALLOC_CNT and SIZE_CNT are computed from the parameters.

    .set   ALLOC_CNT, ((2+OUTCNT+LOCCNT+1)&~1)
    .set   SIZE_CNT, (ALLOC_CNT+2+INCNT)

ALLOC_CNT is the amount of space on the stack that
must be newly allocated by the Prologue for the procedure's activation record. SIZE_CNT is the amount of space
that must be accessible by the procedure, that is, the
size of its activation record.

The expression for ALLOC_CNT does not use INCNT,
because incoming parameters were already allocated
space on the stack as the outgoing parameters (OUTCNT) of the calling procedure. "2" is the number of
words needed for the called procedure's FP and return
address when it calls another procedure. ANDing the
expression with the complement of 1 (& ~1) maintains
double-word alignment on the stack by setting the least
significant bit to zero. The "+ 1" ensures that the amount
is rounded up, not down.

The expression for SIZE_CNT includes INCNT and two
additional words for lr0 (return address) and FP of the
caller.

The three macro variables, IN_PRM, LOC_REG, and
OUT_PRM are used to establish offsets into the stack
for input, local, and output arguments. These macro
variables are set only if the corresponding value of the
parameter is not equal to zero.

    .if (INCNT)
    .set   IN_PRM, (2 + ALLOC_CNT + 0x80)
    .endif
    .if (LOCCNT)
    .set   LOC_REG, (2 + OUTCNT + 0x80)
    .endif
    .if (OUTCNT)
    .set   OUT_PRM, (2 + 0x80)
    .endif

In the above, a macro variable is set equal to an expression that is evaluated to a local register number when
the program is assembled. The macro variable can then
be used as the base register for offset addressing of
parameters of that type (as shown in Figure 5). The
"0x80" provides the 128-word offset required for a local
register access.

Figure 5. Prologue Parameters (activation record layout from the top down: IN_PRM+1, IN_PRM+0, old fp, old raddr, LOC_REG+1, LOC_REG+0, OUT_PRM+1, OUT_PRM+0, fp, raddr, with RSP pointing at raddr)
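The macro arithmetic can be tried out with concrete numbers. The sketch below is illustrative only: the function names are hypothetical, and the parameter counts for main() (0 in, 0 local, 2 out) and recurse() (2 in, 2 local, 2 out) are assumptions inferred from the C program, since the actual macro invocations appear in the appendices. The stack pointers are the values set up by the start code.

```python
def prologue_counts(incnt, loccnt, outcnt):
    """ALLOC_CNT and SIZE_CNT in words, as computed by the PROLOGUE macro."""
    alloc_cnt = (2 + outcnt + loccnt + 1) & ~1   # round up to an even count
    size_cnt = alloc_cnt + 2 + incnt
    return alloc_cnt, size_cnt

def first_overflowing_call(rsp, rab, alloc_words):
    """Number of the first call whose allocation moves RSP below RAB."""
    calls = 0
    while True:
        calls += 1
        rsp -= alloc_words * 4          # words to bytes
        if rsp < rab:
            return calls

TOP_STK = 0x5000
rsp, rab = TOP_STK - 8, TOP_STK - 512

main_alloc, main_size = prologue_counts(0, 0, 2)      # assumed for main()
rec_alloc, rec_size = prologue_counts(2, 2, 2)        # assumed for recurse()

rsp_in_main = rsp - main_alloc * 4
print(main_alloc, main_size, rec_alloc, rec_size)
print(first_overflowing_call(rsp_in_main, rab, rec_alloc))
```

With these assumed counts each recurse() activation allocates six words, and the first allocation to cross RAB is the 21st call, which agrees with the behavior described for the example program.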

3-26

Implementation of an Am29000 Stack Cache
The body of the PROLOGUE macro has three instructions:

        sub     rsp, rsp, (ALLOC_CNT << 2)
        asgeu   V_SPILL, rsp, rab
        add     fp, rsp, (SIZE_CNT << 2)

In the above instructions, ALLOC_CNT and SIZE_CNT
are shifted left by two to convert them from word quantities to the required byte quantities (the stack registers, whose contents will be modified, contain byte
addresses).

The first instruction allocates an activation record by
decrementing RSP by the amount ALLOC_CNT.

The second instruction asserts that RSP of the new activation record is greater than or equal to RAB. If this is
not the case (that is, RSP has been decremented
beyond RAB), an overflow trap occurs, and there is a
transfer of control to the trap handler routine,
SpillHandler, pointed to by the vector V_SPILL. The trap
handler will move (spill) the contents of the required
number of local registers to the register stack in memory
and adjust RFB and RAB, as described in the Overflow
Trap Handler section.

The third instruction sets FP to point to the location just
above the new activation record, so it can be used
for underflow checking in the EPILOGUE macro of a
procedure that is called by this procedure (see Epilogue
section).

After the prologue, main() calls recurse(). The expansion of PROLOGUE for recurse() is shown in Appendix
A-7, Prologue Expansion for Recurse().

OVERFLOW TRAP HANDLER
On the 21st call to itself, recurse() causes an overflow
trap. The code that services this trap is shown in Appendix A-8, Overflow Trap Handler, and is described below.

In the following discussion of SpillHandler, we assume
the reader is familiar with the processor's response to
traps. If not, refer to the section called Interrupt and Trap
Handling in the Am29000 User's Manual.

The first three .reg directives assign symbolic names
to the three temporary system registers used by
SpillHandler.

        .reg    R_Cnt, %%(SYS_TMP+0)       ;temp for count
        .reg    R_TmpPC0, %%(SYS_TMP+1)    ;temp for PC0
        .reg    R_TmpPC1, %%(SYS_TMP+2)    ;temp for PC1

The old PCs are saved in two of the temporary registers
just declared.

        mfsr    R_TmpPC0, pc0              ;save the PCs
        mfsr    R_TmpPC1, pc1

The CPS (Current Processor Status Register) is set to
the value 0x73. This clears the FZ (Freeze) bit, which
was set by hardware when the trap was taken (see
Figure 6), so that the trap handler can execute a Store
Multiple instruction. (Note that the PCs must be saved
before the FZ bit is cleared.) The DA (Disable All Interrupts and Traps) bit remains set, which prevents the
processor from taking any traps except the *WARN,
Instruction Access Exception, Data Access Exception,
and Coprocessor Exception traps. PD, PI, SM, and DI
also remain set.

        mtsrim  cps, 0x73                  ;PD, PI, SM, DI, DA

Now we can use the Store Multiple instruction to store
the required number of local registers into the register
stack in memory. This instruction requires a source, a
destination, and a count.

[Figure 6 diagram: CPS bit fields, bit 15 down to bit 0: CA, IP, TE, TP, TU, FZ, LK, RE, WM, PD, PI, SM, IM (two bits), DI, DA; bits 31-16 are zero]

Figure 6. Current Processor Status Register After an Interrupt or Trap
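
The CPS values used by the trap handlers (0x72, 0x73, and 0x473) can be checked against the field names in Figure 6 with a short script. This is a minimal sketch in Python, not part of the original application note; the bit positions are an assumption read from the order of the fields in Figure 6 (DA in bit 0, CA in bit 15), and the two-bit IM field is left out of the table.

```python
# Single-bit CPS flags and their assumed bit positions (from Figure 6).
# The two-bit IM field (bits 3-2) is intentionally omitted.
CPS_FLAGS = {
    "DA": 0, "DI": 1, "SM": 4, "PI": 5, "PD": 6, "WM": 7, "RE": 8,
    "LK": 9, "FZ": 10, "TU": 11, "TP": 12, "TE": 13, "IP": 14, "CA": 15,
}


def decode_cps(value):
    """Return the names of the single-bit flags set in value, highest bit first."""
    names = [name for name, bit in CPS_FLAGS.items() if value & (1 << bit)]
    return sorted(names, key=lambda name: CPS_FLAGS[name], reverse=True)


for cps in (0x72, 0x73, 0x473):
    print(hex(cps), decode_cps(cps))
```

Under these assumptions the decoded names agree with the comments in the listings: 0x73 differs from 0x473 only in the FZ bit.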


29K Family Application Notes

As explained earlier (and shown in Figure 1), the area
between RSP and RAB represents the local registers available for allocation when a procedure is called.
Because there has been an overflow and RSP has been
decremented beyond RAB, we can compute the size of
the required spill (the count for the Store Multiple) by
subtracting RSP from RAB.

        sub     R_Cnt, rab, rsp            ;R_Cnt = number of
                                           ;bytes to spill

Then we use R_Cnt to adjust RFB, so that it correctly
reflects the area in the register stack that will be mapped
to the local registers.

        sub     rfb, rfb, R_Cnt            ;move down the
                                           ;frame bound

Before using the Store Multiple instruction, R_Cnt must
be written as a word amount into the CR field of the
Channel Control register, which is used by the processor to determine the number of stores to memory. So we
convert R_Cnt from a byte to a word amount using the
Shift Right Logical instruction.

        srl     R_Cnt, R_Cnt, 2            ;R_Cnt = count of
                                           ;words to spill

Because the CR field is zero-based, we subtract one
from R_Cnt

        sub     R_Cnt, R_Cnt, 1            ;correct for storem

and then use the Move to Special Register instruction
to write it to the CR field.

        mtsr    cr, R_Cnt                  ;set up count for
                                           ;storem

The local registers that have to be spilled are those corresponding to register-stack locations between RSP
and RAB, because they are the local registers that must
be occupied by the new activation record. So the instruction source will be lr0, which corresponds to RSP.
The instruction's destination will be the register-stack
location pointed to by the previously modified RFB, because that is the register-stack location at the correct
512-byte offset from RSP.

        storem  0, 0, lr0, rfb             ;spill from the
                                           ;allocated area

Then we set RAB to point to the top of stack, because
that is now the lowest stack address currently cached in
local registers.

        add     rab, rsp, 0                ;move down the
                                           ;allocate bound

We set CPS to the value 0x473. This sets the FZ bit,
which must be set before we restore PC0 and PC1. PD,
PI, SM, DI, and DA remain set.

        mtsrim  cps, 0x473                 ;FZ, PD, PI, SM,
                                           ;DI, DA

Then the two PCs are restored and the IRET (Interrupt
Return) instruction restores the previous contents of
CPS from the Old Processor Status Register, unfreezes
the processor, and begins fetching from PC0 and PC1.

        mtsr    pc0, R_TmpPC0              ;restore the PCs
        mtsr    pc1, R_TmpPC1
        iret

EPILOGUE

When recurse has called itself 86 times, it returns and
executes an Epilogue. The EPILOGUE macro is shown
in Appendix A-9, EPILOGUE Macro.

EPILOGUE's first instruction de-allocates the procedure's activation record by adding ALLOC_CNT to RSP.
This is followed by a NOP, because a change in the
value of RSP must be separated by at least one cycle
from an instruction that references a local register (in
this case, the instruction JMPI, whose operand raddr is
lr0).

        add     rsp, rsp, (ALLOC_CNT << 2)
        nop
        jmpi    raddr

Before the Jump Indirect instruction finishes executing,
the next instruction, ASLEU, is executed. This instruction asserts that the caller's FP, now restored because
the caller's RSP has been restored, is less than or equal
to RFB. If the assertion is false (which means that FP is
pointing to an unmapped, previously spilled register-stack location), an underflow trap occurs, and control is
transferred to the trap handler routine, FillHandler,
pointed to by the vector V_FILL. The trap handler will
move the contents of locations in the register stack
to the local registers and adjust RAB and RFB, as
described in the Underflow Trap Handler section.

        asleu   V_FILL, fp, rfb

At the end of the Epilogue, the parameters are set to an
illegal value. This ensures that if they are used again
before they are explicitly set, an assembly-time error will
be reported.

        .set    IN_PRM, (1024)             ;illegal, to cause err on ref
        .set    LOC_REG, (1024)            ;illegal, to cause err on ref
        .set    OUT_PRM, (1024)            ;illegal, to cause err on ref
        .set    ALLOC_CNT, (1024)          ;illegal, to cause err on ref

The expansion of EPILOGUE for recurse() is shown in
Appendix A-10, Epilogue Expansion for Recurse().

UNDERFLOW TRAP HANDLER

On the 22nd return of recurse() to itself, an underflow
trap occurs. The code that services this trap is shown in
Appendix A-11, Underflow Trap Handler, and is discussed below.

The two old PCs are saved in temporary registers
declared in the SpillHandler routine.

        mfsr    R_TmpPC0, pc0              ;save the PCs
        mfsr    R_TmpPC1, pc1

The CPS (Current Processor Status Register) is set to
the value 0x73. This clears the FZ bit, so that the trap
handler can execute a Load Multiple instruction. The DA
bit remains set, which prevents the processor from taking any traps except the *WARN, Instruction Access
Exception, Data Access Exception, and Coprocessor
Exception traps. PD, PI, SM, and DI also remain set.

        mtsrim  cps, 0x73                  ;PD, PI, SM, DI, DA

We will use the Load Multiple instruction to load locations in the register stack into the local registers. The
Load Multiple instruction requires a source, a destination, and a count.

Clearly, the source for the Load Multiple instruction is
the location pointed to by RFB, since RFB points to the
first location in the register stack that was previously
spilled from the local registers.

The destination of the Load Multiple instruction will, of
course, be the local register corresponding to RFB.
Local registers may be specified as instruction operands in one of two ways: using a local register number
(in the range from 0 to 127), or using the absolute register number (in the range 128 to 255) in an Indirect
Pointer Register. With the first method, the local register
number is computed as a positive word offset of RSP.
This option is not available to us because the trap handler has no way of knowing the offset from RSP (that is,
the local register number) corresponding to RFB.

So we will convert the address in RFB to an absolute
local register number, put this number in Indirect Pointer
A (because the destination operand uses Indirect
Pointer A), and then specify Global Register 0 (which
indicates an indirect pointer access) as the destination
register in the Load Multiple instruction.

To convert the address in RFB to an absolute local register number, we OR it with 512. This sets bit 9, which
selects a local register; bits 2-8 give the absolute local
register number.

        const   R_Cnt, 512                 ;make local reg ip
        or      R_Cnt, R_Cnt, rfb          ;from rfb

Then we use the Move To Special Register instruction to
put this value in the Indirect Pointer A Register.

        mtsr    ipa, R_Cnt                 ;set up indirect ptr
                                           ;for loadm

Recalling that the underflow trap was signaled because
FP is pointing to an unmapped and previously spilled
register stack location at a higher memory address than
RFB, we can compute the number of local registers to fill
by subtracting RFB from FP.

        sub     R_Cnt, fp, rfb             ;R_Cnt = # of
                                           ;bytes to fill

We use the just-computed value to adjust RAB, so that
it correctly points to the new lower bound of the register stack mapped to local registers. We perform this
operation now because it requires a byte amount, and
R_Cnt will be converted to a word amount in the next
instruction.

        add     rab, rab, R_Cnt            ;move up the
                                           ;allocate bound

Before use of the Load Multiple instruction, the count
must be written as a word amount into the CR field
of the Channel Control Register. Hence, we convert
R_Cnt from a byte to a word amount using the Shift
Right instruction.

        srl     R_Cnt, R_Cnt, 2            ;R_Cnt = number of
                                           ;words to fill

Because the CR field is zero-based, we subtract one
from R_Cnt

        sub     R_Cnt, R_Cnt, 1            ;correct for loadm

and then use the Move to Special Register instruction to
write it to the CR field.

        mtsr    cr, R_Cnt                  ;set up count for
                                           ;loadm

Now we use the Load Multiple instruction to transfer the
contents of the register stack in memory to the local registers, specifying RFB as the address in the register
stack from which to load, and gr0 (Indirect Pointer A) as
the local register number at which to begin the fill.

        loadm   0, 0, gr0, rfb             ;fill area freed

After the registers have been filled, we update RFB so
that it correctly points to the upper bound of the register
stack that is currently cached.

        add     rfb, fp, 0                 ;move up frame bound

We set CPS to the value 0x473. This sets the FZ bit,
which must be set before we restore PC0 and PC1. PD,
PI, SM, DI, and DA remain set.

        mtsrim  cps, 0x473                 ;FZ, PD, PI, SM,
                                           ;DI, DA

Then the two PCs are restored and the IRET (Interrupt
Return) instruction restores the previous contents of
CPS, unfreezes the processor, and begins fetching from
PC0 and PC1.

        mtsr    pc0, R_TmpPC0              ;restore the PCs
        mtsr    pc1, R_TmpPC1
        iret
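
The underflow handler's arithmetic, including the conversion of the RFB address into an absolute local register number, can be traced the same way. This is a minimal sketch in Python under stated assumptions: the register-number field of Indirect Pointer A is taken to occupy bits 9-2 (so higher address bits are discarded when the value is written), and the function name and sample addresses are illustrative only.

```python
def fill(fp, rab, rfb):
    """Model the arithmetic of FillHandler; returns (rab, rfb, cr, first_reg)."""
    assert fp > rfb, "the handler is entered only when FP is above RFB"
    ipa = (512 | rfb) & 0x3FC   # const/or/mtsr ipa; assumed field is bits 9-2
    first_reg = ipa >> 2        # absolute register number where the fill starts
    nbytes = fp - rfb           # sub  R_Cnt, fp, rfb   (bytes to fill)
    rab = rab + nbytes          # add  rab, rab, R_Cnt  (move up the allocate bound)
    words = nbytes >> 2         # srl  R_Cnt, R_Cnt, 2  (bytes -> words)
    cr = words - 1              # sub  R_Cnt, R_Cnt, 1  (CR field is zero-based)
    rfb = fp                    # add  rfb, fp, 0       (move up the frame bound)
    return rab, rfb, cr, first_reg


old_rfb = 0x4FF0                # sample values, chosen only for illustration
rab, rfb, cr, first_reg = fill(fp=0x5000, rab=old_rfb - 512, rfb=old_rfb)
print(hex(rab), hex(rfb), cr, first_reg)
```

In this model the starting register is always 128 plus the low seven bits of the word address, which is why ORing with 512 is enough to select a local register.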


APPENDIX A:
PARTIAL LISTINGS EXTRACTED FROM EXAMPLE PROGRAM
A-1. REGISTER DECLARATIONS

;-----------------------------------------------------------------------
; Global registers
;-----------------------------------------------------------------------
        .equ    SYS_TMP, 80        ; system temp registers
        .reg    rsp, gr1           ; local register stack pointer
        .reg    msp, gr125         ; memory stack pointer
        .reg    rab, gr126         ; register allocate bound
        .reg    rfb, gr127         ; register free bound

;-----------------------------------------------------------------------
; Local compiler registers
; (only valid if frame has been established)
;-----------------------------------------------------------------------
        .reg    fp, lr1            ; frame pointer
        .reg    raddr, lr0         ; return address

;-----------------------------------------------------------------------
; Vectors
;-----------------------------------------------------------------------
        .equ    V_SPILL, 64
        .equ    V_FILL, 65

A-2. START CODE

        .include "regdcl.h"
        .equ    TOP_STK, (0x5000 & ~7)     ; create double word aligned value
        .text
        .global start

start:
        .reg    tmp1, %%(SYS_TMP + 0)
        .reg    tmp2, %%(SYS_TMP + 1)
        const   rsp, (TOP_STK-8)           ; set stack ptr
        add     rsp, rsp, 0                ; set shadow rsp
        const   rab, (TOP_STK-512)         ; set reg alloc bound
        const   fp, TOP_STK                ; set frame ptr
        const   rfb, TOP_STK               ; set reg free bound

; set correct mode
        mtsrim  cps, 0x72                  ; PD, PI, SM, DI
        mtsrim  cfg, 0x10                  ; VF
        mtsrim  vab, 0

; connect up spill handler
        const   tmp1, SpillHandler
        consth  tmp1, SpillHandler
        const   tmp2, V_SPILL
        sll     tmp2, tmp2, 2              ; compute vect addr
        store   0, 0, tmp1, tmp2           ; write spill vector

; connect up fill handler
        const   tmp1, FillHandler
        consth  tmp1, FillHandler
        const   tmp2, V_FILL
        sll     tmp2, tmp2, 2              ; compute vect addr
        store   0, 0, tmp1, tmp2           ; write fill vector

; call main program
        call    raddr, main
        nop
        halt                               ; halt after successful completion

A-3. CODE FOR MAIN()

        .include "regdcl.h"
        .include "macros.h"
        .global main

; main ()
;     recurse(1,42);

main:
        PROLOGUE 0,0,2                     ; invoke macro 0 ic, 0 loc, 2 og

; name outgoing args
        .reg    M_out_n, %%(OUT_PRM + 0)
        .reg    M_out_m, %%(OUT_PRM + 1)

; recurse(1,42)
        const   M_out_m, 42
        call    raddr, recurse
        const   M_out_n, 1

        EPILOGUE


A-4. CODE FOR RECURSE()

        .global recurse

; recurse (n,m)
; {
;     int i, j;
;     if (n > 85) return;
;     i = n + 1;
;     recurse(i,m);
; }

recurse:
        PROLOGUE 2,2,2                     ; invoke macro 2 ic, 2 loc, 2 og

; name ic args
        .reg    R_in_n, %%(IN_PRM + 0)
        .reg    R_in_m, %%(IN_PRM + 1)

; name locals
        .reg    R_i, %%(LOC_REG + 0)
        .reg    R_j, %%(LOC_REG + 1)

; name outgoing args
        .reg    R_out_n, %%(OUT_PRM + 0)
        .reg    R_out_m, %%(OUT_PRM + 1)

; name temporary register
        .reg    R_tmp, lr0

; if (n > 85) return
        cpgt    R_tmp, R_in_n, 85
        jmpt    R_tmp, rec_01

; i = n + 1
        add     R_i, R_in_n, 1

; recurse(i,m)
        add     R_out_m, R_in_m, 0
        call    raddr, recurse
        add     R_out_n, R_i, 0

rec_01:
        EPILOGUE

A-5. PROLOGUE MACRO

; macro PROLOGUE
; Parameters: INCNT     input parameter count
;             LOCCNT    local register count
;             OUTCNT    output parameter count

        .set    ALLOC_CNT, ((2 + OUTCNT + LOCCNT + 1) & ~1)
        .set    SIZE_CNT, (ALLOC_CNT + 2 + INCNT)
        .if     (INCNT)
        .set    IN_PRM, (2 + ALLOC_CNT + 0x80)
        .endif
        .if     (LOCCNT)
        .set    LOC_REG, (2 + OUTCNT + 0x80)
        .endif
        .if     (OUTCNT)
        .set    OUT_PRM, (2 + 0x80)
        .endif
        sub     rsp, rsp, (ALLOC_CNT << 2)
        asgeu   V_SPILL, rsp, rab
        add     fp, rsp, (SIZE_CNT << 2)
        .endm

A-6. PROLOGUE EXPANSION FOR MAIN()

main:
        PROLOGUE 0,0,2                     ; invoke macro
        .set    ALLOC_CNT, ((2 + 2 + 0 + 1) & ~1)
        .set    SIZE_CNT, (ALLOC_CNT + 2 + 0)
        .set    OUT_PRM, (2 + 0x80)
        sub     rsp, rsp, (ALLOC_CNT << 2)
        asgeu   V_SPILL, rsp, rab
        add     fp, rsp, (SIZE_CNT << 2)

A-7. PROLOGUE EXPANSION FOR RECURSE()

recurse:
        PROLOGUE 2,2,2                     ; invoke macro
        .set    ALLOC_CNT, ((2 + 2 + 2 + 1) & ~1)
        .set    SIZE_CNT, (ALLOC_CNT + 2 + 2)
        .set    IN_PRM, (2 + ALLOC_CNT + 0x80)
        .set    LOC_REG, (2 + 2 + 0x80)
        .set    OUT_PRM, (2 + 0x80)
        sub     rsp, rsp, (ALLOC_CNT << 2)
        asgeu   V_SPILL, rsp, rab
        add     fp, rsp, (SIZE_CNT << 2)
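
The values the PROLOGUE macro assigns, and the point at which the assertion first fails, can be worked out numerically. This is a minimal sketch in Python, not assembler; it assumes the mask in the ALLOC_CNT expression is ~1 (rounding down to an even word count), a TOP_STK of 0x5000, and the initial RSP and RAB values from the start code. The function names are illustrative.

```python
def prologue_params(incnt, loccnt, outcnt):
    """Evaluate the .set expressions of PROLOGUE for the given counts (in words)."""
    alloc_cnt = (2 + outcnt + loccnt + 1) & ~1      # assumed mask: ~1
    params = {"ALLOC_CNT": alloc_cnt, "SIZE_CNT": alloc_cnt + 2 + incnt}
    if incnt:
        params["IN_PRM"] = 2 + alloc_cnt + 0x80
    if loccnt:
        params["LOC_REG"] = 2 + outcnt + 0x80
    if outcnt:
        params["OUT_PRM"] = 2 + 0x80
    return params


def first_overflowing_call(top_stk=0x5000):
    """Return which call of recurse() first fails 'asgeu V_SPILL, rsp, rab'."""
    rsp = top_stk - 8                                   # const rsp, (TOP_STK-8)
    rab = top_stk - 512                                 # const rab, (TOP_STK-512)
    rsp -= prologue_params(0, 0, 2)["ALLOC_CNT"] << 2   # prologue of main()
    step = prologue_params(2, 2, 2)["ALLOC_CNT"] << 2   # prologue of recurse()
    call = 0
    while True:
        call += 1
        rsp -= step
        if rsp < rab:                                   # assertion fails: spill
            return call


print(prologue_params(2, 2, 2))
print(first_overflowing_call())
```

Under these assumptions recurse() allocates six words (24 bytes) per call, and the first failing assertion falls on the 21st call, which agrees with the figure quoted in the Overflow Trap Handler discussion.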

A-8. OVERFLOW TRAP HANDLER

        .reg    R_Cnt, %%(SYS_TMP + 0)     ; temp for count (shared)
        .reg    R_TmpPC0, %%(SYS_TMP + 1)  ; temp for PC0
        .reg    R_TmpPC1, %%(SYS_TMP + 2)  ; temp for PC1

        .global SpillHandler

SpillHandler:
; This routine handles a false assertion in the standard prologue.
; In:   rab > rsp          (requiring an allocation)
;       lr1 <= rfb
;       rfb == rab + 512
; Out:  rab == rsp         (just enough allocated)
;       lr1 <= rfb
;       rfb == rab + 512

        mfsr    R_TmpPC0, pc0              ; save the PCs
        mfsr    R_TmpPC1, pc1
        mtsrim  cps, 0x73                  ; PD, PI, SM, DI, DA
        sub     R_Cnt, rab, rsp            ; R_Cnt = # of bytes to spill
        sub     rfb, rfb, R_Cnt            ; move down the frame bound
        srl     R_Cnt, R_Cnt, 2            ; R_Cnt = count of words to spill
        sub     R_Cnt, R_Cnt, 1            ; correct for storem
        mtsr    cr, R_Cnt                  ; set up count for storem
        storem  0, 0, lr0, rfb             ; spill from the allocated area
        add     rab, rsp, 0                ; move down the allocate bound
        mtsrim  cps, 0x473                 ; FZ, PD, PI, SM, DI, DA
        mtsr    pc0, R_TmpPC0              ; restore the PCs
        mtsr    pc1, R_TmpPC1
        iret

A-9. EPILOGUE MACRO

; macro EPILOGUE
        .macro  EPILOGUE
        add     rsp, rsp, (ALLOC_CNT << 2)
        nop
        jmpi    raddr
        asleu   V_FILL, fp, rfb
        .else
        jmpi    raddr
        nop
        .endif
        .set    IN_PRM, (1024)             ; illegal, to cause err on ref
        .set    LOC_REG, (1024)            ; illegal, to cause err on ref
        .set    OUT_PRM, (1024)            ; illegal, to cause err on ref
        .set    ALLOC_CNT, (1024)          ; illegal, to cause err on ref
        .endm

A-10. EPILOGUE EXPANSION FOR RECURSE()

        EPILOGUE
        add     rsp, rsp, (ALLOC_CNT << 2)
        nop
        jmpi    raddr
        asleu   V_FILL, fp, rfb

A-11. UNDERFLOW TRAP HANDLER

        .global FillHandler

FillHandler:
; This routine handles a false assertion in the standard epilogue.
; In:   lr1 > rfb          (requiring de-allocation)
;       rsp >= rab
;       rfb == rab + 512
; Out:  lr1 == rfb         (just enough freed)
;       rsp >= rab
;       rfb == rab + 512

        mfsr    R_TmpPC0, pc0              ; save the PCs
        mfsr    R_TmpPC1, pc1
        mtsrim  cps, 0x73                  ; PD, PI, SM, DI, DA
        const   R_Cnt, 512                 ; make local reg ip
        or      R_Cnt, R_Cnt, rfb          ; from rfb
        mtsr    ipa, R_Cnt                 ; set up indirect ptr for loadm
        sub     R_Cnt, lr1, rfb            ; R_Cnt = # of bytes to fill
        add     rab, rab, R_Cnt            ; move up the allocate bound
        srl     R_Cnt, R_Cnt, 2            ; R_Cnt = number of words to fill
        sub     R_Cnt, R_Cnt, 1            ; correct for loadm
        mtsr    cr, R_Cnt                  ; set up count for loadm
        loadm   0, 0, gr0, rfb             ; fill area freed
        add     rfb, lr1, 0                ; move up frame bound
        mtsrim  cps, 0x473                 ; FZ, PD, PI, SM, DI, DA
        mtsr    pc0, R_TmpPC0              ; restore the PCs
        mtsr    pc1, R_TmpPC1
        iret


APPENDIX B:
COMPLETE LISTING OF EXAMPLE PROGRAM
        .include "regdcl.h"

        .equ    TOP_STK, (0x5000 & ~7)     ;create double word
                                           ;aligned value
        .text
        .global start
start:
        .reg    tmp1, %%(SYS_TMP + 0)
        .reg    tmp2, %%(SYS_TMP + 1)

        const   rsp, (TOP_STK-8)           ;set stack ptr
        const   rab, (TOP_STK-512)         ;set reg alloc bound
        const   fp, TOP_STK                ;set frame ptr
        const   rfb, TOP_STK               ;set reg free bound

;set correct mode
        mtsrim  cps, 0x72                  ;PD, PI, SM, DI
        mtsrim  cfg, 0x10                  ;VF
        mtsrim  vab, 0

;connect up spill handler
        const   tmp1, SpillHandler
        consth  tmp1, SpillHandler
        const   tmp2, V_SPILL
        sll     tmp2, tmp2, 2              ;compute vect addr
        store   0, 0, tmp1, tmp2           ;write spill vector

;connect up fill handler
        const   tmp1, FillHandler
        consth  tmp1, FillHandler
        const   tmp2, V_FILL
        sll     tmp2, tmp2, 2              ;compute vect addr
        store   0, 0, tmp1, tmp2           ;write fill vector

;call main program
        call    raddr, main
        nop
        halt                               ;halt after successful completion

;The routines below handle overflow and underflow conditions.
;The temps which they use are given below.
        .reg    R_Cnt, %%(SYS_TMP + 0)     ;temp for count (shared)
        .reg    R_TmpPC0, %%(SYS_TMP + 1)  ;temp for PC0
        .reg    R_TmpPC1, %%(SYS_TMP + 2)  ;temp for PC1

        .global SpillHandler
SpillHandler:
;This routine handles a failed assertion in the standard prologue
;In:    rab > rsp          (requiring an allocation)
;       fp <= rfb
;       rfb == rab + 512
;Out:   rab == rsp         (just enough allocated)
;       fp <= rfb
;       rfb == rab + 512

        mfsr    R_TmpPC0, pc0              ;save the PCs
        mfsr    R_TmpPC1, pc1

        mtsrim  cps, 0x73                  ;PD, PI, SM, DI, DA

        sub     R_Cnt, rab, rsp            ;R_Cnt = # of bytes to spill
        sub     rfb, rfb, R_Cnt            ;move down the frame bound
        srl     R_Cnt, R_Cnt, 2            ;R_Cnt = count of words to spill
        sub     R_Cnt, R_Cnt, 1            ;correct for storem
        mtsr    cr, R_Cnt                  ;set up count for storem
        storem  0, 0, lr0, rfb             ;spill from the allocated area
        add     rab, rsp, 0                ;move down the allocate bound

        mtsrim  cps, 0x473                 ;FZ, PD, PI, SM, DI, DA

        mtsr    pc0, R_TmpPC0              ;restore the PCs
        mtsr    pc1, R_TmpPC1

        iret

        .global FillHandler
FillHandler:
;This routine handles a failed assertion in the standard epilogue
;In:    fp > rfb           (requiring de-allocation)
;       rsp >= rab
;       rfb == rab + 512
;Out:   fp == rfb          (just enough freed)
;       rsp >= rab
;       rfb == rab + 512

        mfsr    R_TmpPC0, pc0              ;save the PCs
        mfsr    R_TmpPC1, pc1

        mtsrim  cps, 0x73                  ;PD, PI, SM, DI, DA

        const   R_Cnt, 512                 ;make local reg ip
        or      R_Cnt, R_Cnt, rfb          ;from rfb
        mtsr    ipa, R_Cnt                 ;set up indirect ptr for loadm

        sub     R_Cnt, fp, rfb             ;R_Cnt = # of bytes to fill
        add     rab, rab, R_Cnt            ;move up the allocate bound
        srl     R_Cnt, R_Cnt, 2            ;R_Cnt = number of words to fill
        sub     R_Cnt, R_Cnt, 1            ;correct for loadm
        mtsr    cr, R_Cnt                  ;set up count for loadm
        loadm   0, 0, gr0, rfb             ;fill area freed
        add     rfb, fp, 0                 ;move up frame bound

        mtsrim  cps, 0x473                 ;FZ, PD, PI, SM, DI, DA

        mtsr    pc0, R_TmpPC0              ;restore the PCs
        mtsr    pc1, R_TmpPC1

        iret


Introduction to the Am29000
Development Tools
Application Note
by Doug Kern and Douglas Walton

INTRODUCTION
The development of a microprocessor-based system is
a complicated and detailed undertaking that requires
skilled personnel and efficient test equipment. Because
of the sophistication of modern microprocessing systems, they usually cannot be flawlessly designed on the
first iteration, and nearly always require extensive
debugging and testing time. Experienced developers
know that few designs function perfectly at power-up.
Faults occur due to erroneous logic, poor assembly, or
defective parts, so some debugging is virtually always
necessary. Therefore, every effort should be made to
plan the debugging and testing process before the first
prototype is built. Without advance planning, the
designer may find that the circuit either cannot be successfully debugged, or that the necessary debug time is
prohibitive.
Planners should keep in mind that testing and debugging continues throughout the life of the product.
Because different phases in the product life cycle have
different characteristics, the requirements for each must
be considered. The major phases of the product life
cycle are development, production (pilot, limited, and
large-scale), and field service.
Apart from the skill of the personnel, the efficiency of test
equipment is a critical area that affects the testing time
in every phase. Outdated or ineffective equipment will
slow down even the most highly trained personnel. More
importantly, expensive, state-of-the-art test equipment
will be wasted if its use is not preplanned. Careful consideration must be given to the type of equipment
needed to service the product, as well as its cost and
how it will be disbursed to the field.

actual system hardware. They normally are used with a
prototype or production system to determine the cause
of failure, and are distinguished from the 29K tools
used to prepare programs for execution on a target
system (see the 29K Tool Chain section).
Figure 1 shows the relationship of these development
tools to the application and each other. The components
are described below:
ADAPT29K-Advanced Development and Prototyping
Tool. ADAPT29K is a standalone system that interfaces to the application like an in-circuit emulator. It provides a wide range of debugging functions without
intruding on the application's execution.
MON29K-Target Resident Monitor. MON29K is a
monitor program that executes on the target Am29000.
It provides many of the same debugging functions as the
ADAPT29K, even though it is a software product.
XRAY29K-Source-Level Debugger. XRAY29K

Specific help on an individual command can be obtained
by entering H followed by the letter of the command. All
command explanations show the complete command
syntax and give a short description of how the command
functions.
HOW THE ADAPT29K WORKS
The ADAPT29K runs on a different processor than the
target. It performs all operations on the target by controlling the target Am29000. A buffered cable connects the
ADAPT29K to the target's Am29000 socket. Figure 10
shows the signals carried on the cable. Note that
although the ADAPT29K traces the address bus, it cannot drive it, and, consequently, cannot provide an overlay memory. It uses the target Am29000 to set up all
memory addresses before it can access them.
Execution Control
The execution state of the target Am29000 is controlled
by using the CNTL0 and CNTL1 signals. By asserting
different combinations of the two signals, the Am29000
can be placed in one of four states: RUN, HALT, STEP,
and LOAD TEST INSTRUCTION. How these states
affect the processor is explained in detail in the
Am29000 User's Manual, order #10620.

The LOAD TEST INSTRUCTION state should be noted
due to its importance to the ADAPT29K. Because the
LOAD TEST INSTRUCTION state interrupts normal
sequential processing and permits a sequence of
instructions to be loaded into the processor's instruction
stream, the ADAPT29K, using the LOAD TEST
INSTRUCTION STATE, can force the processor to
perform operations on the target.

Memory Access
Due to the high speed of the Am29000, the ADAPT29K,
unlike some in-circuit emulators, does not provide any
overlay memory. To maintain real access times, the
processor must be kept as physically close to its memory as possible. There is no time available for the propagation delay that would be experienced in accessing
memory across the interface cable to the ADAPT29K.

[Figure 10 diagram: the ADAPT29K and the target connected by Data Bus 0-31, Instruction Bus 0-31, the CNTL and STAT lines, TEST, RESET, DRDY, DERR, and all other Am29000 signals (except INCLK, SYSCLK, CNTL1, CNTL0, TEST, RESET)]

Figure 10. The ADAPT29K-to-Target Interface

All target code and data is stored on the target. When
the ADAPT29K is commanded to display a data object,
it places the target Am29000 in the LOAD TEST
INSTRUCTION state. Then a sequence of instructions
is inserted to store the present Am29000 state, set up a
new memory address, load the data into an Am29000
register, store the data to the ADAPT29K, and restore
the Am29000 state.
This method imposes certain requirements. Because
data is transferred between the ADAPT29K and the
target over the data bus, the target memory must be
protected from corruption. To prevent inadvertent
changes to the target memory, it must be disabled from
responding when the ADAPT29K and the target processor are transferring data. There are two ways of doing
this: (1) the memory can be disabled by a low state on
the PIN169 alignment pin (pin D4), or (2) the target
memory can be disabled when an 06 hex is decoded on
the OPT2-OPT0 pins.
When the contents of instruction ROM must be
displayed, the ADAPT29K must instruct the processor
to read instruction ROM as data. Hence, a hardware
path must exist for data stored in the instruction ROM
space (on the instruction bus) to be loaded into an
Am29000 register from the data bus.
Similarly, when the ADAPT29K is used to download a
program, the code will be written word-by-word to the
target Am29000, which then writes the instructions into
proper memory space. Suppose, for example, code is to
be written into the instruction/data RAM. Because the
ADAPT29K has no means for virtual translation of
addresses, it will use Store instructions to write the code
into the absolute address in the instruction/data space.
When the Am29000 goes to execute the code, it will expect to fetch its instructions over the instruction bus.
This requires that there be a hardware path from the
data bus to the instruction bus and a one-to-one correspondence between addresses on the data bus and the
addresses on the instruction bus. This occurs because
the instruction is stored at an address on the data bus,
but is fetched via the instruction bus. In other words, instructions fetched from an address in the instruction
RAM space via the instruction bus must produce the
exact information as would be retrieved from the same
address in the data RAM space via the data bus.

Breakpoints
Because the Am29000 is one of the fastest commercial
processors available, there is no practical way to read
each address on the address bus and compare it
against a breakpoint table to determine if a break should
occur, as is done in an in-circuit emulator. The method
used by the ADAPT29K is to swap a halt instruction into


memory at the location of the breakpoint. When the
executing processor encounters the breakpoint, it halts.
Then, the ADAPT29K, upon detecting the halt, compares the halt address with the breakpoint table and
determines if there is a match. If there is, it swaps the
original instruction back into memory and informs the
operator that a breakpoint has occurred.
This method of setting breakpoints also contributes to
the requirement for a one-to-one translation of addresses between the data bus and the instruction bus.
For example, when the ADAPT29K sets a breakpoint in
the instruction ROM space, it does so by using the target
Am29000 to read the original instruction, then writes the
halt into the address location. This is performed as a
data movement operation, using the bi-directional path
to the instruction bus discussed in the Memory Access
section. For the breakpoint to be effective, the executing
program must encounter the breakpoint at the same address at which it was stored.
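
The breakpoint mechanism described above (swap in a halt, compare the halt address with the breakpoint table, swap the original instruction back) can be sketched as follows. This is an illustrative model in Python, not ADAPT29K code; the class and method names are hypothetical, the HALT constant is only a placeholder encoding, and removing the table entry once the breakpoint is hit is a simplification.

```python
HALT = 0x89000000  # placeholder encoding for the halt instruction (assumption)


class TargetModel:
    """Toy model of target memory plus a breakpoint table of saved instructions."""

    def __init__(self, memory):
        self.memory = dict(memory)   # address -> instruction word
        self.breakpoints = {}        # address -> original instruction

    def set_breakpoint(self, addr):
        """Save the original instruction and swap a halt into its place."""
        self.breakpoints[addr] = self.memory[addr]
        self.memory[addr] = HALT

    def on_halt(self, addr):
        """Called when the processor halts; True if addr was a breakpoint."""
        if addr not in self.breakpoints:
            return False             # a halt that did not come from the table
        self.memory[addr] = self.breakpoints.pop(addr)   # restore the original
        return True


target = TargetModel({0x1000: 0x11111111, 0x1004: 0x22222222})
target.set_breakpoint(0x1004)
hit = target.on_halt(0x1004)
print(hit, hex(target.memory[0x1004]))
```

The model also shows why the address written through the data path must be the address later fetched over the instruction bus: the table is keyed by a single address per breakpoint.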
TARGET DESIGN REQUIREMENTS
Throughout the preceding discussion, it should be clear
that the ADAPT29K only interfaces to the target via the
target Am29000, and uses only the target memory for
storage of the application program. This places certain
hardware requirements on the application. These are
listed below. For a specific example, see the Standalone
Execution Board section.
1. The physical device in the instruction ROM space
must be a RAM device if code is to be downloaded
to the instruction ROM space, or if breakpoints will
be set in the instruction ROM space.
2. A bi-directional path must exist between the instruction and data buses.
3. There must be a one-to-one translation between
instruction bus addresses and data bus addresses.
4. The ADAPT29K must be able to disable the target
memory using a low signal on the PIN169 alignment
pin (D4), or when OPT0-OPT2 are 06 hex.
5. Physical clearance must be provided for the connection of the interface cable at its proper orientation.
6. Signals driven by the ADAPT29K (see Table 5)
must be open-collector or tri-state.

Table 5. Am29000 Signals Driven by the ADAPT29K

Pin               Configuration
Alignment pin     (Input with pull-up resistor) 1, 2
D31-D0            (Tri-state)
I31-I0            (Tri-state)
DERR              (Input with pull-up resistor) 1
RESET             (Open coll. pull-up with 1 K ohm resistor)
DRDY              (pull-up resistor) 1
STAT1-STAT0       (Input)
TEST              (Open collector) 3

1. Pull-up resistors should be 330 to 1000 ohms.
2. This is an optional configuration. It is used if memory will be
disabled by the alignment pin (PIN169).
3. Note that mT is active longer than ~. Since all outputs
will be in a high-impedance state, it may be prudent to pull up all
Am29000 outputs to avoid ambiguous inputs (to other devices).

MON29K TARGET RESIDENT MONITOR

MON29K is a target-resident monitor that has functionality similar to the ADAPT29K monitor. MON29K
provides many important debugging capabilities, including memory display and alteration, code uploading and
downloading, and assembly and disassembly. However, unlike the ADAPT29K, MON29K is an entirely software product. It resides completely in the target memory
and executes on the target Am29000 (see Figure 11).

MON29K has I/O driver routines to handle two serial
ports. Either port can be used to receive commands,
although the hardware must be supplied by the target.
With the proper hardware, MON29K can receive commands from an ASCII terminal or a remote host. It also
can act as the interface between XRAY29K and the
target. MON29K is supplied in C source code form so
the I/O drivers and service routines can be modified to fit
the particular hardware environment.
Since it is entirely software, MON29K can be permanently embedded in the product. It takes only 256K of
address space in instruction ROM; thus, it can remain
with the application and be used to diagnose problems
at all stages of the product life cycle, from development
to field support .

[Figure 11 diagram: a PC or terminal, and a host computer system reached through modems over a communications link, attached to the DTE and DCE RS232 ports of the target system with MON29K installed]

Figure 11. MON29K System Connections

Table 6. MON29K Commands

FEATURES
MON29K provides powerful testing capabilities. Many
of MON29K's features are, in fact, the same as the
ADAPT29K. These include:
• Display and alteration of memory, I/O ports, and
registers. Using MON29K, target data can be
displayed, set, or altered. All Am29000 memory
spaces may be accessed, including: Am29000 internal registers (global, local, and special), coprocessor
registers, instruction/data RAM, or instruction ROM.
• In-line assembly and disassembly. MON29K comes
with a built-in, in-line assembler/disassembler.
Am29000 instruction mnemonics can be converted to
machine codes and stored at a specified location, or
ranges of addresses may be disassembled and
displayed in mnemonic form.
• Uploading and downloading of programs. MON29K
can use two serial ports, assuming they are provided
by the target hardware. One port is a data communications equipment (DCE) port; the other is a data
terminal equipment (DTE) port. Files may be
uploaded or downloaded in Motorola or Tektronix
formats. Also, XRAY29K can communicate with
MON29K through one of the ports.
• Execution control. MON29K can control target execution. It can initiate full-speed execution or single-step the processor.
• Set/reset breakpoints. Both permanent and temporary breakpoints are supported.
• On-line help. On-line help that shows the complete
syntax is available for all commands.
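
The upload/download feature above mentions Motorola-format files. As a rough illustration, and assuming that "Motorola format" means the common Motorola S-record text format (the source does not say which record types MON29K accepts), the following Python sketch builds and verifies one S3 data record. The function names are hypothetical and are not part of MON29K.

```python
def srec_checksum(payload: bytes) -> int:
    """One's complement of the low byte of the sum of count, address, and data bytes."""
    return (~sum(payload)) & 0xFF


def make_s3_record(address: int, data: bytes) -> str:
    """Build an S3 record (32-bit address) for the given data bytes."""
    addr_bytes = address.to_bytes(4, "big")
    count = len(addr_bytes) + len(data) + 1  # +1 for the checksum byte
    body = bytes([count]) + addr_bytes + data
    return "S3" + (body + bytes([srec_checksum(body)])).hex().upper()


def verify_record(record: str) -> bool:
    """Return True if the count field and checksum byte match the record contents."""
    raw = bytes.fromhex(record[2:])
    return len(raw) == raw[0] + 1 and srec_checksum(raw[:-1]) == raw[-1]
```

A loader on the host or the target would typically run a check like `verify_record` on every line before writing the data bytes to memory, so that a corrupted serial transfer is detected rather than silently stored.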
MON29K Commands
Many of the MON29K commands (and consequently
the features) are identical to those of the ADAPT29K.
The MON29K commands, all of which are implemented
in ADAPT29K, are listed in Table 6.

Command   Description
A         Assemble in memory
B         Breakpoint display, set, and reset
C         Check execution state
D         Display registers/memory
E         End execution command list
F         Fill registers/memory
G         Go (start program execution)
I         Input from a port
L         List memory
M         Move memory
N         Change the "normal character"
O         Output to a port
R         Enter remote mode
S         Set registers/memory
T         Trace (single-step) instructions
V         Save memory to a file
X         Display key registers
XC        Display/set co-processor registers
XP        Display/set protected registers
XT        Display/set TLB registers
XU        Display/set unprotected registers
Y         Load a file to memory

Differences Between MON29K and ADAPT29K
Because MON29K runs on the target processor, not as
a separate unit, it has limitations that the ADAPT29K
does not have. In particular, MON29K has no K (Kill), S
(Jam), Z (Trace), or W (interface diagnostics) commands.
MON29K is not able to assert a kill command because
when the application is running, the application controls
the processor. Clearly, when MON29K is not in control
of the processor, it has no means of evaluating serial
input and taking action. MON29K could regain control if it polled the serial I/O device, but
such continuous polling would hinder real-time execution. Instead, to allow programs to be forcefully terminated, MON29K can be configured to respond to
interrupt-driven serial I/O. When MON29K is initialized
to respond to interrupt-driven serial I/O, it intercepts a
CTRL-C and passes control to a handler that returns
control of the processor to MON29K. This technique is effective in
most cases, except if the application program has
reached a HALT instruction. Then, the system must be
reset. Usage of interrupt-driven serial I/O is determined
as an option of the
command (not present on the
ADAPT29K).
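
The CTRL-C recovery described above can be pictured with a small model. The Python sketch below is a toy simulation under stated assumptions, not MON29K's actual handler: the class and method names are hypothetical, and the model simply treats a halted processor as one that no longer services the serial interrupt.

```python
CTRL_C = 0x03  # ASCII ETX, the code a terminal sends for CTRL-C


class MonitorModel:
    """Toy model of a target-resident monitor; not MON29K's actual code."""

    def __init__(self):
        self.in_control = True   # True while the monitor owns the processor
        self.halted = False      # True once the application executes HALT

    def go(self):
        """Hand the processor to the application (full-speed execution)."""
        self.in_control = False

    def application_halts(self):
        """Model the application reaching a HALT instruction."""
        self.halted = True

    def serial_rx_interrupt(self, byte):
        """Hypothetical receive-interrupt handler; True if control was recovered."""
        if self.halted or self.in_control:
            return False         # halted: only a reset helps; in control: nothing to do
        if byte == CTRL_C:
            self.in_control = True
            return True
        return False             # other characters are ignored in this model

    def reset(self):
        """A system reset returns control to the monitor."""
        self.halted = False
        self.in_control = True
```

The model shows why polling is unnecessary: the application runs undisturbed until a character arrives, and only a CTRL-C changes who owns the processor.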

a

TARGET DESIGN REQUIREMENTS
MON29K does place some requirements on the target
design. They are listed below. For a sample implementation of the compatibility requirements, see the Standalone Execution Board section.
1. The physical part in the instruction ROM space
must be a RAM device if the code will be downloaded to the instruction ROM space, or if breakpoints will be set in the instruction ROM space.
2. The Am29000 cannot write on the instruction bus,
so a bi-directional path must exist between instruction and data buses.
3. Instruction bus addresses must produce the same
data as data bus addresses.
4. As a target-resident monitor, MON29K does take up
some of the target memory; thus, sufficient memory
must be provided for MON29K. An application using
MON29K must have 256 Kbytes of memory in the
instruction ROM space for the program, and a 64-Kbyte workspace in instruction/data RAM. Both
spaces must begin at address 0 (Or and Od).
5. If program control must be recovered from the application before it ends or returns control normally,
accommodations must be made to use interrupt-driven serial I/O. When interrupt-driven serial I/O is
used, a MON29K interrupt routine will handle a
CTRL-C by terminating the application program and
returning control to MON29K.
6. MON29K expects the serial I/O driver to be an 8530
serial communications controller. Using a different
I/O driver will require modifications to be made to
MON29K.
7. AMD cannot anticipate every possible scenario in
which the Am29000 will be introduced, and it is
possible that MON29K will require some modifications to the I/O drivers and service routines before it
can run on the target. Although binary code is available from AMD, MON29K is supplied in source code
form. Of course, any changes will have to be compiled using a C compiler that produces object modules for the Am29000.
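
The memory-related requirements in the list above (items 1 and 4) can be checked mechanically against a planned memory map. The following Python sketch is only an illustration; the `Space` type and the function name are hypothetical, and the checks encode nothing beyond the sizes and starting addresses stated in the list.

```python
from dataclasses import dataclass

KBYTE = 1024
ROM_BYTES_NEEDED = 256 * KBYTE   # instruction ROM space for the MON29K program
RAM_BYTES_NEEDED = 64 * KBYTE    # instruction/data RAM workspace


@dataclass
class Space:
    start: int        # first address of the populated region
    size: int         # populated bytes
    writable: bool    # True if the space is built from RAM devices


def mon29k_memory_problems(rom, ram, download_or_breakpoints_in_rom):
    """Return a list of unmet requirements (empty if the map is acceptable)."""
    problems = []
    if rom.start != 0 or rom.size < ROM_BYTES_NEEDED:
        problems.append("instruction ROM space needs 256 Kbytes starting at address 0")
    if ram.start != 0 or ram.size < RAM_BYTES_NEEDED:
        problems.append("instruction/data RAM needs 64 Kbytes starting at address 0")
    if download_or_breakpoints_in_rom and not rom.writable:
        problems.append("instruction ROM space must be built from RAM devices")
    return problems
```

An empty result means only that these three conditions hold; the bus-path and serial I/O requirements in the list still have to be reviewed against the schematic.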

XRAY29K SOURCE-LEVEL DEBUGGER
XRAY29K, the high-level/assembly-level debugger, is a
program that provides an interactive, windowed environment for debugging Am29000-based systems.
Using XRAY29K, program statements may be read in
source language, and data objects may be modified and
changed by referencing symbol names. Thus, target operations can be performed using source-level
constructs, rather than machine codes and numeric
addresses. To further clarify the target environment,
XRAY29K's multi-window interface simultaneously
displays user-selected program information.
Commands are issued to XRAY29K using a comprehensive debugger command language. The language
supports a wide range of functions, including setting
breakpoints, single-stepping, and examining or altering
any C- or assembly-language variables. The language
syntax is very similar to C, and also supports debugging
commands, creation of symbols during a debugging
session, and convenient specification of address
ranges.
XRAY29K resides on a host system and communicates
with the target system through either the ADAPT29K or
MON29K. Frequently, the host system is an engineering
workstation attached to the ADAPT29K, as shown in
Figure 12. In that system, XRAY29K provides a comfortable user-interface, while operations are asserted on
the target by the ADAPT29K. Alternately, XRAY29K
could reside on a mainframe and communicate with a
target running MON29K. The user interface could then
be done via an ASCII terminal.
FEATURES
XRAY29K supports source-level debugging in either of
two modes: high-level or assembly-level. In high-level
mode, an application can be debugged using C-language expressions and statements. In this way, C
variables and expressions replace numeric addresses
for memory access, and the code can be viewed by line
number or procedure name.
In assembly-level mode, an application can be
debugged using assembly-language statements. The
assembly-level mode additionally allows machine-level
register and status bit manipulation.
Commands are given to XRAY29K using its powerful
debugger language, thus gaining access to XRAY29K's
full range of debugging services. The services include:
• Setting and examination of memory and register
contents using the declared format and the variable
name.
• Simple and complex breakpoints that can be set and
removed in either C-language or assembly-language
source code.
• Single-step and full-speed program execution.
• Assembly and disassembly of object code.
• Simulated I/O and interrupts.
• Execution time measurement.


[Figure 12 diagram: XRAY29K, running on the PC, connects from the PC or terminal through the DCE RS232 port to the ADAPT29K, which connects to the target.]

Figure 12. XRAY29K System Connections
The commands for manipulating memory and registers
are shown in Table 7.

Table 7. XRAY29K Memory and Register Commands

Command   Description
compare   Compare two blocks of memory
copy      Copy a memory block
fill      Fill a memory block with values
search    Search a memory block for a value
setmem    Change the values of memory locations
setreg    Change a register's contents
test      Examine memory area for invalid values

Commands for controlling program execution are listed
in Table 8; other display commands are listed in Table 9.

Table 8. XRAY29K Breakpoint and Execution Commands

Command            Description
breakinstruction   Set an instruction breakpoint
clear              Clear a breakpoint
go                 Start or continue program execution
gostep             Execute macro after each instruction step
step               Execute a number of instructions or lines
stepnocall         Step, but execute through procedures


Table 9. XRAY29K Display Commands

Command       Description
disassemble   Display disassembled memory
dump          Display memory contents
expand        Display a procedure's local variables
find          Search for a string
fopen         Open a file or device for writing
fprintf       Print formatted output to a viewport
list          Display C source code
monitor       Monitor variables
next          Find string's next occurrence
nomonitor     Discontinue monitoring variables
printf        Print formatted output to command viewport
printvalue    Print a variable's value

Windowed Information Display
XRAY29K shows all critical program information at once
in multi-windowed displays. The contents of the runtime stack, the selected general-purpose registers, the
current source lines being executed, or virtually any
other program information, can be checked at a glance,
without the need to constantly request each piece of
information individually.
Information is grouped into screens, which are composed of one or more windows of specific data called
viewports. There are three predefined screens: high-level, assembly-level, and standard I/O. Distributed
among these screens are the 17 predefined viewports
listed in Table 10.

Figure 13 shows the high-level mode screen display. It
has four viewports: data, trace, code, and command.
This screen is displayed when an object module generated by a C source program is executed.

Figure 14 shows the assembly-level mode screen
display. It has five viewports: data, stack, disassembled
code, registers (Am29000), and command. This screen
is displayed when an object module generated by an
assembly-language program is executed.

Table 10. XRAY29K Predefined Viewports

Viewport         Description
Command(2)       Debugger commands are submitted to XRAY29K from this viewport. There is a command viewport for both high-level and assembly-level modes.
Code(2)          Displays source code in high-level mode or disassembled instructions in assembly-level mode.
Data(2)          Displays monitored variable expressions in high-level and assembly-level mode.
Trace            Shows the procedure calling chain (high-level mode only).
Stack            Shows stack contents beginning from the stack pointer (assembly-level mode only).
Register         Displays current values of Am29000 registers (assembly-level mode only).
Status Line(2)   Used for debugger command information such as CPU type, current module name, and current operation. This viewport is present in both high-level and assembly-level modes.
Standard I/O     Shows interactive information being received from the std.in or sent to the std.out.
Break            Shows breakpoint information such as number, address, module name. Temporarily overlays top of screen when breakpoint is encountered.
Error            Appears when an error occurs to indicate type and source of error.
Help             Shows on-line help information when requested.
Log              Displays logged keystrokes.
Journal          Shows all previous commands and their results.

[Figure 13 screen: DATA, TRACE, CODE, and COMMAND viewports. The CODE viewport lists the first lines of the C source file sievex.c, a scaled-down Sieve of Eratosthenes prime number calculation; the COMMAND viewport notes that execution is in the startup routine and that F9 goes to main.]

Figure 13. The Standard High-Level-Mode Screen


[Figure 14 screen: DATA, STACK, CODE, REGISTERS, and COMMAND viewports. The CODE viewport shows disassembled Am29000 instructions beginning at address 00010004; the REGISTERS viewport shows special, global, and local register values; the COMMAND viewport reports an automatic halt at address 0x00010004 in the startup routine and notes that F9 goes to main.]

Figure 14. The Standard Assembly-Level Mode Screen

The standard I/O screen has one regular viewport, the
standard I/O viewport, although the breakpoint, error,
and help viewports also will appear. The standard I/O
screen is used when interactive input is requested from
the standard input device, or when output is directed to
the standard output device.
The viewport commands, shown in Table 11, control the
way information is displayed on the screen. By using the
viewport commands, a viewport's size, color, and cursor
position can be changed. Viewports can be added or
deleted, and custom screens and viewports can be
defined.

Table 11. XRAY29K Viewport Commands

Command   Description
vactive   Activate a viewport
vclear    Clear data from a viewport
vclose    Remove user-defined viewport or screen
vcolor    Select viewport colors
vmacro    Attach a macro to a viewport
vopen     Create a screen or viewport or change size
vsetc     Set a viewport's cursor position
zoom      Increase or decrease a viewport's size

Utility Functions
In addition to its powerful features for execution control
and display of system information, XRAY29K provides
several utility features. These features ease debugging
by streamlining the routine operations. The services
include command keys, macros, and command files.
Command Keys
The most frequently used XRAY29K functions have
been assigned to a key combination referred to as a
"command key." By using command keys, common
debugger commands can be entered with the minimum
number of keystrokes, often only one key or a CTRL-key
combination.
Macros
XRAY29K has a powerful, multifaceted macro facility.
Because a macro may contain complex user command
procedures, which are executed by entering the macro
name on the command line, the facility can be used for
several purposes. Table 12 shows the debugging
language's macro-related commands.
Table 12. Macro Commands

Command   Description
define    Create a macro
show      Display the macro source

Macros can be invoked when a breakpoint is encountered. Powerful conditional and looping statements in
the command language allow the macro to evaluate
program or register variables, and alter program flow
depending on their condition. Hence, macros can be
used to establish very complex breakpoints that take
specific action, depending on their environment.
Macros also can be attached to user-defined viewports.
When the associated window is opened, the macro will
execute. This type of macro can write specific data into
the window, which is useful for monitoring environmental information.
Command/Batch Files
XRAY29K can process command files. A command file
contains one or more debugger commands that can be
processed by XRAY29K automatically, without the need
for user interaction. This is also called batch-mode
operation. Command files can be used to recreate a
debugging session, easily implement automated test
procedures, and eliminate reentering of frequently used
command sequences.
Other XRAY29K Utility Functions
XRAY29K possesses several other utility functions.
These include services for manipulating symbols,
evaluating expressions, setting display and recording
modes, and controlling the session. Table 13 lists
the symbol commands, Table 14 lists the miscellaneous utility commands, and Table 15 lists the session
commands.

Table 13. Symbol Commands

Command        Description
add            Create a symbol
delete         Delete a symbol from the symbol table
printsymbols   Display symbol, type, and address
scope          Specify current module and procedure scope

Table 14. Miscellaneous Utility Commands

Command       Description
cexpression   Calculate an expression's value
error         Set include file error handling
help          Display on-line help screen
include       Read in and process a command file
log           Record debugger commands and errors in a file
mode          Select debugger mode (high-level or assembly-level)
option        Set debugger options for this session
pause         Pause simulation
reset         Simulate processor reset
restart       Reset the program starting address
startup       Save the default start-up options

Table 15. Session Commands

Command   Description
host      Enter the host operating system environment
load      Load an object module for debugging
quit      End a debugging session

TARGET DESIGN REQUIREMENTS
XRAY29K itself places no restrictions on the target
hardware design. However, being strictly a software
product, XRAY29K needs a hardware connection to
the target. For debugging Am29000-based systems,
XRAY29K must be used in conjunction with either
ADAPT29K or MON29K; the target design requirements for those tools apply.


XRAY29K requires a host system. Versions of
XRAY29K currently exist for UNIX and DOS environments.
XRAY29K works only with object files that have been
compiled in such a way that they contain debugger information regarding line numbers, etc. Thus, to use
XRAY29K, either the ASM29K macro-assembler or
HighC29K cross-compiler must be used, as well as the
ASM29K linker. These are explained in the "29K Tool
Chain" section.

Am29000 PROBE INTERFACE
The Am29000 probe interface provides a non-intrusive,
low-capacitance connection to an Am29000. Inserted
between the processor and its socket, the probe interface makes the Am29000 pins available for convenient
attachment to a logic analyzer or other test equipment.
Figure 15 shows the probe interface.
The software available with the probe interface supplies
configuration information about the Am29000 pins and
instruction mnemonics to either an HP 1650 or 16500
logic analyzer for display formatting. When the display is
formatted, the logic analyzer will disassemble instructions into mnemonics and display processor, bus, and
error status, as well as data bus activity. Figure 16
shows how the probe interface is connected between
the logic analyzer and the target.
Although the probe interface was designed for the HP
1650 or 16500 logic analyzer, any type of test equipment can be attached to it. The following discussion
assumes a connection to an HP 1650 or 16500 logic
analyzer, unless otherwise stated.

11014A-15

Figure 15. The Probe Interface

[Figure 16 diagram: the probe interface connects the logic analyzer to the Am29000-based system.]

Figure 16. Connection of the Probe Interface
FEATURES

The probe interface can add important event-triggering and high-speed (10 ns) resolution capabilities,
including:

• Convenient connection to the target.
• Low-capacitance probing.
• Complete status information, including identification
of burst, pipeline, and simple accesses.
• Status reporting of bus conditions, such as slave
accesses, wait states, and co-processor transfers.
• User-configurable setup and hold parameters allow
triggering on a specific target condition.
• Monitoring of all Am29000 signals except INCLK.

The probe interface comes with the disassembler,
configuration files, and a user's manual.

DISPLAYS

Figure 17 shows data bus information, as would be
shown on an HP 16500 logic analyzer. Figures 18 and
19 show signal state and timing screens and the disassembly screen for the 16500 analyzer.

TARGET DESIGN REQUIREMENTS

Because the probe interface only monitors Am29000
signals, there are no particular target compatibility
requirements except for sufficient clearance to install
the probe interface. Most applications will not be
affected by the low-capacitance, high-impedance connection; however, see the probe interface data sheet for
electrical and physical specifications.

Apart from supporting the physical size and electrical
specifications of the connection, a logic analyzer is
needed. The logic analyzer should have 80 to 160 state
channels. Some termination adapters also are needed,
depending on the number of state channels on the logic
analyzer.

[Figure 17 screen: HP 16500 state listing titled "AM29000 Data Bus", with columns for access type, bus status, data, R/W, and STAT (hex); it shows simple-access reads separated by wait states.]

Figure 17. HP 16500 Data Bus Information Display

[Figure 18 screen: HP 16500 waveform display showing Am29000 request, ready, grant, and STAT signal traces.]

Figure 18. HP 16500 Signal and Timing Display


[Figure 19 screen: HP 16500 state listing titled "AM29000 Disassembly", showing addresses, instruction mnemonics with operands (for example CONSTH, IRET, ASNEQ, JMP, SUB, and ASGEU), and bus status such as burst continuation, interrupt return, and wait states.]

Figure 19. HP 16500 Disassembly Listing

SUMMARY OF THE TOOLS
From the sections on ADAPT29K, MON29K, XRAY29K,
and the probe interface, it should be clear that a comprehensive range of tools exists for developing
Am29000-based systems. Each of the available tools
has unique characteristics that make it more advantageous in particular situations. Depending on the characteristics of the application, one or all of the tools may be
needed. This section summarizes the information
presented in the previous sections with emphasis on
highlighting what conditions are most appropriate for a
particular tool or tool combination, and what compatibility requirements are placed on the target as a result of
the tool selection.
SELECTION GUIDE

In the development phase of virtually any Am29000-based system, either the ADAPT29K or MON29K will be
needed. It is possible to debug a microprocessor system
with only a logic analyzer and a PROM programmer, but
this method is not very practical when compared against
the following ADAPT29K and MON29K features:

• Memory display and modification, including special
registers.
• Uploading and downloading of programs.
• Execution control, including setting breakpoints and
single-stepping.

Apart from the advantages gained from MON29K and
the ADAPT29K, their performance can be augmented in
certain situations if they are combined with XRAY29K
and/or the probe interface with a logic analyzer. The
following questions highlight the critical target characteristics that suggest the optimum tool selection.

How much memory does the target have?
Perhaps the most crucial factor in deciding whether the
ADAPT29K or MON29K is most appropriate is
the size of the available target memory. This determines whether or not MON29K can be used. Because
MON29K is target resident, it is necessary that the
target have at least 256 Kbytes of space in instruction
ROM, and 64 Kbytes of instruction/data RAM for
MON29K's workspace. An application without this
memory space will not be able to use MON29K, and will
have to use the ADAPT29K.

For systems with sufficient memory, MON29K,
ADAPT29K, or both may be used. While both have
excellent debugging features, the ADAPT29K has some
features MON29K does not, including:

• Provides a bus trace facility
• Can halt a failing program
• Can force execution of an Am29000 instruction
• Provides memory diagnostics
• Can be used with a target that cannot run its
program

It should be noted that in most cases (see the Differences Between MON29K and ADAPT29K section),
MON29K can halt a crashed program if interrupt-driven serial I/O is provided on the target, and the target
still is responding to interrupts.
How many units will be produced?
The number of units to be produced determines the
volume over which the development and servicing costs
can be defrayed. The ADAPT29K, while more powerful
than MON29K, costs more and will raise the amount
of nonrecurring charges that must be recovered. Of
course, the difference will be insignificant for the advantages gained in large volumes. In fact, it may be advisable to use the ADAPT29K when the product is in
development and final test, using MON29K for field
service.
How and where will servicing be performed?
Servicing can be performed on-site or at service centers. Often this depends on the size, function, and value
of the application system. If the system is moved to a
service center for repair, the ADAPT29K will provide the
most capabilities, particularly when coupled with the
probe interface and XRAY29K.

However, the ADAPT29K may be too bulky to perform
maintenance on-site. MON29K can be embedded in the
application and used to diagnose faults via a portable
ASCII terminal or PC (which could run XRAY29K).
How complex is the program?
If the program is complex, XRAY29K should be considered. Debugging complex programs using hex values
and physical addresses can be very time consuming
and error prone, especially programs containing many
modules. Often, XRAY29K's windowed interface and
source-level debugging language will greatly reduce
time spent tracking down errors encountered in address
calculations, decimal to hex conversions, or just looking
up values in a listing.
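
The questions in this selection guide can be condensed into a small decision helper. The Python sketch below is one possible reading of the guide, not an AMD-supplied procedure: the function name and parameters are hypothetical, the production-volume question is not modeled, and on-site servicing is simplified to "prefer MON29K".

```python
def candidate_tools(rom_kbytes, ram_kbytes, target_runs_code,
                    on_site_service, complex_program):
    """Suggest tools from the guide's memory, servicing, and complexity questions."""
    # MON29K is target resident: it needs 256 Kbytes of instruction ROM space,
    # 64 Kbytes of instruction/data RAM, and a target able to run its program.
    mon29k_possible = rom_kbytes >= 256 and ram_kbytes >= 64 and target_runs_code
    if not mon29k_possible:
        tools = ["ADAPT29K"]
    elif on_site_service:
        tools = ["MON29K"]               # the ADAPT29K may be too bulky on-site
    else:
        tools = ["ADAPT29K", "MON29K"]   # either or both may be used
    if complex_program:
        tools.append("XRAY29K")          # source-level debugging for complex code
    return tools
```

For example, a target with 128 Kbytes of instruction ROM space cannot host MON29K, so the helper returns only the ADAPT29K regardless of the servicing plan.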

SUMMARY OF COMPATIBILITY REQUIREMENTS

Once a combination of tools has been selected, it is
important to ensure that they will be compatible with the
target system. The following lists summarize the compatibility requirements for each tool. More detailed
explanations can be found in the specific sections
related to the particular tool.

ADAPT29K

1. The target must support RAM in instruction ROM.
2. A bi-directional path must exist between the instruction and data buses.
3. There must be a one-to-one translation of
addresses between buses.
4. Target memory must be disabled either by a low
signal on the alignment pin (04), or when OPT2-OPT0 are 06 hex.
5. There must be physical clearance for the connection of the interface cable at the proper orientation.
6. The signals driven by the ADAPT29K must be open-collector or three-state.

MON29K

1. The target must support RAM in instruction ROM.
2. A bi-directional path must exist between the instruction and data buses.
3. There must be a one-to-one translation of
addresses between buses.
4. The system memory must include 256 Kbytes in
instruction ROM beginning at Address 0 to store the
MON29K program, and 64 Kbytes of instruction/data
RAM at Address 0 for MON29K's workspace.
5. If program control must be recovered from the
application without it ending or returning control
normally, accommodations must be made to use
interrupt-driven serial I/O.
6. The I/O drivers may have to be modified.

XRAY29K

1. Requires a host system, such as an engineering
workstation.
2. Requires MON29K or ADAPT29K.

Probe Interface

1. Requires a logic analyzer (an HP 1650 or 16500 is
recommended).
2. Requires termination adapters.
3. There must be sufficient physical clearance to allow
the probe to be attached to the target.


A COMPATIBILITY EXAMPLE:
STANDALONE EXECUTION BOARD
The Standalone Execution Board (STEB) is an excellent
example of compatibility with all the development tools.
It is a complete Am29000-based system that can run
many types of programs, including the software packages MON29K and VRTX32/29000®.
The STEB can also be used with the ADAPT29K and/or
the HP probe interface. STEB also can be used as an
execution vehicle for application software or a comparison system for isolating hardware faults.
This section focuses on how the STEB's design
achieves compatibility with the development tools. The
major areas of the STEB are discussed, with emphasis
on how each area contributes to compatibility. See
Figure 20 for a block diagram of the STEB.

[Figure 20 block diagram: the Am29000 processor with the system address bus and data bus; a buffered address bus serves the instruction/data RAM space, Banks #0 through #3.]

Figure 20. Block Diagram of the STEB


A 9513A timing controller is installed at U55-58 and
U64 on Sheet 10. The 9513A supports up to five 16-bit
counters. Address decoding for various timer functions
is provided by a PAL (U56 on Sheet 10). The clock
source can be from the Am29000, a hardware oscillator,
or a crystal oscillator.

FUNCTIONAL DESCRIPTION

Mounted on a single card, the STEB contains an
Am29000 with memory, I/O, and system timing
resources. See Appendix A for schematic diagrams,
Sheets 1 through 12. In addition to the Am29000 (U51
on Sheet 2), the STEB supports the Am29027 arithmetic
accelerator (U10 on Sheet 3). The Am29027 is capable
of high-speed, single-precision and double-precision
arithmetic using fixed and floating-point numbers. It can
be operated in pipelined or non-pipelined (flow-through)
mode, depending on system capability and requirements. The pipelined mode maximizes the overall
execution time for scalar operations.

Power to the STEB is provided by a series-regulated
power supply that provides a regulated +12 VDC and
+5 VDC to the board. Connectors are furnished for attachment to the type of power supply used with PCs.
CIRCUIT AREAS CONTRIBUTING TO
COMPATIBILITY

System timing can be provided by one of two methods.
The Am29000 itself can generate the system clock,
which is output on the SYSCLK pin; or circuitry on the
board (U8, U9 on Sheet 4) can generate an external
clock signal that can be applied to the SYSCLK pin of the
processor. Clock selection is done by jumpers.

In the following section, circuit sections related to
compatibility issues are described. The circuit sections
are referenced by their locations on the STEB, as
indicated in Figure 21.

Memory is supported in both the instruction ROM and
instruction/data RAM spaces. Using a DIP switch (SW3
on Sheet 7), from 0 to 7 wait states may be selected.
Each space has its own wait-state generator, and may
be configured separately, depending on the access
speed of the installed memory devices.
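
To choose a wait-state setting for a given memory device, the access time can be compared with the clock period. The Python sketch below is a simplified estimate under explicit assumptions that do not come from the source: an access with zero wait states is assumed to complete in one clock cycle, and `overhead_ns` is a hypothetical allowance for address decoding and buffer delays. The actual setting should be taken from the Am29000 and memory data sheet timing.

```python
import math

MAX_WAIT_STATES = 7  # the DIP switch selects from 0 to 7 wait states per space


def wait_states_needed(access_ns, clock_mhz, overhead_ns=0.0):
    """Estimate wait states, assuming a zero-wait access takes one clock cycle."""
    cycle_ns = 1000.0 / clock_mhz
    cycles = math.ceil((access_ns + overhead_ns) / cycle_ns)
    waits = max(0, cycles - 1)
    if waits > MAX_WAIT_STATES:
        raise ValueError("memory too slow for the available wait-state range")
    return waits
```

Because each space has its own wait-state generator, the estimate would be made once for the devices in the instruction ROM space and once for those in the instruction/data RAM space.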

ADAPT29K and MON29K Compatibility

Because the ADAPT29K and MON29K are very similar
to each other, several STEB design aspects simultaneously address their compatibility requirements. These
include the type of memory supported, and the bus
architecture for accessing memory.

[Figure 21 diagram: the Am29000 processor, address buffers, and wait-state switch SW3; the instruction bus, data bus, and buffered address bus; the ROM space (EPROM or RAM, Banks #0 and #1); the swap buffers; and the instruction/data RAM space (Banks #0 through #3). Control-signal annotations (DREQT1-DREQT0, OPT2-OPT0, R/W) mark the selected data path.]

Figure 21. Data Read from Instruction/Data RAM

Support for RAM Devices in the Instruction ROM Space

the result to the transceiver, the STEB channels data
between the buses at the appropriate time.

The STEB supports RAM in the instruction ROM (U25,
U32 on Sheet 5) space and the instruction/data RAM
(U33-U43 on Sheets 6 and 7) space. The instruction
ROM space has a maximum capacity of 1024 Kbytes
and uses 27010 EPROMs. The instruction/data RAM
space has a maximum capacity of 512 Kbytes and uses
32-Kbyte x 8 static RAMs.

The swap buffer is not required in many straightforward
operations . .For example, when assembling/disassembling instructions or reading/writing other data into the
instruction/data RAM space, data is written directly to
the instruction/data RAM space overthe data bus. likewise, a standard instruction fetch from the instruction
ROM space does not require the swap buffer, as instructions may be loaded directly into the processor's instruction pre-fetch buffer from the instruction bus.

Instructions may be executed from either space. So that
programs can be downloaded via the AOAPT29K or
MON29K, the instruction ROM area can be constructed
from 32-Kbyte x 8 static RAMs. However, the maximum
memory size using RAM is limited to 256 Kbytes.

However, when disassembling instructions in the
instruction ROM space, the instructions must be read as
data, which makes the swap buffers necessary. The
configuration of the IREOT bits causes an instruction
to be accessed from the instruction ROM, gated onto
the data bus, and read into the processor. Note the
combination of control signals indicated on the side of
the figure. They are used to select the path for data
movement.

Swap Buffer
On the STEB, a swap buffer provides the necessary
bi-directional path between the data bus and the instruction bus (U11-U14 on Sheet 2). The swap buffer is
created from four 74ALS245 octal bus transceivers.
Transfer direction and timing are controlled by the
transceiver's ENA and A~B inputs. By decoding the
OREOT1-0REOTo, IREOT, o PT2-0PTo, OREO, and
IREO signals (U17, U18, U49 on Sheet 4) and applying

~

I
Buffers

It....
I~

L..-_ _....1 "

Similarly, when instructions are fetched from the instruction/data RAM, they must be transferred to the instruction bus from the data bus. The direction of data
movement is shown by the darkened path in Figure 22.

~

Am29000
Processor

••••••"~

I.~,.

•

..

.....
1

DREO= a
IREOT = 0
OPT2-OPTo = XXX

R!iJ=X

Wait State ....._ _...
SW3

Instruction
Bus

Buffered
Address
Bus
ROM
Space
EPROM
or RAM
Bank #0
Bank #1

k"':::==::j....l
A

.
...

Instruction/Data
RAM Space
Bank #0
Bank #1
Bank #2
Bank #3

Swap

Data
Bus

IL.I I•
I~ •"

Buffers"

.
..

'-4

11014A-22

Figure 22. Instruction Fetch from Instruction/Data RAM

3-67
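The swap-buffer rules described above can be summarized as a small decision function. This is only a sketch of the behavior stated in the text, not the PAL equations on the board; the enum names and the returned strings are hypothetical labels chosen for illustration.

```python
from enum import Enum


class Space(Enum):
    ROM = "instruction ROM space"
    RAM = "instruction/data RAM space"


class Access(Enum):
    INSTRUCTION_FETCH = "instruction fetch"
    DATA_READ = "data read"
    DATA_WRITE = "data write"


def swap_buffer_path(access, space):
    """Return the swap-buffer direction, or None when the buffer stays idle."""
    if space is Space.ROM:
        if access is Access.INSTRUCTION_FETCH:
            return None  # instructions go straight down the instruction bus
        if access is Access.DATA_READ:
            return "instruction bus -> data bus"  # e.g. disassembling ROM contents
        return "data bus -> instruction bus"  # e.g. downloading into RAM devices
    # Instruction/data RAM space sits on the data bus.
    if access is Access.INSTRUCTION_FETCH:
        return "data bus -> instruction bus"
    return None  # ordinary data reads and writes need no swap


if __name__ == "__main__":
    for space in Space:
        for access in Access:
            print(f"{access.value:18} {space.value:27} {swap_buffer_path(access, space)}")
```

Only two kinds of access cross between the buses: data accesses to the instruction ROM space and instruction fetches from the instruction/data RAM space.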

29K Family Application Notes

One-To-One Address Translation
Note that addresses in both memory spaces have a
one-to-one translation. This means that when a data object is stored at a given address in the instruction/data
RAM space, the exact same data object will be retrieved
when the same address is asserted by an instruction
fetch to the instruction/data RAM space. This is an
important requirement for assuring compatibility with
the ADAPT29K and MON29K because when they are
downloading programs, they store instructions as data
over the data bus. Neither tool has the capability to
translate a virtual address, so when the program is
executed it must find its instructions at their absolute
addresses.
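The one-to-one requirement can be illustrated with a toy model in which a single memory array is reachable from both buses. The class and function names below are hypothetical, and the instruction words are arbitrary sample values; the sketch only shows that a word stored as data is fetched back unchanged from the same address.

```python
class InstructionDataRam:
    """Toy model: one array of words, written as data and fetched as instructions."""

    def __init__(self):
        self._cells = {}

    def data_write(self, address, word):
        # A download tool stores each instruction as data over the data bus.
        self._cells[address] = word & 0xFFFFFFFF

    def instruction_fetch(self, address):
        # No address translation: the fetch uses the very same address.
        return self._cells[address]


def download(ram, base, words):
    """Store 32-bit words at consecutive word addresses starting at base."""
    for index, word in enumerate(words):
        ram.data_write(base + 4 * index, word)


if __name__ == "__main__":
    program = [0x03004000, 0xCE000040]  # arbitrary sample words
    ram = InstructionDataRam()
    download(ram, 0x100000, program)
    fetched = [ram.instruction_fetch(0x100000 + 4 * i) for i in range(len(program))]
    print(fetched == program)
```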
ADAPT29K Compatibility
In addition to the elements discussed in the ADAPT29K
and MON29K Compatibility section, certain considerations were added to the STEB's design strictly for the
ADAPT29K. These include tri-stating the control lines
driven by the ADAPT29K and disabling memory during
data transfers to and from the ADAPT29K.

Tri-Stated Control Lines
The STEB must relinquish some control lines to the
ADAPT29K when it is operating. Therefore, these lines
are tri-stated or open-collector, as was described in
Table 7, thus preventing contention that may cause
unpredictable results.
When the ADAPT29K is not connected to the target, the
CNTL0 and CNTL1 lines are pulled high to ensure that
the processor is in a normal mode of operation. When
the ADAPT29K is connected to the target, it isolates the
CNTL1-CNTL0 signals from the board. Any use of those
signals by the application will be inhibited.

Memory Disable
The STEB supports both methods of disabling memory
for ADAPT29K accesses. Via a jumper selection, the
STEB can be configured to either decode an 06 hex on
the OPT bits or disable memory when the alignment pin
is low.
When jumper JP7 (on Sheet 7) has pins 1 and 2
connected together, it causes the SEL_OP signal to PAL
U20 (on Sheet 7) to be high. The ROM/RAM decode
circuit (composed of U15, U20, U21, and U24 on Sheets
6 and 7) then decodes the OPT2-OPT0 pins to determine whether or not memory should be enabled.
Memory is disabled by a low state on the alignment pin
(D4) when jumper JP7 is used to connect pins 2 and 3
together. The low condition is decoded by the ROM/RAM
decode circuit, which then disables memory.
When the ADAPT29K is not installed, the alignment pin
is pulled high to prevent inadvertent and/or intermittent
memory disables.
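The two memory-disable methods can be sketched as one function. This models only the behavior described in the text, under the assumption that the "06 hex" code is the sole OPT value that disables memory; the function and argument names are hypothetical.

```python
ADAPT29K_OPT_CODE = 0x6  # the "06 hex" value decoded on OPT2-OPT0


def memory_enabled(jp7_pins, opt, alignment_pin_high):
    """Return True when the ROM/RAM decode circuit leaves memory enabled.

    jp7_pins           -- (1, 2) selects OPT decoding, (2, 3) selects the alignment pin
    opt                -- value on OPT2-OPT0 (0 to 7)
    alignment_pin_high -- True when the alignment pin (D4) is high
    """
    if jp7_pins == (1, 2):
        return (opt & 0x7) != ADAPT29K_OPT_CODE
    if jp7_pins == (2, 3):
        return alignment_pin_high
    raise ValueError("JP7 must connect pins 1-2 or pins 2-3")


if __name__ == "__main__":
    print(memory_enabled((1, 2), 0x6, True))   # ADAPT29K access: memory disabled
    print(memory_enabled((2, 3), 0x0, True))   # pin pulled high: memory enabled
```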
MON29K Compatibility
Apart from the requirements mentioned in the
"ADAPT29K and MON29K Compatibility" section,
MON29K needs at least one, and preferably two, serial
port(s) to communicate with the host/operator. It also
needs sufficient memory to contain the software.

Serial Ports
The serial ports are provided by the 8530 serial communications controller (SCC) and support circuits located
at U1, U2, and U5-U7 (on Sheet 8). The SCC is a dual-channel, multi-protocol data communications peripheral
designed for use with 8-bit and 16-bit microprocessors.
The interrupt request line INT can be wired to provide
a trap or interrupt to the processor for MON29K.
Dip switches on the board are used to select port
characteristics.
Because the 8530 is a dual-port device, it supports both
the DTE and DCE RS232 ports on the STEB. The ports
are standard RS232 ASCII ports. The DCE can be used
to communicate with an ASCII terminal or PC running a
terminal emulator; the DTE port can communicate with a
remote host such as a UNIX machine.
Because the C language does not differentiate between
address spaces, the serial ports must be memory-mapped into the Am29000 data space. This requirement allows C code to be used in place of assembly
language.

Sufficient Memory Space
Sufficient memory is provided on the STEB for
MON29K. There is also room for additional application
programs in the ROM space. The space normally is configured with MON29K in EPROMs (Bank 0), and RAM in
the remaining banks. MON29K then could be used to
download an application into the RAM in the instruction
ROM space.
MON29K also uses 64 Kbytes of workspace in RAM.
This is provided for, with additional space available for
use by the application program.
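The capacities quoted for the two memory spaces, and the RAM left over after the MON29K workspace, can be checked with a little arithmetic. The sketch assumes four byte-wide devices per bank on the 32-bit buses, two banks in the ROM space and four in the instruction/data RAM space (as drawn in Figures 21 and 22), and that the 64-Kbyte workspace is taken from the instruction/data RAM space. The names are hypothetical.

```python
KBYTE = 1024
DEVICES_PER_BANK = 4  # assumption: four byte-wide devices span a 32-bit bus


def space_capacity(banks, device_bytes):
    """Capacity of a memory space built from banks of byte-wide devices."""
    return banks * DEVICES_PER_BANK * device_bytes


rom_with_27010 = space_capacity(banks=2, device_bytes=128 * KBYTE)
rom_with_srams = space_capacity(banks=2, device_bytes=32 * KBYTE)
ram_space = space_capacity(banks=4, device_bytes=32 * KBYTE)
mon29k_workspace = 64 * KBYTE
ram_left_for_application = ram_space - mon29k_workspace

if __name__ == "__main__":
    print("ROM space, 27010 EPROMs :", rom_with_27010 // KBYTE, "Kbytes")
    print("ROM space, 32K x 8 RAMs :", rom_with_srams // KBYTE, "Kbytes")
    print("Instruction/data RAM    :", ram_space // KBYTE, "Kbytes")
    print("RAM left after MON29K   :", ram_left_for_application // KBYTE, "Kbytes")
```

Under these assumptions the results agree with the 1024-Kbyte, 256-Kbyte, and 512-Kbyte figures given in the text.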
Built-In Probe Interface
The STEB includes built-in probe interface connectors.
Thus, test equipment like the HP1650 or 16500 logic
analyzer can be connected directly to the STEB, eliminating the requirement for a separate probe interface.


Appendix A: STEB Schematic Diagrams

[The STEB schematic sheets are not reproduced here. Notes legible on the sheets:

• JP1 through JP6 select RAM or ROM. On the same block, all jumpers should point in the same direction.
• JP8 selects whether the clock is INCLK driven or SYSCLK driven.
• ROMs (U25-U32, ROM Bank 0 and ROM Bank 1, drawn as Am27C512 devices) can be 27010, 27512, or 27256 EPROMs, or 32K x 8 or 8K x 8 RAMs. The 27010 (128K x 8 EPROM) requires additional pin connections.
• RAMs (RAM Bank 0 through RAM Bank 3) can be 32K x 8 or 8K x 8 RAMs.]

Preparing PROMs Using the
Am29000 Development Tools
Application Note
by Manoj Desai and Doug Walton

INTRODUCTION
Source code for a given application must be converted to executable Am29000™ object code and transferred to the appropriate storage media before it can be executed in a real system. Usually several utilities are involved; these include:

• Assemblers
• Compilers
• Linkers
• Format translators (optional, depending on the destination media)

This application note shows how an example program in source code form is made into object code and downloaded to a target board with the ADAPT29K™ Advanced Development and Prototyping Tool, or programmed into PROMs.

THE 29K TOOL CHAIN
The 29K™ tool chain is used to produce the executable object module. The tool chain is an integrated set of programs that includes compilers, assemblers, linkers, and format translators. These programs perform the operations necessary to translate the source code into a machine-readable format. The components of the 29K tool chain are:

• HighC29K™ Compiler
• ASM29K™ Assembler
• ASM29K Linker
• COFF2HEX (COFF to hexadecimal translator)
• ROMCOFF
• BTOA (binary to ASCII translator)

Figure 1 shows the relationship of the 29K tool chain elements to each other. In the following discussion, familiarity with these tools is assumed. Consult the appropriate reference manuals for more details.
The 29K tool chain can be run under UNIX®, SunOS®, or DOS, but it must be installed properly on the host system before the following example can be performed. The host in the following discussion is assumed to be an IBM® PC-AT® or compatible.

Publication # 11966, Rev. A, Amendment /0, Issue Date: 11/89
© 1989 Advanced Micro Devices, Inc.

[Figure 1. The 29K Tool Chain (11966A-01). A C source file (.C) is processed by the HighC29K compiler and an assembly-language source file (.S) by the ASM29K assembler, each giving a relocatable object module (.O). The ASM29K linker combines the object modules with library files (.LIB) into an absolute object module (.OUT). BTOA (binary to ASCII) converts the absolute module into an ASCII object module (.ASC) for an ADAPT29K or MON29K target; COFF2HEX converts it for a PROM programmer.]

SUGGESTED REFERENCE MATERIALS
Consult the following reference materials for more information on the topics covered in this application note.

• Am29000 Streamlined Instruction Processor User's Manual, order #10620. It contains details regarding the instruction set and register organization of the Am29000.
• Am29000 Streamlined Instruction Processor Data Sheet, order #09075. It embodies a great deal of information about the Am29000, including: distinctive characteristics, general description, simplified system diagram, connection diagram, pin designations and descriptions, functional description, absolute maximum ratings, operational ranges, DC characteristics, switching characteristics and waveforms, and physical dimensions.
• ADAPT29K User's Manual. It provides detailed information on the ADAPT29K, including installation, commands, theory of operation, and target design requirements.
• ASM29K Documentation Set. It provides complete information on the installation and use of the ASM29K assembler, linker, and librarian manager. This includes information on using the ROMCOFF and COFF2HEX utilities.
• HighC29K Documentation Set. It covers how the Am29000 C compiler is used.

These materials can be obtained by writing to:
Advanced Micro Devices, Inc.
901 Thompson Place
P.O. Box 3453
Sunnyvale, CA 94088-3453
or by calling (800) 222-9323.
For questions that cannot be resolved with the current literature, further technical support can be obtained by writing or calling:
29K Support Products Engineering
Mail Stop 561
5900 E. Ben White Blvd.
Austin, TX 78741
(800) 2929-AMD (US)
0-800-89-1131 (UK)
0-031-11-1129 (Japan)

THE EXAMPLE SYSTEM
The example system used for illustration in this document consists of a generic hardware environment and a small software program. The only function of this self-contained standalone system is to test a block of memory. This section describes how the example system works.

SOFTWARE
The software is a small program that initializes its operating environment and then continuously tests memory. It consists of a boot module and a C-language module. A flow chart for the complete application is shown in Figure 2.
The main portions of the program are contained in two source files: smplboot.s and cprog.c. The smplboot.s module is an assembly-language boot program that receives control on power up. The C-language program cprog.c performs the memory test.
The tasks performed by smplboot.s are: (1) establish the execution environment, (2) set up a block of initialized data in instruction/data RAM (using a routine generated by the ROMCOFF utility), (3) call the main program cprog.c, and (4) evaluate the results of the memory test. If the test fails, smplboot.s halts the processor.
The cprog.c program tests a 32K byte block of RAM, using a simple binary write and read test. Then, cprog.c checks the validity of the initialized data section in instruction/data RAM. After each successful completion, a flag is returned to smplboot.s, which increments a counter. If a test fails, cprog.c returns the address of the failing memory location. A memory map of the application is shown in Figure 3.
Three additional files (traps.s, r29k.s, and scregs.def) contain the supporting procedures and declarations. All of the files in the application are listed in Appendices A through E. To actually perform the example, the files must be entered onto the host system.

HARDWARE ENVIRONMENT
The application runs on the Standalone Execution Board (STEB), manufactured by STEP Engineering. Figure 4 shows a block diagram of the STEB, which contains an Am29000, some RAM and ROM, and two serial ports (provided by an 8530 serial communications controller).
A few important features of the STEB should be noted. First, data can be passed between the instruction and data buses via a bi-directional swap buffer. The swap buffer permits code to be downloaded into the instruction RAM area via the ADAPT29K. It also allows data objects in the instruction ROM space to be read as data. Second, the instruction ROM space can contain RAM devices or ROM devices. RAM devices should be installed when working with the ADAPT29K (see Appendix F), so that code can be downloaded into the instruction ROM space.
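The write-and-read test that cprog.c is described as performing can be modeled as follows. This is not the code from the appendices; it is a sketch under assumptions: the two test patterns, the class names, and the use of None as the success flag are all hypothetical choices. As in the description, a failure returns the address of the failing location.

```python
class Memory:
    """Byte-wide memory model; subclasses can inject faults."""

    def __init__(self, size):
        self._bytes = bytearray(size)

    def write(self, address, value):
        self._bytes[address] = value & 0xFF

    def read(self, address):
        return self._bytes[address]


class StuckBitMemory(Memory):
    """Memory in which bit 0 of one location is stuck at zero."""

    def __init__(self, size, bad_address):
        super().__init__(size)
        self._bad_address = bad_address

    def read(self, address):
        value = super().read(address)
        return value & 0xFE if address == self._bad_address else value


def memory_test(memory, base, length, patterns=(0x55, 0xAA)):
    """Return None on success, else the address of the first failing byte."""
    for pattern in patterns:
        for offset in range(length):
            memory.write(base + offset, pattern)
        for offset in range(length):
            if memory.read(base + offset) != pattern:
                return base + offset
    return None


if __name__ == "__main__":
    block = 32 * 1024
    print(memory_test(Memory(0x500 + block), 0x500, block))
    print(hex(memory_test(StuckBitMemory(0x500 + block, 0x1234), 0x500, block)))
```

A boot module in the style of smplboot.s would increment its loop counter on a None result and halt otherwise.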

[Figure 2. Flow Chart of the Example Application (11966A-02). The steps, in order: Initialize Am29000; Transcribe Initialized Data; Call Mem Test; Write Pattern and Check; Check Initialized Data.]

[Figure 3. Memory Map of the Example Application (11966A-03).

Instruction ROM:
  Example Code

Instruction/Data RAM:
  0x0      Am29000 VAT
  0x400    Workspace
  0x420    Initialized Data
  0x500    Tested Space (32K)
  0x8500   Empty
           MStack (2K)
           RStack (2K)]
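The boundaries in the memory map can be sanity-checked with a few lines of arithmetic. The sketch assumes that each region ends where the next one starts, and that the instruction/data RAM begins at 0x100000 (inferred from the linker addresses 0x100400 and 0x100420 for .bss and .data). All names are hypothetical.

```python
KBYTE = 1024
RAM_BASE = 0x100000  # assumption inferred from the linker command file

# (name, start offset, end offset) within instruction/data RAM
REGIONS = [
    ("workspace (.bss)", 0x400, 0x420),
    ("initialized data (.data)", 0x420, 0x500),
    ("tested space", 0x500, 0x8500),
]


def find_overlaps(regions):
    """Return pairs of region names whose address ranges overlap."""
    ordered = sorted(regions, key=lambda region: region[1])
    clashes = []
    for (name_a, _, end_a), (name_b, start_b, _) in zip(ordered, ordered[1:]):
        if start_b < end_a:
            clashes.append((name_a, name_b))
    return clashes


def absolute(offset):
    """Convert an offset within instruction/data RAM to an absolute address."""
    return RAM_BASE + offset


if __name__ == "__main__":
    for name, start, end in REGIONS:
        print(f"{name:26} {absolute(start):#08x}-{absolute(end) - 1:#08x} {end - start:6} bytes")
    print("overlaps:", find_overlaps(REGIONS))
```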

[Figure 4. Block Diagram of the STEB (11014A-04). The diagram shows the Am29000 processor, the system address bus, the buffered address bus, the data bus, and the instruction/data RAM space (Banks #0 through #3).]

PREPARING AN EXECUTABLE
OBJECT MODULE
Preparing the executable object module involves several steps. Typically, the steps are repeated frequently because errors must be corrected and revisions must be made. The process can be automated by placing the commands in a DOS batch file. Listing 1 shows the batch file sc.bat, which is used in the example application. Following the listing, each step is explained.

Listing 1. The Batch File sc.bat

@echo off
echo *********************************************************
echo "Compiling cprog.c and Assembling the .s files"
echo *********************************************************
hc29 -c -w cprog.c > cprog.e
hc29 -S -Hasm cprog.c > cprog.e
as29 -l > smplboot.lst -o smplboot.o smplboot.s
as29 -l > traps.lst -o traps.o traps.s
as29 -l > r29k.lst -o r29k.o r29k.s
echo *********************************************************
echo "Linking object files with libraries and generating"
echo "executable object module for ROMCOFF"
echo *********************************************************
ld29 -c step1.cmd -o step1.out -f tx -m > outlink.map
echo *********************************************************
echo "Using ROMCOFF"
echo *********************************************************
c:\29k\bin\romcoff -tlb step1.out rom.o
echo *********************************************************
echo "Linking object files with libraries and generating"
echo "final executable object module"
echo *********************************************************
as29 -l > smplboot.lst -DRAMINIT -o smplboot.o smplboot.s
ld29 -c step2.cmd -o step2.out -f tx -m > step2.map
echo *********************************************************
echo "Converting executable object code to downloadable format"
echo *********************************************************
c:\29k\bin\btoa step2s.out sc.a
echo *********************************************************
echo "Converting executable into PROM-programmable format"
echo *********************************************************
coff2hex -c t -m -p 27512 step2e.out > step2.e
echo on

COMPILING CPROG.C AND ASSEMBLING
THE .S FILES
The first group of operations in the batch file obtains relocatable object modules from the source files. The C-language source file cprog.c is compiled by invoking the HighC29K compiler with the command line:
hc29 -c -w cprog.c

HighC29K replaces the symbolic instructions in the source file with equivalent machine-code routines. Then a relocatable object file (cprog.o) is produced, as shown in Figure 5.
The parameter -w suppresses warning messages, limiting the output to errors only; the -c parameter instructs the assembler to produce the object file. Note that a second compilation is performed with the -Hasm flag on. This produces an assembly listing (.s file) only.
Next, the ASM29K assembler is used to assemble the modules smplboot.s, traps.s, and r29k.s. This involves replacing assembly-language symbolic instructions in the source file with the corresponding machine instruction code. To assemble smplboot.s and obtain a relocatable object file, the following command line can be entered (all on one line):
as29 -l > smplboot.lst -o smplboot.o smplboot.s

A relocatable object file (smplboot.o) and a listing file (smplboot.lst) are produced from the assembly. All assembly-time errors are directed to the standard output. The operation is shown in Figure 6. The same operation is done on traps.s and r29k.s.
Linking
Once the relocatable object files have been made, they must be linked (i.e., assigned physical addresses). This is done using the ASM29K linker, which allows one or more object files from either the assembler or the compiler to be linked together into a single executable object file.
The object modules are linked by entering the command line (all on one line):
ld29 -c step1.cmd -o step1.out -f tx -m > outlink.map

Using the command file step1.cmd (see Listing 2), the files smplboot.o, r29k.o, and traps.o are linked with cprog.o into a single, non-relocatable object file called step1.out. A reference to where each module was placed is put in the map file outlink.map. Any error messages are sent to the standard output. The linking process produces a map file that lists the local symbol table, external symbols, and the cross-reference. This type of output is a good reference to the entire application program.

[Figure 5. Compiling cprog.c (11966A-05). The HighC29K compiler processes cprog.c.]

[Figure 6. Assembling smplboot.s (11966A-06). The ASM29K assembler processes smplboot.s and produces smplboot.lst.]

Listing 2. The Linker Command File step1.cmd

ORDER .text=0x0
ORDER .bss=0x100400
ORDER .data=0x100420
PUBLIC _MSTACK=0x1f7fc
PUBLIC _RSTACK=0x1fffc
load smplboot.o,r29k.o,traps.o
load cprog.o
load c:\29k\lib\libmw.lib

TRANSFERRING CODE FROM ROM TO RAM:
ROMCOFF
The smplboot.s file contains a section of initialized data that must be loaded into instruction/data RAM and tested by the application program. This could be accomplished by writing many lines of const, consth, and storem instructions into the smplboot.s file. Another method is to use the ROMCOFF utility.
The ROMCOFF utility transforms user-specified sections of an Am29000 program into a stream of instructions that will perform the transcription. From a fully linked, executable Am29000 program, the ROMCOFF utility generates a COFF output file containing initializers that will establish the image of an executable COFF input file in instruction/data RAM. The output file contains one section, RI_text, within which is one routine, RAMInit. The output file can then be linked with other relocatable modules that will remain in instruction ROM, to produce a single non-relocatable module for programming PROMs.
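The hand-written alternative mentioned above, loading each word with a const/consth pair and then storing it, can be sketched as a generator of abstract steps plus a replay function that checks the resulting RAM image. This is not ROMCOFF's output format, which is not shown in this note; the step tuples, the single-word store, and the function names are hypothetical. The sketch relies only on const loading the low 16 bits of a register and consth supplying the high 16 bits.

```python
def initializer_steps(base, words):
    """Build (operation, value) steps that rebuild 32-bit words in RAM."""
    steps = []
    for index, word in enumerate(words):
        steps.append(("const", word & 0xFFFF))           # low half, upper half cleared
        steps.append(("consth", (word >> 16) & 0xFFFF))  # high half, low half kept
        steps.append(("store", base + 4 * index))        # write the register to RAM
    return steps


def replay(steps):
    """Execute the steps on a one-register model and return the RAM image."""
    register = 0
    ram = {}
    for operation, value in steps:
        if operation == "const":
            register = value
        elif operation == "consth":
            register = (register & 0xFFFF) | (value << 16)
        elif operation == "store":
            ram[value] = register
        else:
            raise ValueError(f"unknown step {operation!r}")
    return ram


if __name__ == "__main__":
    data_section = [0x12345678, 0x0000BEEF, 0xFFFF0000]  # arbitrary sample data
    image = replay(initializer_steps(0x100420, data_section))
    for address in sorted(image):
        print(f"{address:#08x}: {image[address]:08x}")
```

Three steps per word is why writing the initializers by hand quickly becomes tedious, and why generating them from the linked output is attractive.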
ROMCOFF can be used to transcribe entire sections of code into instruction/data RAM. Then, once the application's boot program has finished preparing the environment, it transfers control to the transcribed program in instruction/data RAM. This allows the code to be executed out of high-speed RAM devices, which are frequently more cost effective than high-speed PROMs. See Figure 7.
In the example program, only a section of initialized data in smplboot.s is transferred to RAM. ROMCOFF creates a relocatable object module that transcribes the data sections to RAM when the following command line is entered:
romcoff -tlb step1.out rom.o
The linked output file step1.out is made into the file rom.o. Only the data section is output, because of the ROMCOFF options -tlb, which specify that the text, literal, and bss sections should be ignored.
The output from ROMCOFF (rom.o) contains only code to transcribe data sections. It must be re-linked with the object files to produce a final absolute object module. First, the code in smplboot.s, which contains a call to the RI_text section, must be assembled to include the conditional assembly statements.
To assemble smplboot.s so that it will contain the call, enter (all on one line):
as29 -l > smplboot.lst -DRAMINIT -o smplboot.o smplboot.s
The -D option defines RAMInit so that conditional assembly statements in the source file will be assembled. The statements include a definition of RAMInit, and a call to it. Then, all of the object modules can be linked with rom.o as follows (all on one line):
ld29 -c step2.cmd -o step2.out -f tx -m > step2.map

A second linker command file is used because rom.o must be identified to the linker (see Listing 3).

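[Figure 7. Using ROMCOFF (11966A-07). The instruction ROM holds the boot code and the main program. The boot code initializes the environment, transcribes the code to instruction/data RAM, and calls main; the application then executes from the copy of main in instruction/data RAM.]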

Listing 3. The Linker Command File step2.cmd

ORDER .text=0x0,RI_text
ORDER .bss=0x100400
ORDER .data=0x100420
PUBLIC _MSTACK=0x1f7fc
PUBLIC _RSTACK=0x1fffc
load smplboot.o
load rom.o
load r29k.o,traps.o
load cprog.o
load c:\29k\lib\libmw.lib

DOWNLOADING TO THE ADAPT29K
Once the final executable object module is created, the
example program can be downloaded to the target
system and tested using the ADAPT29K.
USING BTOA
The BTOA utility creates an ASCII COFF output from the input file. Although the ADAPT29K can handle Tektronix® or Motorola® hex files, using the BTOA utility to make the ASCII file has several advantages.

Most importantly, BTOA encodes the input file into (7-bit) ASCII using a compact base-85 scheme that limits file expansion to only 25 percent, as opposed to 150 percent for standard hex formats. Hence, the resulting output file is smaller, and consequently quicker to transfer. Also, BTOA maintains the ASCII COFF format, rather than converting it to absolute addresses.
As shown in the sc.bat batch file, BTOA produces the output file sc.a and is invoked by:
btoa step2s.out sc.a
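The 25 percent figure follows from packing every four input bytes into five printable characters. The sketch below demonstrates that ratio with the Ascii85 encoder in Python's standard library; treating this encoder as a stand-in for the BTOA utility's own file format is an assumption, and the payload is arbitrary sample data. Plain hex digits are shown for comparison; real hex record formats add addresses and checksums on top of that.

```python
import base64


def expansion_percent(original_length, encoded_length):
    """Growth of the encoded form relative to the original, in percent."""
    return 100.0 * (encoded_length - original_length) / original_length


# 100 bytes of sample data with no all-zero groups (Ascii85 abbreviates those).
payload = bytes(range(1, 101))

ascii85_text = base64.a85encode(payload)  # 4 bytes -> 5 characters
hex_text = payload.hex()                  # 1 byte  -> 2 characters

if __name__ == "__main__":
    print("base-85 expansion:", expansion_percent(len(payload), len(ascii85_text)), "percent")
    print("hex-digit expansion:", expansion_percent(len(payload), len(hex_text)), "percent")
```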

00000000R   c6400200   MFSR    GR64,CPS
00000004R   03fb41ff   CONST   GR65,0xFBFF
00000008R   90404041   AND     GR64,GR64,GR65
0000000cR   ce000240   MTSR    CPS,GR64
00000010R   03004000   CONST   GR64,0x0
00000014R   ce000040   MTSR    VAB,GR64
00000018R   0300403f   CONST   GR64,0x3F
0000001cR   ce000740   MTSR    RBP,GR64

Figure 8. List Memory Display (11966A-08)
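The relationship between the hexadecimal words and the mnemonics in Figure 8 can be illustrated with a small decoder. The field layout used here (an 8-bit opcode followed by three 8-bit fields) and the opcode and special-register numbers are inferred from the eight lines of the figure only; the decoder covers just those four instructions and is a sketch, not a disassembler for the full instruction set.

```python
OPCODES = {0x03: "CONST", 0x90: "AND", 0xC6: "MFSR", 0xCE: "MTSR"}
SPECIAL_REGISTERS = {0: "VAB", 2: "CPS", 7: "RBP"}


def decode(word):
    """Decode one 32-bit word into the mnemonic form used in Figure 8."""
    opcode = (word >> 24) & 0xFF
    field_c = (word >> 16) & 0xFF  # destination register or high immediate byte
    field_a = (word >> 8) & 0xFF   # register or special-register number
    field_b = word & 0xFF          # register or low immediate byte
    name = OPCODES[opcode]
    if name == "CONST":
        return f"CONST GR{field_a},0x{(field_c << 8) | field_b:X}"
    if name == "AND":
        return f"AND GR{field_c},GR{field_a},GR{field_b}"
    if name == "MFSR":
        return f"MFSR GR{field_c},{SPECIAL_REGISTERS[field_a]}"
    return f"MTSR {SPECIAL_REGISTERS[field_a]},GR{field_b}"


FIGURE_8_WORDS = [
    0xC6400200, 0x03FB41FF, 0x90404041, 0xCE000240,
    0x03004000, 0xCE000040, 0x0300403F, 0xCE000740,
]

if __name__ == "__main__":
    for index, word in enumerate(FIGURE_8_WORDS):
        print(f"{4 * index:08x}R  {word:08x}  {decode(word)}")
```

Read this way, the first four instructions clear one bit of the CPS register, and the next four load VAB with 0 and RBP with 0x3F.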

Listing 4. Results of "End Execution" Command List

> d 400,420
00000400   00000000 00000000 00000000 00000000
00000410   00000000 00000000 00000000 00000000
00000420   00000000

TESTING THE EXAMPLE PROGRAM WITH THE
ADAPT29K
Once the object module has been translated using the BTOA utility, it can be downloaded to the target using ADAPT29K. For use with ADAPT29K, the STEB should be configured as indicated in Appendix F.

To download the file, communication must be established with the ADAPT29K. On a PC, this is done by invoking the terminal emulator program (for example, CROSSTALK®), establishing communication with the ADAPT29K, and entering (note that # is the ADAPT29K monitor prompt):

# ya 0r

The Y (load a file to memory) command prepares the ADAPT29K to receive an ASCII-encoded file from the DCE port. Then, the emulator must be instructed to transmit the file (for example, se sc.a when using CROSSTALK). After the code has been downloaded, and the next prompt has appeared, the contents of the instruction ROM can be verified by entering:

# l

The ADAPT29K should respond to the L (list memory) command with the display shown in Figure 8. The locations starting at 0x400 in instruction/data RAM contain the status of the test and number of successful loops, respectively. Which location actually contains which variable is a decision made by the linker, and must be determined by inspection.

To check these locations automatically when the execution stops, set up an "end execution" command list by entering:

# e
d 400,420;

The list is executed on entry. It should appear as shown in Listing 4.

GR080  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
GR088  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
GR096  00104a18 00000000 00000000 00000000 00000000 00000000 00000000 00000000
GR104  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
GR112  00000000 00000000 00000000 00000000 80000020 000095d9 00100400 00000095
GR120  ffffffff 80000000 00000000 00000000 00000000 0001f7fe 00000fff 06050101
LR000  00000928 0001fffe 00100414 00108414 000000f0 0001fffe 00000000 00000000
LR008  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
LR016  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
LR024  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
LR032  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
LR040  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
LR048  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
LR056  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

GR001  0001ffe4 (R249)

IPC

IPA

IPB

00

00

00

(GROOO)

(GROOO)

Q

00000000

ALU: DF V N Z C
0

o

0 0

BP

FC

CR

0

00

00

(GROOO)

Figure 9. Key Registers Display


0

11966A-09

Preparing PROMs Using the Am29000 Development Tools
Prior to starting the test, it is a good practice to reset the
system by using the P reset command:

Enter:
# e

# preset

The display will be similar to that shown in Figure 11.
The precise display in any given situation, particularly
the loop count stored in location 40C, is dependent
on the exact time elapsed between the start of execution
and the entry of the E command. At another time, it may
appear as shown in Figure 12.

To verify the condition of the system before execution,
the X (Display Key Registers) command is entered as:
# x

This will result in a display as shown in Figure 9. The
special-purpose protected registers can be checked
using the XP (display protected registers) command.
The display appears as shown in Figure 10.

The state of the processor can be checked using the C
(check execution state) command:
# c

To execute the program starting from address 0 in
instruction ROM, the G (go-start execution) command
is used:

When the processor is running, ADAPT29K displays:
Am29000 is Running.

# g 0r

During execution, the status of the program can be
checked by invoking the previously defined "end execution" command list.

# xp

CPS:

ops:

CA

IP

TE

TP

TU

FZ

LK

RE

WM

PD

PI

SM

IM

DI

DA

0
0

0
0

0
0

0
0

0
0

1
1

0
0

1
1

0
0

1

1
1

1
1

0
0

1
1

1

LS ML ST LA TF

TR

NN CV

79

1

CFG: PRL
01

VAB
0000
CHA

CHD

00104a14

00000000

VF

RV

BO

1

0

0

CHC:

CE

00

1

CD

CP
1

CNTL
0

1

1

CR
00

1

0

0

0

0

0

RBP: BF BE BD BC BB BA B9 B8 B7 B6 B5 B4 B3 B2 B1 BO
0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1
TCV
000000

TR: OV IN IE
1

1

0

TRV
000000

PCO

PC1

PC2

00000a34

00000a30

00000a2c

MMU: PS
0

PID
00

LRU
0
11966A-10

#

Figure 10. Protected Registers Display


29K Family Application Notes

> d 400,420
00000400  009595d9 00000000 00108414 0000002d
00000410  00100414 00000000 00000000 00000000
00000420  00000000

11966A-11

Figure 11. Check Status Display

> d 400,420
00000400  009595d9 00000000 00108414 000000e1
00000410  00100414 ffffffff ffffffff ffffffff
00000420  ffffffff
#
11966A-12

Figure 12. Second Check Status Display

PREPARING PROMs

PROGRAMMING THE PROMS

Once the absolute object file has been prepared, it must
be transferred to the media from which the code will be
executed. Often, this medium is a PROM set. Most
PROM programmers require their input to be in an
ASCII hex format, so a translation normally is performed
before sending the program to the PROM programmer.

A PROM programmer is used to "burn" the binary object
file into PROM devices. Many types of PROM programmers are available. The Data I/O UniSite™ PROM
programmer is used in the following example.

MAKING HEX FILES: COFF2HEX
The COFF2HEX utility produces a 32-bit ASCII hex file
in either the Motorola S3 or Tektronix Extended format.
Both of these formats are accepted by most PROM
programmers, as well as the ADAPT29K. Note that the
ADAPT29K requires the file to be one module, rather
than being divided into separate modules by part size
(see the options of the COFF2HEX utility).
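To make the Motorola S3 format concrete, the following C sketch formats one S3 record: a type field, a byte count, a 32-bit address, the data bytes, and a ones'-complement checksum. It is an illustration of the record layout only, not the COFF2HEX implementation; the function name s3_record and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format one Motorola S3 record (32-bit address).
 * 'out' must hold at least 2*len + 15 characters; len must be <= 250.
 * Returns the number of characters written, excluding the NUL. */
int s3_record(char *out, uint32_t addr, const uint8_t *data, size_t len)
{
    /* The byte count covers 4 address bytes, the data, and 1 checksum byte. */
    unsigned count = (unsigned)len + 5;
    unsigned sum = count;
    int n = sprintf(out, "S3%02X%08lX", count, (unsigned long)addr);

    for (int shift = 24; shift >= 0; shift -= 8)
        sum += (addr >> shift) & 0xFFu;          /* add each address byte */

    for (size_t i = 0; i < len; i++) {
        n += sprintf(out + n, "%02X", (unsigned)data[i]);
        sum += data[i];                          /* add each data byte */
    }

    /* Checksum: ones' complement of the low byte of the running sum. */
    n += sprintf(out + n, "%02X", (~sum) & 0xFFu);
    return n;
}
```

A PROM programmer or monitor receiving such a record can recompute the sum over the count, address, and data bytes and compare it with the final field to detect transmission errors.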
In sc.bat, COFF2HEX is invoked by entering (on a single line):
coff2hex -c t -m -p 27512 step2e_out > sccoff.e

This produces 8-bit wide modules that will fit into a
27512 EPROM (-p option). The format is Motorola S3
(-m option), and will include only the text sections (-c t
option).
The resulting file(s) will be named a.a00, a.a08, a.a16,
and a.a24, indicating which bytes of the word they
represent. If the file is larger than the capacity of the part
size specified, additional sets of four will be generated
with filenames a.b00, a.b08, a.b16, a.b24, and so on,
with further sets having a corresponding nomenclature.
Once generated, the files can then be transmitted to a
PROM programmer.
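The naming scheme described above can be sketched in C as follows. This is only an illustration of the pattern given in the text (set letter followed by the starting bit position of the byte lane); the function name module_name is hypothetical, and the assumption that sets continue alphabetically beyond "b" is inferred from "and so on" rather than confirmed.

```c
#include <stdio.h>

/* Hypothetical helper: build the module file name for a given PROM set
 * (0 = first set, 1 = second set, ...) and byte lane (0..3) of the
 * 32-bit word. 'out' must hold at least 6 characters. */
void module_name(char *out, int set, int lane)
{
    /* Set 0 is "a", set 1 is "b"; the lane is named by its low bit: 00, 08, 16, 24. */
    sprintf(out, "a.%c%02d", 'a' + set, lane * 8);
}
```

For example, set 0 with lane 0 gives the file holding the lowest 8 bits of each word, and set 1 with lane 1 gives the second byte lane of the overflow set.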


Assuming an object module had been created as
described in the first part of this document (and a set
of Motorola S3 modules were obtained using
COFF2HEX), the following procedure could be used to
create a PROM set.
1. Turn on the PROM programmer. Make sure the
algorithm disk is properly inserted in the lower front
slot.
2. Once the power-up sequence and diagnostics have
completed, a screen should appear on the attached
terminal. If there is no terminal, or the screen does
not appear, refer to the set-up section of the user's
manual for the PROM programmer.
3. Make sure a host system is attached. In this example, the use of a PC is assumed. At the PC, set the
COM1 serial port of the PC to 9600 baud, no parity,
8-bit bytes, and one stop-bit by entering: mode
com1:96,n,8,1. On the PROM programmer, select
"Configure System," followed by "Edit," and then
"Serial I/O." Make sure the remote port parameters
are set properly.
4. The program will be placed in AMD 27512 PROMs.
To inform the PROM programmer, choose "Select
Device," "3" (AMD), and "25" (27512).

5. It is a good idea to clear the PROM programmer's
memory before downloading data. This ensures
that the PROMs do not become programmed with
leftover data from a previous operation, which may
cause troublesome errors. To clear the memory,
select "Fill Memory." Enter 00 to 7FFFF as the
address range, and FF as the data.
6. The PROM programmer must know the format of
the incoming data. Select "Transfer Data," followed
by "Format Select." Enter "95" for Motorola S3
Record.
7. Select "Load Device" on the programmer. On the
PC, enter:
copy a.a00 com1:
This causes the lowest 8 bits of the application to be
transmitted to the PROM programmer, which will
load the data into its memory.
8. Properly insert a PROM into the ZIF socket on the
PROM programmer and engage the locking mechanism. Select the "Program Device" option on the PROM
programmer.
9. Once the PROM has been burned, remove it and
label it with the program name, range of bits,
version, and date. Then, repeat steps 7-9 using the
files a.a08 through a.a24. If a larger program is
used, it may be necessary to repeat steps 7-9 using
modules a.b00, a.b08, a.b16, a.b24, and so on.


APPENDIX A: smplboot.s
.extern
.extern
.extern
.extern
.extern
.equ
.equ
.equ
.include
.data
.word
.comm
.text
.ifdef
.extern
.endif
.global

r29k_init
_main
V_SPILL, V_FILL
spill, fill
_RSTACK,_MSTACK
ROM_TH,0x2
RSC_SIZE,0x200
TBM_SIZE,0x20000
"scregs.def"

mfsr
const
and
mtsr
const
mtsr
const

tmp0,CPS
tmp1,0xFBFF
tmp0,tmp0,tmp1
CPS,tmp0
tmp0,0
VAB,tmp0
tmp0,0x11

mtsr
const
const
consth
sub
srl
sub

CFG,tmp0
tmp2,0
tmp0,0
tmp1,TBM_SIZE
tmp1,tmp1,tmp0
tmp1,tmp1,2
tmp1,tmp1,2

store
jmpfdec
add
const
const
consth
const

tmp1,mem_00
tmp0,tmp0,4
tmp0,256-2
tmp1,illtrap+0x2
tmp1,illtrap
tmp2,0

assembly module
C module
Linker definable V_SPILL and V_FILL vector numbers
spill and fill procedure
Link time definable stack pointer assignments
Spill and fill trap interface do truly reside in ROM space
Default reg_stack_cache usage=512
32K*4=128kb of Inst/RAM size

(20)170
mtp_count,4
RAMINIT
RAMInit

if RAMINIT Flag on
make RAMInit available

start

start:

vtd_init:
store
jmpfdec
add
const
consth
const
sll
store
const
consth
const
sll
store
const
consth
const
sub
sub
const
consth
add
const
consth
calli
nop
.ifdef

0,0,tmp1,tmp2
tmp0,vtd_init
tmp2,tmp2,4
tmp0,spilltrap+ROM_TH
tmp0,spilltrap
tmp1,V_SPILL
tmp1,tmp1,2
0,0,tmp0,tmp1
tmp0,filltrap+ROM_TH
tmp0,filltrap
tmp1,V_FILL
tmp1,tmp1,2
0,0,tmp0,tmp1
rfb,_RSTACK
rfb,_RSTACK
tmp0,RSC_SIZE
rab,rfb,tmp0
rsp,rfb,0x8
msp,_MSTACK
msp,_MSTACK
lr1,rfb,0
tmp0,r29k_init
tmp0,r29k_init
lr0,tmp0
RAMINIT

Read CPS
Clear FZ bit
Update CPS
Set VAB pointing to LOW memory
Set VF=1, i.e., Vector table scheme and CD=1,
i.e., Branch Target Cache is disabled
Write Data pattern = 0x00000000
Low memory address
High memory address
Get address difference
Get word count from diff value
adjustments for jmpfdec instr
fill TB_memory with all zeros
0,0,tmp2,tmp0

Total of 256 vector table entries
ROM based illegal trap handlers
address, by default
fill vector table with default
trap handlers

get spill trap entry point
get spill trap vector number
generate vect number location
store address of trap handler into vector table
get fill trap entry point
get fill trap vector number
generate vect number location
store address of trap handler into vector table
Set RFB
0x200=512 bytes ie 128*4
Set RAB=RFB-512
Set RSP=RFB-8
Set MSP
Set lr1 to RFB

call procedure to init 29K registers
if RAMINIT on,

const
consth
calli
.else
nop
nop
nop
.endif
nop
const
consth
mtsrim
mtsrim
mtsr
add
mtsr
xor
iret

tmp0,RAMInit
tmp0,RAMInit
gr96,tmp0

set up RAMInit call
and do the call
make sure code takes same
number of locations
regardless of RAMINIT condition

tmp0,exec
tmp0,exec
OPS,0x172
CPS,0x573
PC1,tmp0
tmp0,tmp0,4
PC0,tmp0
tmp0,tmp0,tmp0

in case we did calli
get target application task address
RE=1, PI=1, PD=1, SM=1 and DI=1
Set Target application Task address

Any additional regs clean up
Give control to application via IRET

exec:
const
consth
calli
nop
sll
sll
sll
const
consth
load
cpeq
jmpt
nop
halt

lr0,_main
lr0,_main
lr0,lr0
gr97,gr64,0
gr98,gr65,0
gr99,gr66,0
gr64,mtp_count
gr64,mtp_count
0,0,gr65,gr64
gr67,gr96,0
gr67,again

get C-callable routine entry point
make the call
Save user global registers gr64
through gr66
get address of memory test pass
count recorder
get current count so far
check for memory test pass?
true then run test again
false halt further memory testing

again:
add
store
sll
sll
sll
jmp
nop
spilltrap:
mfsr
const
consth
mtsr
add
mtsr
iret
filltrap:
mfsr
const
consth
mtsr
add
mtsr
iret
illtrap:
halt
.end

gr65,gr65,1
0,0,gr65,gr64
gr64,gr97,0
gr65,gr98,0
gr66,gr99,0
exec

tpc,PC1
tmp0,spill
tmp0,spill
PC1,tmp0
tav,tmp0,tmp0+4
PC0,tmp0

tpc,PC1
tmp0,fill
tmp0,fill
PC1,tmp0
tav,tmp0,tmp0+4
PC0,tmp0

bump mtp_count by 1
update in memory also
Restore user global registers gr64
through gr66
run the memory test once again

save return address in tpc
get spill procedure entry point
fill Am29000 pipeline target address
fill Am29000 pipeline with target address+4

save return address in tpc
get fill procedure entry point
fill Am29000 pipeline target address
fill Am29000 pipeline with target address+4


APPENDIX B: cprog.c

#define MT_PASSED       0
#define SOLID_ONES      -1
#define SOLID_ZEROS     0
#define MT_BLK_SIZE     32768
#define WORD_SIZE       4
#define INIT_DATA       170
#define MEM_BLOCK       1056
#define INIT_DATA_BASE  1280
#define INIT_DATA_SIZE  15

int *mt_sts;
int lm_addr,hm_addr;
int initdata;
int *mem_test();

main()
{
    lm_addr = INIT_DATA_BASE;
    hm_addr = INIT_DATA_BASE+MT_BLK_SIZE/WORD_SIZE;
    initdata = MEM_BLOCK;
    mt_sts = mem_test(lm_addr,hm_addr,initdata);
}

int *mem_test(low,high,initd)
int *low,*high,*initd;
{
    int *addr;
    /* Solid Ones test */
    for(addr=low; addr<=high; addr++)
        *addr = SOLID_ONES;
    for(addr=low; addr<=high; addr++)
        if(*addr != SOLID_ONES)
            return(addr);
    /* Solid Zeros test */
    for(addr=low; addr<=high; addr++)
        *addr = SOLID_ZEROS;
    for(addr=low; addr<=high; addr++)
        if(*addr != SOLID_ZEROS)
            return(addr);
for (addr=initd;addr'
Address Bus

Figure 7. Example Am29000 System

11025A-07

Programming Standalone Am29000 Systems

tions off (PD,PI = 1), turns on supervisor mode (SM = 1),
and disables all interrupts and traps (DI,DA= 1).
Step 2-Establishing a Simple Register Stack
Frame

BOOT.S calls several procedures, so it establishes a
Register Stack Frame. However, control will not return
to BOOT.S after calling _main. Therefore, it only needs
to use a limited stack frame. The frame is set up with:

        const   rfb, 512            ;set up temp reg frame
        const   rab, 0
        sub     rsp, rfb, 16        ;enough for p0 and p1
        add     lr1, rfb, 0

Step 3-Initializing I/O Devices

An I/O device is initialized early, so that it can be used to
transmit error messages. The 8530 serial communications controller is initialized using the routine shown in
Listing 1.

Listing 1. Initializing I/O Devices
SerInit:
        .reg    SI_CtAd, %%(TEMP_REG + 0)   ;control port address
        .reg    SI_CtVl, %%(TEMP_REG + 1)   ;control port value
        const   SI_CtAd, SCCCntlAd
        consth  SI_CtAd, SCCCntlAd
        const   SI_CtVl, 9
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0xc0               ;reset the port
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 4
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x44               ;x16, 1 stop, no parity
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 3
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0xc0               ;8 bits receive
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 5
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x60               ;8 bits xmit
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 9
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x0                ;Int. disabled
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 10
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x0                ;NRZ
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 11
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x56               ;Tx & Rx BRG out
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 12
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x6                ;9600 baud
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 13
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x0                ;9600 baud
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 14
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x0                ;BRG in RTxC
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 14
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0x1                ;BRG on
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 3
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0xc1               ;Rx enable
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 5
        store   0, 0, SI_CtVl, SI_CtAd
        const   SI_CtVl, 0xea               ;Tx enable
        store   0, 0, SI_CtVl, SI_CtAd
        EPILOGUE

Step 4-Testing RAM
The RAM is tested before code is transferred to it.
BOOT.S calls a single test, an address pattern test.
Other tests are included in the source listing shown in
Appendix A. The test used by BOOT.S is shown in
Listing 2.
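For readers more comfortable with C, a two-pass address pattern test of the kind used here can be sketched as follows: the first pass writes each word's own address into it and verifies it, and the second pass does the same with the complemented address. This is a minimal sketch, not a translation of the assembly routine; the function name ram_addr_test is hypothetical, and on a host with pointers wider than 32 bits the address is simply truncated.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: two-pass address pattern test over 'words' 32-bit words.
 * Returns NULL if both passes succeed, or the address of the first bad word. */
volatile uint32_t *ram_addr_test(volatile uint32_t *base, size_t words)
{
    for (int pass = 0; pass < 2; pass++) {
        /* Fill: pass 0 stores the address, pass 1 stores its complement. */
        for (size_t i = 0; i < words; i++) {
            uint32_t pat = (uint32_t)(uintptr_t)&base[i];
            base[i] = pass ? ~pat : pat;
        }
        /* Check: read every word back and compare with the expected pattern. */
        for (size_t i = 0; i < words; i++) {
            uint32_t pat = (uint32_t)(uintptr_t)&base[i];
            if (base[i] != (pass ? ~pat : pat))
                return &base[i];
        }
    }
    return NULL;
}
```

Using the address as data catches address-line faults (two locations aliasing to one cell), and the complemented pass exercises every data bit in the opposite state.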

Step 5-Setting the Vector Table Entries to the
Invalid Trap Handler

START.S will set up the vector table, but BOOT.S
guards against abnormal ends by making all of the
vector table entries point to an invalid trap handler in
ROM. This is done with the following routine, which is
called from the main loop, as shown in Listing 3.
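The listings in this note load their loop counters with "count - 2" before a jmpfdec loop (for example, 256 - 2 for the 256 vector table entries). The following C model suggests why, assuming the usual jmpfdec behavior: the jump is taken when the counter's sign bit is clear, and the counter is decremented whether or not the jump is taken. The function name jmpfdec_iterations is hypothetical and the model ignores everything except the counter.

```c
#include <stdint.h>

/* Hypothetical model: count how many times the body of a loop closed by
 * "jmpfdec counter, loop" runs for a given initial counter value. */
unsigned jmpfdec_iterations(int32_t initial)
{
    int32_t counter = initial;
    unsigned body_runs = 0;
    int taken;

    do {
        body_runs++;              /* loop body, including the delay-slot instruction */
        taken = (counter >= 0);   /* jump is taken while the sign bit is clear */
        counter--;                /* the decrement happens either way */
    } while (taken);

    return body_runs;
}
```

With an initial value of 256 - 2, the test sees 254, 253, ..., 0, -1, so the body runs exactly 256 times.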

Listing 2. Testing RAM
.sbttl
FUNCTION

"RAM Address Pattern Test"
RAMAddr, 2, 0, 3

This routine will run a two-pass test on RAM. It will be controlled by input values
specifying the base address and the count of locations to be tested. In the first
pass, the data will be set equal to the address. In the second pass, the data
will be set equal to the complement of the address.


In:

(see below)

Out:

(see below)

        .reg    RA_StrtAdd, %%(IN_PRM + 0)      ;starting address
        .reg    RA_WrdCnt,  %%(IN_PRM + 1)      ;count of words
        .reg    RA_TmpCnt,  %%(TEMP_REG + 0)    ;total test word count
        .reg    RA_StrtPat, %%(TEMP_REG + 1)    ;starting pattern
        .reg    RA_PtrnInc, %%(TEMP_REG + 2)    ;ptrn increment value
        .reg    RA_NxtAdd,  %%(OUT_PRM + 0)     ;error address
        .reg    RA_WrtPat,  %%(OUT_PRM + 1)     ;pattern written
        .reg    RA_RedPat,  %%(OUT_PRM + 2)     ;pattern read

Listing 2. Testing RAM (continued)
.reg
add
const

RA_Fail,
%%(RET_VAL + 0)
RA_StrtPat, RA_StrtAdd, 0
RA_Ptrnlnc, 4

;fill memory with pattern
add
RA_NxtAdd, RA_StrtAdd, 0
RA_TmpCnt, RA_WrdCnt, 2
sub
RA_WrtPat, RA_StrtPat, 0
add

0, 0, RA_WrtPat, RA_NxtAdd
store
add
RA WrtPat, RA_WrtPat, RA_Ptrnlnc
jmpfdec
RA_TmpCnt, RA_2
RA_NxtAdd, RA_NxtAdd,
add
;check memory for pattern
add
RA_NxtAdd, RA_StrtAdd, 0
RA_TmpCnt, RA_WrdCnt, 2
sub
RA_WrtPat, RA_StrtPat, 0
add

;TRUE for fail
;start with address

;get start address
;for jmpfdec
;set the pattern

;next test mem addr
;get start address
;for jmpfdec
;set the pattern

RA_3:
CD, DATA_CTL, RA_RedPat, RA_NxtAdd
load
cpneq
RA_Fail, RA_RedPat, RA_WrtPat
jmpt
RA_Fail, RA_ERR
nop
add
RA_WrtPat, RA_WrtPat, RA_Ptrnlnc
jmpfdec
RA_TmpCnt, RA_3
add
RA_NxtAdd, RA_NxtAdd,
; invert ptrn for next pass
nor
RA_StrtPat, RA_StrtPat, 0
cpneq
RA_Fail, RA_StrtPat, RA_StrtAdd
jmpt
RA_Fail, RA_l
subr
RA_Ptrnlnc, RA_Ptrnlnc, 0
jmp
RA_EXIT
nop

;err if neq

;next test mem address
;invert initial

;negate inc value

RA_ERR:
call
nop
const
consth

lrO, RAMErr
RA_Fail, TRUE
RA_Fail, TRUE

;set after call

RA_EXIT:
EPILOGUE

Listing 3. Setting Vector Table Entries
        .sbttl  "Vector Initialization"
        LEAF    VectInit, 0

This routine initializes the vector table and vab.
All vectors
are set to point to the invalid trap handler in ROM.
.reg
.reg
.reg
mtsrim
mfsr
const
consth
const

VI_Vect, %%(TEMP_REG + 0)
VI_VectSt, %%(TEMP_REG + 1)
VI_VectCnt, %%(TEMP_REG + 2)
vab, 0
VI_VectSt, vab
VI_Vect, (InvalidTrapHandler | 2)
VI_Vect, InvalidTrapHandler
VI_VectCnt, (256 - 2)

;vector value
;vector storage address
;vector count register

store
jmpfdec
add
EPILOGUE

0, 0, VI_VectSt, VI_Vect
VI_VectCnt, VI_Loop
VI_VectSt, VI_VectSt, 4

;store the vector

; for jmpfdec

VI_Loop:

Step 6-Transcribing Code to RAM
BOOT.S transcribes START.S and the C-language
application (simulated by TEST.S) into instruction/data
RAM by calling RAMInit.
RAMInit is a routine that is created by the ROMCOFF
utility. When an executable Am29000 object file is submitted to ROMCOFF, the utility generates a relocatable
object file of type RLText that (when called) establishes
an image of the executable module in instruction/data
RAM. BOOT.S transfers START.S and the C-language
application to RAM by calling the RAMInit routine created by ROMCOFF.
RAMInit is called by:
        call    RI_Ret,RAMInit      ;initialize RAM

Note that when RAMInit is called, the return address is
not stored in a local register (such as lr0), and that
RAMInit is called just before transferring control to
_main. To transcribe data to RAM, RAMInit will create a
stream of const and consth instructions that will load up
the local registers starting from lr0. Then it will insert a
store multiple command to transfer the data into memory. Consequently, any data in local registers will be
overwritten.
Step 7-Calling START.S
As BOOT.S does not intend to have control returned to
it, it calls START.S by simulating a return from interrupt.
This is accomplished by setting the freeze (FZ) bit ON
in the old processor status (ops) and current processor
status (cps) registers, loading pc1 and pc0 with the starting address of
START.S and the address of the following instruction, and performing a return from interrupt
(see Listing 4).
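The status constants used in the listings (0x473, 0x173, 0x73) can be read as combinations of individual status bits. The C sketch below spells this out. The bit positions are inferred from the constants and their accompanying comments in this note rather than quoted from the register definition, and the CPS_* names are hypothetical; consult the Am29000 User's Manual for the authoritative layout.

```c
/* Assumed CPS/OPS bit masks, inferred from the constants in the listings. */
enum {
    CPS_DA = 1 << 0,   /* disable all interrupts and traps */
    CPS_DI = 1 << 1,   /* disable interrupts */
    CPS_SM = 1 << 4,   /* supervisor mode */
    CPS_PI = 1 << 5,   /* physical addressing, instructions */
    CPS_PD = 1 << 6,   /* physical addressing, data */
    CPS_RE = 1 << 8,   /* ROM enable */
    CPS_FZ = 1 << 10   /* freeze */
};

/* Value written to ops and cps before the simulated interrupt return. */
unsigned status_frozen(void)
{
    return CPS_FZ | CPS_PD | CPS_PI | CPS_SM | CPS_DI | CPS_DA;
}

/* Value used while executing from ROM (FZ clear, RE set). */
unsigned status_rom(void)
{
    return CPS_RE | CPS_PD | CPS_PI | CPS_SM | CPS_DI | CPS_DA;
}
```

Under these assumptions, status_frozen() yields 0x473 and status_rom() yields 0x173, matching the constants and comments in the listings.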
The Main Loop of BOOT.S
When all of the preceding steps are put together, the
main loop appears as shown in Listing 5.

Listing 4. Calling START.S
        mtsrim  ops, 0x473          ;FZ, PD, PI, SM, DI, DA
        mtsrim  cps, 0x473          ;FZ, PD, PI, SM, DI, DA
        const   lr0, TextBas        ;(using lr0 as temp)
        consth  lr0, TextBas
        mtsr    pc1, lr0
        add     lr0, lr0, 4
        mtsr    pc0, lr0
        iretinv                     ;go to inst space, TextBas

Listing 5. Main Loop of BOOT.S
Boot:
        .reg    RI_Ret, %%(TEMP_REG + 0)    ;RAMInit return
        mtsrim  cps, 0x173                  ;RE, PD, PI, SM, DI, DA
        const   rfb, 512                    ;set up temp reg frame
        const   rab, 0
        sub     rsp, rfb, 16                ;enough for p0 and p1
        add     lr1, rfb, 0
        call    lr0, SerInit                ;initialize an 8530 to report errors
        nop
        const   p1, (RAM_SIZE >> 2)         ;test full RAM size
        consth  p1, (RAM_SIZE >> 2)
        call    lr0, RAMAddr                ;call a RAM address test
        const   p0, 0                       ;test from addr 0 (input parm)
                                            ;to RAM test
        call    lr0, VectInit               ;routine to initialize traps to
        nop                                 ;invalid trap handler
        call    RI_Ret, RAMInit             ;initialize RAM -- from ROMCOFF
        mtsrim  ops, 0x473                  ;FZ, PD, PI, SM, DI, DA
        mtsrim  cps, 0x473                  ;FZ, PD, PI, SM, DI, DA
        const   lr0, TextBas                ;(using lr0 as temp)
        consth  lr0, TextBas
        mtsr    pc1, lr0
        add     lr0, lr0, 4
        mtsr    pc0, lr0

CREATING THE EXECUTION ENVIRONMENT
WITH START.S
The START.S file is used to prepare the execution
environment for the application program (simulated by
TEST.S). Although a given application certainly will
have varied requirements in different hardware environments, the tasks that will be performed by START.S are
needed to establish virtually any operating environment
on the Am29000. These are:
1. Configure the Am29000.
2. Allocate the register and memory stacks.
3. Initialize vector table and trap handlers.
4. Initialize the TLB by marking all entries invalid.
5. Call "main."
Step 1-Configuring the Am29000
Code similar to that shown below can be used to set the
contents of the cfg so that the vector area is a table of
pointers (VF = 1) and the Branch Target Cache™ is
disabled (CD = 1). Also, the cps register is set so that
physical addressing is used for both instructions and
data (PD = 1, PI = 1), all interrupts and traps are disabled
(DI = 1), and supervisor mode is ON (SM = 1). The timer
(tmr) is also set to 0 to avoid unwanted timer interrupts:
        mtsrim  tmr, 0
        mtsrim  cfg, (VF|CD)
        mtsrim  cps, (PD|PI|SM|DI)

The setting of the VF bit determines the structure of
the vector area. The vector area is a user-managed table in external instruction/data memory that
starts at the address held in the vector area base (VAB)
register. The vector area can have one of two different
structures, as determined by the VF bit of the configuration register.
If VF = 1, then the vector area is organized as a list of
256 pointers to interrupt/trap handlers. If VF = 0, then the
vector area is arranged as 256 64-instruction blocks,
each corresponding to a given call. Each fixed block
then contains the corresponding interrupt or trap
handler. Figure 8 shows the two structures.
When the Am29000 receives an interrupt or trap, the
location of the appropriate handler is determined by the
vector area (VA). Each interrupt and trap has a vector
number between 0 and 255 that corresponds to an entry
in the vector area. Of the vector numbers, 0 to 63 are
reserved for system and floating-point operations. The
assigned vector numbers are given in the Am29000
User's Manual.
If the table is a list of pointers, control will be passed to
the address held at VAB + (vector number * 4). Multiplication
by 4 converts the vector number to a byte offset, since each
pointer occupies one 4-byte word. If the vector
table is composed of handlers, control will be passed to
a handler starting at VAB + (vector number * 64 * 4),
where 64 is the fixed number of instructions per block and
4 is the number of bytes per instruction (see
Table 2).
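The address arithmetic for the two vector area structures can be sketched in C as follows. This is an illustration only; the function name vector_entry_addr is hypothetical, and the value 0x10 for the VF bit is taken from the "mtsrim cfg, 0x10 ;VF" line in Listing 10 rather than from the register definition.

```c
#include <stdint.h>

#define CFG_VF 0x10u   /* assumed VF mask, as used in Listing 10 */

/* Hypothetical helper: compute the vector area address used for a trap.
 * With VF = 1 the result is the address of a pointer to the handler;
 * with VF = 0 it is the address of the handler block itself. */
uint32_t vector_entry_addr(uint32_t vab, uint32_t cfg, unsigned vecnum)
{
    if (cfg & CFG_VF)
        return vab + vecnum * 4u;        /* one 4-byte pointer per vector */
    return vab + vecnum * 64u * 4u;      /* 64 instructions of 4 bytes per vector */
}
```

For example, with the vector area based at 0, vector number 64 resolves to byte offset 256 in the pointer-table form and to byte offset 16384 in the handler-block form.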

Table 2. The Location of a Pointer in the VAT
CFG:VF    ISR Address =
1         VAB + (vector number * 4)
0         VAB + (vector number * 256)

Step 2-Allocating Register and Memory Stack
Frames
A full register stack frame is established by START.S,
because it will call the application program (_main).
Further, control could be passed back to the START.S
return address (which then initiates a "warm start"). This
should be done early in the main loop, as START.S will
call some supporting assembly-language routines. The
register stack frame can be established by the code
shown in Listing 6.
Arguments that overflow the register stack will have to
be placed in the memory stack (see Figure 8). The
current position in the memory stack is pointed to by the
memory stack pointer (msp).
The stack can be established by:
        const   msp, MStkTop
        consth  msp, MStkTop

Listing 6. Allocating Register and Memory Stack Frames
        const   rfb, RStkTop            ;RStkTop is set to the
        consth  rfb, RStkTop            ;desired address in the declarations file
        const   rab, (RStkTop - 512)    ;128*4, maximum
        consth  rab, (RStkTop - 512)    ;part that can
        add     lr1, rfb, 0             ;be cached
        sub     rsp, rfb, 16            ;adjusts for lr0, lr1, argc, and argv

Figure 8. The Two Structures of the Vector Area
(With CFG:VF=1, the entry at VAB + (Vector Number * 4) is a pointer to the handler.
With CFG:VF=0, the handler itself begins at VAB + (Vector Number * 256).)
11025A-08

Step 3-Initializing the Vector Area and Vectors
Although the organization of the vector area is determined by the configuration register, the table and pointers still must be initialized. In the following example, the
vector initialization code is kept compact, while permitting easy expansion of the vector set, by using a table in

the .data section. Each entry in the table has two words.
The first is the vector number; the second is the handler
address (see Listing 7).
When the vector area base (vab) is supplied to the
routine shown in Listing 8, it initializes the handlers.
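The table-driven approach can be sketched in C as follows: each table entry pairs a vector number with a handler address, and the installer walks the table, storing each handler address into the slot selected by its vector number. The type and function names (vect_init, vect_install) are hypothetical, and the vector area is modeled as an array of 256 32-bit pointers (the VF = 1 layout).

```c
#include <stddef.h>
#include <stdint.h>

/* One table entry: the vector number and the address of its handler. */
struct vect_init {
    uint32_t number;    /* vector number, 0..255 */
    uint32_t handler;   /* handler entry address */
};

/* Hypothetical installer: indexing a word array by the vector number is the
 * same as adding (number << 2) bytes to the vector area base. */
void vect_install(uint32_t *vector_area, const struct vect_init *table, size_t count)
{
    for (size_t i = 0; i < count; i++)
        vector_area[table[i].number] = table[i].handler;
}
```

Adding a new handler then only requires one more table entry; the installer loop itself does not change.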

Listing 7. Initializing the Vector Area and Vectors
        .data                           ;switch to .data for table
VectInitTable:
        .word   V_SupInstTLB, SupInstTLBHandler
        .word   V_SupDataTLB, SupDataTLBHandler
        .word   V_MULTIPLY, MultiplyHandler
        .word   V_DIVIDE, DivideHandler
        .word   V_MULTIPLU, MultipluHandler
        .word   V_DIVIDU, DividuHandler
        .word   V_SPILL, SpillHandler
        .word   V_FILL, FillHandler
        .word   V_Timer, TimerHandler
        .equ    VINIT_CNT, ((. - VectInitTable) / 8)
        .text                           ;switch back to .text for code

Listing 8. Initializing Vector Handlers
VectInit:
        .reg    VI_Vect,%%(TMP_REG + 0)     ;vector value
        .reg    VI_St,%%(TMP_REG + 1)       ;vector storage address
        .reg    VI_Cnt,%%(TMP_REG + 2)      ;vector count
        .reg    VI_Base,%%(TMP_REG + 3)     ;vector base
        .reg    VI_TbPt,%%(TMP_REG + 4)     ;vector base
        mfsr    VI_Base, vab
        const   VI_Cnt, (VINIT_CNT - 2)     ;for jmpfdec
        const   VI_TbPt, VectInitTable
        consth  VI_TbPt, VectInitTable
VI_Loop:
        load    0, 0, VI_St, VI_TbPt        ;get the vector
        add     VI_TbPt, VI_TbPt, 4
        sll     VI_St, VI_St, 2             ;convert to address
        add     VI_St, VI_St, VI_Base
        load    0, 0, VI_Vect, VI_TbPt      ;get the handler
        add     VI_TbPt, VI_TbPt, 4
        jmpfdec VI_Cnt, VI_Loop
        store   0, 0, VI_Vect, VI_St
        jmp     raddr
        nop

Step 4-Initializing the Translation Look-Aside
Buffer (TLB)
When the Am29000 is first powered-up, the TLB will not
have valid entries. To prevent erroneous TLB misses,
the entries should be marked invalid by the start-up
sequence before control is passed to the application
program. This can be done with an assembly-language
sequence (see Listing 9).
Step 5-Calling "main"
Once the proper environment has been established for
the application program, the main C program must be
called. This is done by placing the address of the starting
instruction in registers and performing a call. When the
jump is "short," or less than 256 words, a call can be
done directly. However, the jump often will be farther,
and calli must be used in conjunction with an address
stored in registers, as shown below:

        const   raddr, _main    ;store lower 16 bits
        consth  raddr, _main    ;store upper 16 bits
        calli   raddr, raddr    ;call indirect

Notice that raddr signifies the return address, usually
lr0, by convention. Once the call is made, the return
address of the caller has replaced the target location, in
the event there is a return from _main.
The START.S Main Loop
The complete START.S main loop, as developed in the
previous sections, is shown in Listing 10. The routine
receives control after being transcribed to RAM; once
there, it initializes the vector handlers, clears the BSS
area, initializes the TLBs, and establishes initial stack
pointers and an initial register frame. Lastly, it invokes
_main. Note that, in the event _main returns, a warm
start is performed.

Listing 9. Initializing the TLB
        .reg    TI_Reg,%%(TEMP_REG + 0)     ;the TLB register number
        .reg    TI_Val,%%(TEMP_REG + 1)     ;the TLB value (0)
        .reg    TI_Cnt,%%(TEMP_REG + 2)     ;the TLB register count
        const   TI_Reg, 0
        const   TI_Val, 0
        const   TI_Cnt, (TLB_CNT - 2)       ;for jmpfdec
TI_Loop:
        mttlb   TI_Reg, TI_Val
        jmpfdec TI_Cnt, TI_Loop
        add     TI_Reg, TI_Reg, 1

Listing 10. START.S Main Loop
Start:
        mtsrim  cps, 0x73               ;set PD, PI, SM, DI, DA
        mtsrim  mmu, MMU_PS             ;PID = 0
        mtsrim  cfg, 0x10               ;VF
        const   rfb, RStkTop            ;set up stack pointers
        consth  rfb, RStkTop
        const   rab, (RStkTop - 512)
        consth  rab, (RStkTop - 512)
        add     lr1, rfb, 0
        sub     rsp, rfb, 16            ;make room for lr0, lr1, argc, argv
        const   msp, MStkTop
        consth  msp, MStkTop
        call    lr0, VectInit           ;routine to install handler vectors
        nop
        call    lr0, TLBInit            ;routine to mark TLBs invalid
        nop
        mtsrim  cps, 0x10               ;SM
        const   lr2, 0                  ;argc
        const   lr3, 0                  ;argv
        call    lr0, _main
        nop
        mtsrim  cps, 0x473              ;set FZ, PD, PI, SM, DI, DA
        mtsrim  ops, 0x173              ;set RE, PD, PI, SM, DI, DA
        mtsrim  cfg, 1                  ;cache disabled
        mtsrim  chc, 0                  ;contents invalid
        mtsrim  pc1, 0                  ;cold start address
        mtsrim  pc0, 4
        iretinv


APPENDIX A: boot.s

        .title  "ROM Boot Code"

Copyright 1988, Advanced Micro Devices
Written by Gibbons and Associates, Inc.
This module is intended to receive control at address 0. It handles a hardware
reset or a simulation of that event in a "warm start" situation.
Its purpose is to provide sufficient initializations for the operation of a program
in RAM data/instruction space. The initializations must include the transcription
of the program and its initialized data. The code and initialized data are stored
in ROM prior to transcription.
To provide for orderly operation, C linkages are used. It is known that the register
stack will never overflow. When certain calamities occur (e.g., invalid
traps), the registers will be re-initialized to allow the use of subroutines in
this module. There is no intention of ever returning under these circumstances.
Some of the routines in this module have a rather tedious implementation because
they do not assume the validity of RAM or the readability of ROM. This is
considered appropriate since it assures the validity of error handling.
This module provides no global addresses for external use. It is not intended to
be called. It is best thought of as bootstrap code.

Some tests which are not actually used are included here for use in environments
that may allow them.
The external addresses named below are required.

        .extern RAMInit                 ;romcoff generated

This module needs the addresses for the control and data ports of the SCC. These
are declared below.
        .equ    SCCCntlAd,0xfffffff0    ;control port address
        .equ    SCCDataAd,0xfffffff4    ;data port address

This module assumes that RAM begins at data address 0 and has the size declared
below.
        .equ    RAM_SIZE,0x40000        ;256K bytes
        .include "romdcl.h"
        .eject
        .sbttl  "Section Declarations"

This module has only one section, which is called "rom." It receives control at
reset, i.e., it is an absolute segment based at address 0 (in ROM space).
        .sect   rom, text, absolute 0
        .use    rom

RomBase:
jmp
nop
nop
nop
halt
nop

Boot

;the RESET entry

;the warn entry
;Could be a report routine

        .eject
        .sbttl  "SCC Routines"

        LEAF    SerInit,0

This routine initializes the serial port for non-interrupt driven access at 9600
baud.
In:     (nothing)
Out:    (nothing)
        .reg    SI_CtAd,%%(TEMP_REG + 0)  ;control port address
        .reg    SI_CtVl,%%(TEMP_REG + 1)  ;control port value
        const   SI_CtAd,SCCCntlAd
        consth  SI_CtAd,SCCCntlAd
        const   SI_CtVl,9
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,0xc0  ;reset the port
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,4
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,0x44  ;x16,1 stop,no parity
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,3
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,0xc0  ;8 bits receive
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,5
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,0x60  ;8 bits xmit
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,9
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,0x0  ;Int. disabled
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,10
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,0x0  ;NRZ
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,11
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,0x56  ;Tx & Rx BRG out
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,12
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,0x6  ;9600 baud
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,13
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,0x0  ;9600 baud
        store   0,0,SI_CtVl,SI_CtAd
29K Family Application Notes
        const   SI_CtVl,14
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,0x0  ;BRG in RTxC
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,14
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,0x1  ;BRG on
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,3
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,0xc1  ;Rx enable
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,5
        store   0,0,SI_CtVl,SI_CtAd
        const   SI_CtVl,0xea  ;Tx enable
        store   0,0,SI_CtVl,SI_CtAd
        EPILOGUE

        LEAF    SerXmt,1

This routine transmits a single character via the SCC. It will wait (forever) for
the SCC to become ready.
In:     (see below)
Out:    (nothing)
        .reg    SX_Char,%%(IN_PRM + 0)  ;character
        .reg    SX_Ad,%%(TEMP_REG + 0)  ;port address
        .reg    SX_Vl,%%(TEMP_REG + 1)  ;port value
        const   SX_Ad,SCCCntlAd
        consth  SX_Ad,SCCCntlAd
SX_Wait:
        load    0,0,SX_Vl,SX_Ad  ;get the status
        and     SX_Vl,SX_Vl,0x4  ;check tx buf empty
        cpeq    SX_Vl,SX_Vl,0
        jmpf    SX_Vl,SX_Wait
        nop
        const   SX_Ad,SCCDataAd
        consth  SX_Ad,SCCDataAd
        store   0,0,SX_Char,SX_Ad  ;send the character
        EPILOGUE

        LEAF    SerRcv,0

This routine waits for a receive character to become ready, then reads and returns
that character.
In:     (nothing)
Out:    (see below)
        .reg    SR_Ad,%%(TEMP_REG + 0)  ;port address
        .reg    SR_Char,%%(RET_VAL + 0)  ;character (stat tmp)
        const   SR_Ad,SCCCntlAd
        consth  SR_Ad,SCCCntlAd
SR_Wait:
        load    0,0,SR_Char,SR_Ad  ;get the status
        and     SR_Char,SR_Char,0x1  ;check rcv buf ready
        cpeq    SR_Char,SR_Char,0
        jmpf    SR_Char,SR_Wait
        nop
        const   SR_Ad,SCCDataAd
        consth  SR_Ad,SCCDataAd
        load    0,0,SR_Char,SR_Ad  ;fetch the character
        and     SR_Char,SR_Char,0xff
        EPILOGUE

        LEAF    SerChk,0

This routine checks to determine if a receive character is ready at the serial
port. It will return -1 if a character is ready and 0 if it is not.
In:     (nothing)
Out:    (see below)
        .reg    SC_Ad,%%(TEMP_REG + 0)  ;port address
        .reg    SC_Rdy,%%(RET_VAL + 0)  ;character
        const   SC_Ad,SCCCntlAd
        consth  SC_Ad,SCCCntlAd
        load    0,0,SC_Rdy,SC_Ad  ;get the status
        and     SC_Rdy,SC_Rdy,0x1  ;check rcv buf ready
        cpeq    SC_Rdy,SC_Rdy,0
        sra     SC_Rdy,SC_Rdy,31  ;convert to 0 or -1
        EPILOGUE

        .eject
        .sbttl  "Error Message Routines"

        FUNCTION SendErr,0,0,1

This routine sends the text "Error - ".
        .reg    SE_Char,%%(OUT_PRM + 0)  ;output character
        call    lr0,SerXmt  ;send a "E"
        const   SE_Char,'E'
        call    lr0,SerXmt  ;send a "r"
        const   SE_Char,'r'
        call    lr0,SerXmt  ;send a "r"
        const   SE_Char,'r'
        call    lr0,SerXmt  ;send a "o"
        const   SE_Char,'o'
        call    lr0,SerXmt  ;send a "r"
        const   SE_Char,'r'
        call    lr0,SerXmt  ;send a " "
        const   SE_Char,' '
        call    lr0,SerXmt  ;send a "-"
        const   SE_Char,'-'
        call    lr0,SerXmt  ;send a " "
        const   SE_Char,' '
        EPILOGUE

        FUNCTION SendNL,0,0,1

This routine sends a CR-LF sequence.
        .reg    SN_Char,%%(OUT_PRM + 0)
        call    lr0,SerXmt  ;send a "CR"
        const   SE_Char,0x0d
        call    lr0,SerXmt  ;send a "LF"
        const   SE_Char,0x0a
        EPILOGUE

        FUNCTION SendWord,1,1,1

This routine sends a 32-bit word in ASCII hex.
        .reg    SW_Word,%%(IN_PRM + 0)  ;the word to send
        .reg    SW_Shift,%%(LOC_REG + 0)  ;shift factor
        .reg    SW_T_Flag,%%(TEMP_REG + 0)
        .reg    SW_Char,%%(OUT_PRM + 0)  ;character to send
        const   SW_Shift,28  ;right shift factor
SW_0:
        srl     SW_Char,SW_Word,SW_Shift  ;isolate nibble
        and     SW_Char,SW_Char,0xf
        cplt    SW_T_Flag,SW_Char,10  ;check decimal
        jmpt    SW_T_Flag,SW_1
        add     SW_Char,SW_Char,0x30  ;convert to ASCII digit
        add     SW_Char,SW_Char,0x27  ;convert to ASCII letter
SW_1:
        call    lr0,SerXmt  ;send the character
        nop
        subs    SW_Shift,SW_Shift,4  ;next digit shift fact
        cpge    SW_T_Flag,SW_Shift,0  ;check if done
        jmpt    SW_T_Flag,SW_0  ;continue if not
        nop
        EPILOGUE
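
The nibble-to-ASCII conversion in SendWord can be modelled in a few lines of Python. This is only an illustrative sketch, not part of the original listing: the function name word_to_hex is hypothetical, and the serial output is replaced by a returned string. It shows why 0x30 is always added (in the delay slot of the jmpt) and 0x27 is added only for nibbles of 10 or more, which yields lowercase letters.

```python
def word_to_hex(word: int) -> str:
    """Model of SendWord: eight hex digits, most significant nibble first."""
    chars = []
    shift = 28                          # const SW_Shift,28
    while shift >= 0:                   # cpge SW_T_Flag,SW_Shift,0
        nibble = (word >> shift) & 0xF  # srl / and: isolate nibble
        code = nibble + 0x30            # '0'..'9' for values 0..9
        if nibble >= 10:
            code += 0x27                # 0x3a + 0x27 = 0x61, i.e. 'a'
        chars.append(chr(code))         # stands in for the call to SerXmt
        shift -= 4                      # next digit shift factor
    return "".join(chars)


if __name__ == "__main__":
    print(word_to_hex(0x1234ABCD))
```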

        FUNCTION RAMErr,3,0,1

This routine reports RAM errors with the message,
"Error - RAM at aaaaaaaa write bbbbbbbb read cccccccc\n"
        .reg    RE_ErrAdd,%%(IN_PRM + 0)
        .reg    RE_WrtPat,%%(IN_PRM + 1)
        .reg    RE_RedPat,%%(IN_PRM + 2)
        .reg    RE_Char,%%(OUT_PRM + 0)
        .reg    RE_Word,%%(OUT_PRM + 0)
        call    lr0,SendErr  ;send "Error - "
        nop
        call    lr0,SerXmt  ;send a "R"
        const   RE_Char,'R'
        call    lr0,SerXmt  ;send a "A"
        const   RE_Char,'A'
        call    lr0,SerXmt  ;send a "M"
        const   RE_Char,'M'
        call    lr0,SerXmt  ;send a " "
        const   RE_Char,' '
        call    lr0,SerXmt  ;send a "A"
        const   RE_Char,'A'
        call    lr0,SerXmt  ;send a "T"
        const   RE_Char,'T'
        call    lr0,SerXmt  ;send a " "
        const   RE_Char,' '
        call    lr0,SendWord  ;send error address
        add     RE_Word,RE_ErrAdd,0
        call    lr0,SerXmt  ;send a " "
        const   RE_Char,' '
        call    lr0,SerXmt  ;send a "w"
        const   RE_Char,'w'
        call    lr0,SerXmt  ;send a "r"
        const   RE_Char,'r'
        call    lr0,SerXmt  ;send a "i"
        const   RE_Char,'i'
        call    lr0,SerXmt  ;send a "t"
        const   RE_Char,'t'
        call    lr0,SerXmt  ;send a "e"
        const   RE_Char,'e'
        call    lr0,SerXmt  ;send a " "
        const   RE_Char,' '
        call    lr0,SendWord  ;send good pattern
        add     RE_Word,RE_WrtPat,0
        call    lr0,SerXmt  ;send a " "
        const   RE_Char,' '
        call    lr0,SerXmt  ;send a "R"
        const   RE_Char,'R'
        call    lr0,SerXmt  ;send a "e"
        const   RE_Char,'e'
        call    lr0,SerXmt  ;send a "a"
        const   RE_Char,'a'
        call    lr0,SerXmt  ;send a "d"
        const   RE_Char,'d'
        call    lr0,SerXmt  ;send a " "
        const   RE_Char,' '
        call    lr0,SendWord  ;send bad pattern
        add     RE_Word,RE_RedPat,0
        call    lr0,SendNL  ;send a new line
        nop
        EPILOGUE

        FUNCTION ROMErr,1,0,1


This routine reports a ROM sum error with the message,
"Error - ROM sum aaaaaaaa\n"
        .reg    ROM_Sum,%%(IN_PRM + 0)
        .reg    ROM_Char,%%(OUT_PRM + 0)
        .reg    ROM_Word,%%(OUT_PRM + 0)
        call    lr0,SendErr  ;send "Error - "
        nop
        call    lr0,SerXmt  ;send a "R"
        const   ROM_Char,'R'
        call    lr0,SerXmt  ;send a "O"
        const   ROM_Char,'O'
        call    lr0,SerXmt  ;send a "M"
        const   ROM_Char,'M'
        call    lr0,SerXmt  ;send a " "
        const   ROM_Char,' '
        call    lr0,SerXmt  ;send a "S"
        const   ROM_Char,'S'
        call    lr0,SerXmt  ;send a "u"
        const   ROM_Char,'u'
        call    lr0,SerXmt  ;send a "m"
        const   ROM_Char,'m'
        call    lr0,SerXmt  ;send a " "
        const   ROM_Char,' '
        call    lr0,SerXmt  ;send a "="
        const   ROM_Char,'='
        call    lr0,SerXmt  ;send a " "
        const   ROM_Char,' '
        call    lr0,SendWord  ;send ROM check sum
        add     ROM_Word,ROM_Sum,0
        call    lr0,SendNL  ;send a new line
        nop
        EPILOGUE

        FUNCTION SizeErr,0,0,1


This routine reports insufficient RAM size with the message
"Error - RAM size\n"
        .reg    SIZ_Char,%%(OUT_PRM + 0)
        call    lr0,SendErr  ;send "Error - "
        nop
        call    lr0,SerXmt  ;send a "R"
        const   SIZ_Char,'R'
        call    lr0,SerXmt  ;send a "A"
        const   SIZ_Char,'A'
        call    lr0,SerXmt  ;send a "M"
        const   SIZ_Char,'M'
        call    lr0,SerXmt  ;send a " "
        const   SIZ_Char,' '
        call    lr0,SerXmt  ;send a "s"
        const   SIZ_Char,'s'
        call    lr0,SerXmt  ;send a "i"
        const   SIZ_Char,'i'
        call    lr0,SerXmt  ;send a "z"
        const   SIZ_Char,'z'
        call    lr0,SerXmt  ;send a "e"
        const   SIZ_Char,'e'
        call    lr0,SendNL  ;send a new line
        nop
        EPILOGUE

        FUNCTION TrapErr,0,0,1


This routine reports an invalid trap with the message
"Error - Invalid trap\n"
        .reg    TE_Char,%%(OUT_PRM + 0)
        call    lr0,SendErr  ;send "Error - "
        nop
        call    lr0,SerXmt  ;send a "I"
        const   TE_Char,'I'
        call    lr0,SerXmt  ;send a "n"
        const   TE_Char,'n'
        call    lr0,SerXmt  ;send a "v"
        const   TE_Char,'v'
        call    lr0,SerXmt  ;send a "a"
        const   TE_Char,'a'
        call    lr0,SerXmt  ;send a "l"
        const   TE_Char,'l'
        call    lr0,SerXmt  ;send a "i"
        const   TE_Char,'i'
        call    lr0,SerXmt  ;send a "d"
        const   TE_Char,'d'
        call    lr0,SerXmt  ;send a " "
        const   TE_Char,' '
        call    lr0,SerXmt  ;send a "t"
        const   TE_Char,'t'
        call    lr0,SerXmt  ;send a "r"
        const   TE_Char,'r'
        call    lr0,SerXmt  ;send a "a"
        const   TE_Char,'a'
        call    lr0,SerXmt  ;send a "p"
        const   TE_Char,'p'
        call    lr0,SendNL  ;send a new line
        nop
        EPILOGUE

        .eject
        .sbttl  "ROM Checksum Test"

        FUNCTION ROMSum,2,0,1


This routine verifies that the ROM contents are intact by means of a checksum:
the words are summed, and the test passes if the sum is zero.
In:     (see below)
Out:    (see below)
        .reg    RS_StrtAdd,%%(IN_PRM + 0)  ;start address
        .reg    RS_WrdCnt,%%(IN_PRM + 1)  ;word count
        .reg    RS_SumTmp,%%(TEMP_REG + 0)
        .reg    RS_ChkSum,%%(OUT_PRM + 0)
        .reg    RS_Fail,%%(RET_VAL + 0)  ;TRUE for fail
        xor     RS_ChkSum,RS_ChkSum,RS_ChkSum  ;clear ChkSum
        sub     RS_WrdCnt,RS_WrdCnt,2  ;for jmpfdec
RS_1:
        load    CD,ROM_CTL,RS_SumTmp,RS_StrtAdd
        add     RS_ChkSum,RS_ChkSum,RS_SumTmp  ;add to ChkSum
        jmpfdec RS_WrdCnt,RS_1
        add     RS_StrtAdd,RS_StrtAdd,4  ;next ROM addr
        cpneq   RS_Fail,RS_ChkSum,0  ;if ChkSum == 0 then
        jmpf    RS_Fail,RS_EXIT  ;RS_PASS else RS_ERR
        nop
        call    lr0,ROMErr  ;call ROMErr routine
        nop  ;O/P para -- ChkSum
        const   RS_Fail,TRUE  ;TRUE for test fail
        consth  RS_Fail,TRUE
RS_EXIT:
        EPILOGUE
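
As an illustration only, the summing loop of ROMSum can be modelled in Python. The sketch below is not from the original listing: the names rom_checksum and MASK32 are hypothetical, and it assumes that the ROM image contains one correction word chosen so that the 32-bit sum wraps to zero. It also shows why the word count is reduced by 2 before the loop: jmpfdec tests the counter and then decrements it, so the body runs once for each tested value from N-2 down to -1, which is N times.

```python
MASK32 = 0xFFFFFFFF


def rom_checksum(words):
    """Sum 32-bit words the way the RS_1 loop does, including the jmpfdec count."""
    total = 0                        # xor RS_ChkSum,RS_ChkSum,RS_ChkSum
    count = len(words) - 2           # sub RS_WrdCnt,RS_WrdCnt,2
    index = 0
    while True:
        total = (total + words[index]) & MASK32
        jump = count >= 0            # jmpfdec jumps while the counter is not negative
        count -= 1                   # ...and decrements it either way
        index += 1                   # delay slot: advance to next ROM word
        if not jump:
            break
    return total


if __name__ == "__main__":
    image = [0x12345678, 0x9ABCDEF0, 0x0BADF00D]
    image.append((-sum(image)) & MASK32)   # hypothetical correction word
    print(hex(rom_checksum(image)))
```

A nonzero result corresponds to the path that calls ROMErr and returns TRUE.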

        .eject
        .sbttl  "RAM 01 Test"

        FUNCTION RAM01,2,0,3

This routine tests the RAM by the following method: set all RAM area to 0, then check
for 0; set all RAM area to 1, then check for 1.
In:     (see below)
Out:    (see below)
        .reg    R01_StrtAdd,%%(IN_PRM + 0)  ;starting address
        .reg    R01_WrdCnt,%%(IN_PRM + 1)  ;count of words
        .reg    R01_TmpCnt,%%(TEMP_REG + 0)  ;counter
        .reg    R01_NxtAdd,%%(OUT_PRM + 0)  ;error address
        .reg    R01_WrtPat,%%(OUT_PRM + 1)  ;pattern written
        .reg    R01_RedPat,%%(OUT_PRM + 2)  ;pattern read
        .reg    R01_Fail,%%(RET_VAL + 0)  ;TRUE for fail
        xor     R01_WrtPat,R01_WrtPat,R01_WrtPat  ;0 to start
R01_0:
                                         ;set 0's or 1's
        add     R01_NxtAdd,R01_StrtAdd,0  ;get strt RAM addr
        sub     R01_TmpCnt,R01_WrdCnt,2  ;for jmpfdec
R01_1:
        store   CD,DATA_CTL,R01_WrtPat,R01_NxtAdd
        jmpfdec R01_TmpCnt,R01_1
        add     R01_NxtAdd,R01_NxtAdd,WRD_SIZ
                                         ;check for 0's or 1's
        add     R01_NxtAdd,R01_StrtAdd,0  ;get strt RAM addr
        sub     R01_TmpCnt,R01_WrdCnt,2  ;for jmpfdec
R01_2:
        load    CD,DATA_CTL,R01_RedPat,R01_NxtAdd
        cpneq   R01_Fail,R01_RedPat,R01_WrtPat  ;err if neq
        jmpt    R01_Fail,R01_ERR
        nop
        jmpfdec R01_TmpCnt,R01_2
        add     R01_NxtAdd,R01_NxtAdd,WRD_SIZ
        cpeq    R01_Fail,R01_WrtPat,0  ;if WrtPat = 0 then
        jmpt    R01_Fail,R01_0  ;R01_0 else done
        nor     R01_WrtPat,R01_WrtPat,R01_WrtPat  ;invert ptrn
        jmp     R01_EXIT  ;pass 0 and 1 test
        nop
R01_ERR:
        call    lr0,RAMErr  ;O/P Parms -- NxtAdd,WrtPat,RedPat
        nop
        const   R01_Fail,TRUE  ;TRUE for test fail
        consth  R01_Fail,TRUE
R01_EXIT:
        EPILOGUE

        .eject
        .sbttl  "RAM Checker Pattern Test"

        FUNCTION RAMChkr,2,0,3


This routine will run a two-pass checkerboard on RAM. It will be controlled by
input values specifying the base address and the count of locations to be tested.
In:     (see below)
Out:    (see below)
        .reg    RC_StrtAdd,%%(IN_PRM + 0)  ;starting address
        .reg    RC_WrdCnt,%%(IN_PRM + 1)  ;count of words
        .reg    RC_TmpCnt,%%(TEMP_REG + 0)  ;total test word count
        .reg    RC_StrtPat,%%(TEMP_REG + 1)  ;starting pattern
        .reg    RC_NxtAdd,%%(OUT_PRM + 0)  ;error address
        .reg    RC_WrtPat,%%(OUT_PRM + 1)  ;pattern written
        .reg    RC_RedPat,%%(OUT_PRM + 2)  ;pattern read
        .reg    RC_Fail,%%(RET_VAL + 0)  ;TRUE for fail
        const   RC_StrtPat,CHKPAT_aS  ;start with as
        consth  RC_StrtPat,CHKPAT_aS
RC_1:
                                         ;fill memory with pattern
        add     RC_NxtAdd,RC_StrtAdd,0  ;get start address
        sub     RC_TmpCnt,RC_WrdCnt,2  ;for jmpfdec
        add     RC_WrtPat,RC_StrtPat,0  ;set the pattern
RC_2:
        store   0,0,RC_WrtPat,RC_NxtAdd
        R_LEFT  RC_WrtPat  ;rotate ptrn left
        jmpfdec RC_TmpCnt,RC_2
        add     RC_NxtAdd,RC_NxtAdd,4  ;next test mem addr
                                         ;check memory for pattern
        add     RC_NxtAdd,RC_StrtAdd,0  ;get start address
        sub     RC_TmpCnt,RC_WrdCnt,2  ;for jmpfdec
        add     RC_WrtPat,RC_StrtPat,0  ;set the pattern
RC_3:
        load    CD,DATA_CTL,RC_RedPat,RC_NxtAdd
        cpneq   RC_Fail,RC_RedPat,RC_WrtPat  ;err if neq
        jmpt    RC_Fail,RC_ERR
        nop
        R_LEFT  RC_WrtPat  ;rotate ptrn left
        jmpfdec RC_TmpCnt,RC_3
        add     RC_NxtAdd,RC_NxtAdd,4  ;next test mem addr
                                         ;invert ptrn for next pass
        nor     RC_StrtPat,RC_StrtPat,0  ;invert initial
        jmpt    RC_StrtPat,RC_EXIT  ;done if msb = 1
        nop
        jmp     RC_1  ;try with inverted
        nop
RC_ERR:
        call    lr0,RAMErr
        nop
        const   RC_Fail,TRUE  ;set after call
        consth  RC_Fail,TRUE
RC_EXIT:
        EPILOGUE

        .eject
        .sbttl  "RAM Address Pattern Test"

        FUNCTION RAMAddr,2,0,3


This routine will run a two-pass test on RAM. It will be controlled by input values
specifying the base address and the count of locations to be tested. In the first
pass, the data will be set equal to the address. In the second pass, the data will
be set equal to the complement of the address.
In:     (see below)
Out:    (see below)
        .reg    RA_StrtAdd,%%(IN_PRM + 0)  ;starting address
        .reg    RA_WrdCnt,%%(IN_PRM + 1)  ;count of words
        .reg    RA_TmpCnt,%%(TEMP_REG + 0)  ;total test word count
        .reg    RA_StrtPat,%%(TEMP_REG + 1)  ;starting pattern
        .reg    RA_PtrnInc,%%(TEMP_REG + 2)  ;ptrn increment value
        .reg    RA_NxtAdd,%%(OUT_PRM + 0)  ;error address
        .reg    RA_WrtPat,%%(OUT_PRM + 1)  ;pattern written
        .reg    RA_RedPat,%%(OUT_PRM + 2)  ;pattern read
        .reg    RA_Fail,%%(RET_VAL + 0)  ;TRUE for fail
        add     RA_StrtPat,RA_StrtAdd,0  ;start with address
        const   RA_PtrnInc,4
RA_1:
                                         ;fill memory with pattern
        add     RA_NxtAdd,RA_StrtAdd,0  ;get start address
        sub     RA_TmpCnt,RA_WrdCnt,2  ;for jmpfdec
        add     RA_WrtPat,RA_StrtPat,0  ;set the pattern
RA_2:
        store   0,0,RA_WrtPat,RA_NxtAdd
        add     RA_WrtPat,RA_WrtPat,RA_PtrnInc
        jmpfdec RA_TmpCnt,RA_2
        add     RA_NxtAdd,RA_NxtAdd,4  ;next test mem addr
                                         ;check memory for pattern
        add     RA_NxtAdd,RA_StrtAdd,0  ;get start address
        sub     RA_TmpCnt,RA_WrdCnt,2  ;for jmpfdec
        add     RA_WrtPat,RA_StrtPat,0  ;set the pattern
RA_3:
        load    CD,DATA_CTL,RA_RedPat,RA_NxtAdd
        cpneq   RA_Fail,RA_RedPat,RA_WrtPat  ;err if neq
        jmpt    RA_Fail,RA_ERR
        nop
        add     RA_WrtPat,RA_WrtPat,RA_PtrnInc
        jmpfdec RA_TmpCnt,RA_3
        add     RA_NxtAdd,RA_NxtAdd,4  ;next test mem addr
                                         ;invert ptrn for next pass
        nor     RA_StrtPat,RA_StrtPat,0  ;invert initial
        cpneq   RA_Fail,RA_StrtPat,RA_StrtAdd
        jmpt    RA_Fail,RA_1
        subr    RA_PtrnInc,RA_PtrnInc,0  ;negate inc value
        jmp     RA_EXIT
        nop
RA_ERR:
        call    lr0,RAMErr
        nop
        const   RA_Fail,TRUE  ;set after call
        consth  RA_Fail,TRUE
RA_EXIT:
        EPILOGUE
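
The two-pass address pattern test can be sketched in Python as follows. This is an illustrative model under stated assumptions, not a transcription of the routine: RAM is represented as a list of 32-bit words, and the names ram_address_test, StuckBitRam and MASK32 are hypothetical. As in the listing, the second pass starts from the complement of the start address and steps by -4, so every word holds the complement of its own address.

```python
MASK32 = 0xFFFFFFFF


def ram_address_test(ram, start_addr=0):
    """Return None on success, else (address, pattern written, pattern read)."""
    start_pat = start_addr & MASK32
    inc = 4                                    # const RA_PtrnInc,4
    for _ in range(2):                         # address pass, then complement pass
        pat = start_pat
        for i in range(len(ram)):              # fill memory with pattern
            ram[i] = pat
            pat = (pat + inc) & MASK32
        pat = start_pat
        for i in range(len(ram)):              # check memory for pattern
            if ram[i] != pat:
                return (start_addr + 4 * i, pat, ram[i])
            pat = (pat + inc) & MASK32
        start_pat = ~start_pat & MASK32        # nor: invert initial pattern
        inc = -inc                             # subr: negate increment
    return None


class StuckBitRam(list):
    """Hypothetical faulty RAM whose data bit 2 is stuck at zero."""

    def __setitem__(self, index, value):
        super().__setitem__(index, value & ~0x4 & MASK32)


if __name__ == "__main__":
    print(ram_address_test([0] * 16))
    print(ram_address_test(StuckBitRam([0] * 8)))
```

The tuple returned on failure carries the same three values that the routine passes to RAMErr: error address, pattern written, and pattern read.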

        .eject
        .sbttl  "Invalid Trap Handler"

InvalidTrapHandler:
This routine receives control when an invalid trap occurs. It will reinitialize
a register frame for use in error reporting. It then reports the fact that an
invalid trap has occurred. Reporting of specific trap numbers could be achieved,
but at considerable cost in size. The use of an instrument such as the ADAPT29K
is recommended for invalid trap identification. If that is not practical, this
handler (or some other) could be extended to report numbers. It would require 2K
bytes of additional code (jmp/const for each of 256 vectors).
        mtsrim  cps,0x173  ;RE,PD,PI,SM,DI,DA
        const   rfb,512  ;set up temp reg frame
        const   rab,0
        sub     rsp,rfb,8  ;room for linkage
        call    lr0,SerInit  ;ready to report errors
        add     lr1,rfb,0  ;small frame required
        call    lr0,TrapErr  ;show trap error
        nop
        halt
        nop

        .eject
        .sbttl  "Vector Initialization"

        LEAF    VectInit,0


This routine initializes the vector table and vab. All vectors
are set to point to the invalid trap handler in ROM.
        .reg    VI_Vect,%%(TEMP_REG + 0)  ;vector value
        .reg    VI_VectSt,%%(TEMP_REG + 1)  ;vector storage address
        .reg    VI_VectCnt,%%(TEMP_REG + 2)  ;vector count register
        mtsrim  vab,0
        mfsr    VI_VectSt,vab
        const   VI_Vect,(InvalidTrapHandler | 2)
        consth  VI_Vect,InvalidTrapHandler
        const   VI_VectCnt,(256 - 2)  ;for jmpfdec
VI_Loop:
        store   0,0,VI_VectSt,VI_Vect  ;store the vector
        jmpfdec VI_VectCnt,VI_Loop
        add     VI_VectSt,VI_VectSt,4
        EPILOGUE
        .eject
        .sbttl  "Boot"

Boot:
This routine receives control upon a hardware reset. Its purpose
is to establish the execution environment for the main program. This involves
transcriptions of data and possibly code. The transcriptions may
take the form of executing code since the ROM may not be readable.
        .reg    RI_Ret,%%(TEMP_REG + 0)  ;RAMInit return
        mtsrim  cps,0x173  ;RE,PD,PI,SM,DI,DA
        const   rfb,512  ;set up temp reg frame
        const   rab,0
        sub     rsp,rfb,16  ;enough for p0 and p1
        add     lr1,rfb,0
        call    lr0,SerInit  ;ready to report errors
        nop
        const   p1,(RAM_SIZE >> 2)  ;test full RAM size
        consth  p1,(RAM_SIZE >> 2)
        call    lr0,RAMAddr  ;just use one test
        const   p0,0  ;test from address zero
        call    lr0,VectInit  ;invalid traps
        nop
        call    RI_Ret,RAMInit  ;initialize RAM
        mtsrim  ops,0x473  ;FZ,PD,PI,SM,DI,DA
        mtsrim  cps,0x473  ;FZ,PD,PI,SM,DI,DA
        const   lr0,TextBas  ;(using lr0 as temp)
        consth  lr0,TextBas
        mtsr    pc1,lr0
        add     lr0,lr0,4
        mtsr    pc0,lr0
        iretinv  ;go to inst space,TextBas

; end of boot.s

APPENDIX B: start.s
        .title  "Start and Other Assembly-language Routines"

Copyright 1988, Advanced Micro Devices, Inc.
Written by Gibbons and Associates, Inc.
HISTORY:
1.3 29 July 88 E M Greenawalt SPR 0001
Fixed shift count on line 1034

This module provides initializations and trap handling for a program written in C
and operating in a stand-alone environment. It is designed for compatibility with
the ADAPT29K and various Am29000 monitors.

In this module, the first 16 system registers (gr64-gr79) are available for use as
system statics. They are not used in any of the routines in this file. Their
values are not saved and restored in the C interrupt handler interface, so they
are truly static.

The second 16 system registers (gr80-gr95) are used as temporary registers by trap
handlers, etc., in this module. No such trap handler is itself interruptible. No
presumption is made about the preservation of values in these registers by any
program.
        .extern main  ;the C main routine
        .global V_SPILL  ;the spill/fill vectors
        .global V_FILL

NOTE:   The equates below define the padding in the vector
        section (to a full page), and constants related to
        the page size. The register and memory stack sizes
        are also declared.

When operating with a monitor, the VECT_PAD may need to be increased.
        .equ    PS,3  ;page size designation
        .equ    RPN_SHIFT,(10 + PS)
        .equ    PAGE_SIZE,(1 << RPN_SHIFT)
        .equ    MMU_PS,(PS << 8)
        .equ    RPN_MASK,(- (PAGE_SIZE - 1))
        .equ    VECT_PAD,(PAGE_SIZE - 0x400)
        .equ    RSTK_SIZE,PAGE_SIZE
        .equ    MSTK_SIZE,PAGE_SIZE
        .include "romdcl.h"
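
For reference, the values these equates produce can be worked out with a short Python calculation. The sketch simply re-evaluates the expressions from the listing; VECTOR_BYTES is a name introduced here for the 256 four-byte vectors and is not part of the original source.

```python
PS = 3                          # page size designation
RPN_SHIFT = 10 + PS             # 13
PAGE_SIZE = 1 << RPN_SHIFT      # 8192 bytes (8K)
MMU_PS = PS << 8                # 0x300
VECT_PAD = PAGE_SIZE - 0x400    # 7168 bytes of padding after the vectors
RSTK_SIZE = PAGE_SIZE
MSTK_SIZE = PAGE_SIZE

VECTOR_BYTES = 4 * 256          # 256 vectors of 4 bytes each = 0x400

if __name__ == "__main__":
    print(PAGE_SIZE, hex(MMU_PS), VECT_PAD, VECTOR_BYTES + VECT_PAD)
```

The vector table plus its padding therefore fills exactly one page, and each stack occupies one further page.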

NOTE:   The equates below define traps for divide by zero
        and divide overflow. They are not standard. They
        are not handled here.
        .equ    V_DIVO,80  ;divide by zero
        .equ    V_DIVOV,81  ;divide overflow
        .eject
        .sbttl  "Section Declarations"

Sections will be ordered in memory as shown below.
        vectors         (at 0)
        rstack          (register stack)
        mstack          (memory stack)
        .data
        .bss
        .text
        endsect         (dummy for establishing bounds)

Vectors will be initialized by start-up code with pointers to an invalid trap
handler in ROM. The initialization code will explicitly intercept those vectors
that will be handled.
        .sect   vectors,bss
        .sect   rstack,bss
        .sect   mstack,bss
        .sect   endsect,bss

The declarations that follow suggest the order of the segments, provide base
names for each, and allocate sizes for the vectors and stacks.
Jump instructions are also provided at the base of the .text section for ease
in linkage to the Start routine and the special routine which provides for
ADAPT29K initializations.
        .use    vectors
        .block  (4 * 256)
        .block  VECT_PAD
        .use    rstack
RStkBase:
        .block
RStkTop:
        .use    mstack
MStkBase:
        .block
MStkTop:
        .data
DataBase:  ;base of init data
        .bss
BSSBase:  ;base of BSS data
        .text
TextBase:  ;base of .text
        jmp     Start  ;allows easy linkage to Start
        nop  ;for bootstrap code
        jmp     AdaptInit  ;makes AdaptInit easier to find
        nop
        .use    endsect
EndBase:  ;marks end of .text
        .block  ;dummy to assure existence
        .text  ;switch back to text
        .eject
        .sbttl  "Timer read/write functions"
        .global _GetTmCnt
        .global _SetTmCnt
        .global _GetTmRld
        .global _SetTmRld

        LEAF    _GetTmCnt,0

This routine returns the timer/counter register value. All the fields are returned;
i.e., no mask is applied.
In:     (nothing)
Out:    (see below)
        .reg    GTC_Val,%%(RET_VAL + 0)  ;timer reg value
        mfsr    GTC_Val,tmc
        EPILOGUE

        LEAF    _SetTmCnt,1

This routine sets the timer/counter register value. All the fields are set;
i.e., no mask is applied.
In:     (see below)
Out:    (nothing)
        .reg    STC_Val,%%(IN_PRM + 0)  ;timer reg value
        mtsr    tmc,STC_Val
        EPILOGUE

        LEAF    _GetTmRld,0

This routine gets the current contents of the timer reload register. No masks
are applied.
In:     (nothing)
Out:    (see below)
        .reg    GTR_Val,%%(RET_VAL + 0)  ;timer reload value
        mfsr    GTR_Val,tmr
        EPILOGUE

        LEAF    _SetTmRld,1

This routine sets the timer/counter reload value. All the fields are set;
i.e., no mask is applied.
In:     (see below)
Out:    (nothing)
        .reg    STR_Val,%%(IN_PRM + 0)  ;timer reload value
        mtsr    tmr,STR_Val
        EPILOGUE

        .eject
        .sbttl  "32-bit Time Extensions"

The routines below extend the timer counter to 32 bits via a trap handler. The
32-bit value may be initialized and read by C-callable routines declared as
globals. The trap handler is also included. Note that the caller of the C routines
must be running in supervisor mode.
        .global _ClrTm32
        .global _GetTm32
        .bss  ;switch to declare bss
TimeUpper:
        .block  4  ;reserve a word for extension
        .text  ;switch back

        LEAF    _ClrTm32,0

This routine clears the 32-bit extended counter by setting the tmc, tmr and
software extension value. The timer interrupt is also enabled in tmr.
In:     (nothing)
Out:    (nothing)       (timer initialized to zero)
Temp:   (see below)
        .reg    CTVal,%%(TEMP_REG + 0)  ;timer reg value
        .reg    CTUpPt,%%(TEMP_REG + 1)  ;upper pointer
        const   CTVal,0xffffff  ;for tc and TimeUpper
        consth  CTVal,0xffffff  ;should keep it busy
        mtsr    tmc,CTVal
        consth  CTVal,0x1ffffff  ;set ie
        mtsr    tmr,CTVal
        const   CTUpPt,TimeUpper
        consth  CTUpPt,TimeUpper
        const   CTVal,0  ;no extension
        store   0,0,CTVal,CTUpPt
        EPILOGUE

        LEAF    _GetTm32,0

This routine returns a 32-bit clock counter. The clock counter is implemented
by extending the hardware counter in software and negating the value before it is
returned. The negation causes the returned value to be an up counter of the time
since the counter was last reset. The low-level timer access routines may be used
in initializations to assure a desired starting value.
The software extension to 32 bits introduces a coordination problem in reading
the counter's value. This is resolved by reading the upper 8 bits both before
and after the TC value. If the TC value is greater than 2**23, the second upper
value read is presumed to be correct. Lengthy interruptions of this routine
(> 2**21 clocks) could cause errors.
In:     (nothing)
Out:    (see below)
Temp:   (see below)
        .reg    TUpPt,%%(TEMP_REG + 0)  ;upper time pointer
        .reg    TUpr1,%%(TEMP_REG + 1)  ;upper time bits - 1st read
        .reg    TUpr2,%%(TEMP_REG + 2)  ;upper time bits - 2nd read
        .reg    TLwr,%%(TEMP_REG + 3)  ;lower time bits - from cntr
        .reg    TChk,%%(TEMP_REG + 4)  ;temp to check high bit
        .reg    T32,%%(RET_VAL + 0)  ;32-bit time value
        const   TUpPt,TimeUpper
        consth  TUpPt,TimeUpper
        load    0,0,TUpr1,TUpPt  ;get upper 8 bits of timer
        add     TUpr1,TUpr1,0  ;hold till load complete
        mfsr    TLwr,tmc
        load    0,0,TUpr2,TUpPt  ;get upper 8 bits again
        sll     TChk,TLwr,8  ;is upper TC bit set?
        jmpf    TChk,GT_Exit  ;if not, use 1st read
        or      T32,TLwr,TUpr1  ;poss ovfl before 2nd read
        or      T32,T32,TUpr2  ;poss ovfl after 1st read
GT_Exit:
        subr    T32,T32,0  ;negate to count up from zero
        EPILOGUE
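
The effect of the negation can be checked with a small Python model. It is a sketch under stated assumptions rather than a description of the hardware: it assumes the hardware counter occupies the low 24 bits and counts down, and that TimeUpper holds the software extension already positioned in the top 8 bits (as the shifts by 24 in the timer trap handler suggest). The names elapsed_ticks, timer_trap and MASK32 are hypothetical, and the double-read coordination logic is left out.

```python
MASK32 = 0xFFFFFFFF


def elapsed_ticks(time_upper, tmc):
    """Combine the software extension with the 24-bit count and negate."""
    combined = (time_upper | (tmc & 0xFFFFFF)) & MASK32
    return (-combined) & MASK32          # subr T32,T32,0


def timer_trap(time_upper):
    """Model of the trap handler's update: decrement the upper 8 bits."""
    upper = (time_upper >> 24) - 1       # srl 24, then sub 1
    return (upper << 24) & MASK32        # sll 24 (wraps modulo 2**32)


if __name__ == "__main__":
    before = elapsed_ticks(0, 0)                 # counter about to roll over
    upper = timer_trap(0)                        # trap decrements the extension
    after = elapsed_ticks(upper, 0xFFFFFF)       # counter reloaded
    print(before, hex(upper), after)
```

Because both the hardware count and the extension decrease, the negated 32-bit value increases by one per tick, including across the rollover of the 24-bit counter.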

TimerHandler:
This routine handles the timer trap. The timer trap will occur at intervals in the
range of a second (depending on the actual clock speed). The extension to 32 bits
makes the timer somewhat more useful for common benchmarks. A different scheme
would be required for longer intervals.
        .reg    THTr,%%(SYS_TEMP + 0)  ;temp for tmr (shared)
        .reg    THUpPt,%%(SYS_TEMP + 0)  ;pointer to upper 8 bits
        .reg    THUpVl,%%(SYS_TEMP + 1)  ;upper 8-bit value
        mfsr    THTr,tmr  ;clear out upper tmr bits
        sll     THTr,THTr,7  ;leaving ie alone
        srl     THTr,THTr,7
        mtsr    tmr,THTr
        const   THUpPt,TimeUpper
        consth  THUpPt,TimeUpper
        load    0,0,THUpVl,THUpPt
        srl     THUpVl,THUpVl,24  ;decrement the upper bits
        sub     THUpVl,THUpVl,1
        sll     THUpVl,THUpVl,24
        store   0,0,THUpVl,THUpPt
        iret  ;done

        .eject
        .sbttl  "C Interrupt Handler Interface"
        .global CIntf


CIntf:
This routine is used to call a C routine that will handle an interrupt. In order
to accomplish this, the context of the current program must be saved prior to the
call and restored after the call. It is relatively expensive. In many
instances, it may be best to write the interrupt handlers in assembly language. Note
that assembly-language handlers will have the system statics available to retain
state information. Note also that system statics are not saved and restored here.
They are "static."
This routine receives as inputs the address of the C routine and the vector number.
It passes the vector number to the C routine as its only parameter. An initial
stack of 16 registers (including inputs) is provided to the C routine.
In:     (SYS_TEMP + 0)          C routine address
        (SYS_TEMP + 1)          vector number
Out:    (nothing)
Temp:   (SYS_TEMP 2-13)         used to hold specials
        (see below)


.reg
.reg
.reg
.reg
mfsr
mfsr
mfsr
mfsr
mfsr
mfsr
mfsr
mfsr
mfsr
mfsr
mfsr
add
mtsrim
sub
const
consth
asge
store
mtsr
storem
add
const
sub
add
sub
mtsr
storem
add
add
calli
mtsrim

CI_Rout,%%(SYS_TEMP + 0)
CI_Vect,%%(SYS_TEMP + 1)
CI_Stk,%%(SYS_TEMP + 14)
CI_Frm,%%(SYS_TEMP + 14)
st2,ops
st3,cha
st4,chd
st5,chc
st6,pcO
st7,pc1
'sta, ipc
st9,ipa
st10,ipb
stU, q
st12,alu
st13,rsp,0
cps,Ox73
msp,msp, «64 - 16) * 4)
CI_Stk,MStkBase
CI_Stk,MStkBase
V_DataTLBProt,msp,CI_Stk
O,O,graO,msp
im
O,O,graO,msp
rfb,rsp,O
CI_Frm,512
rab,rfb,CI_Frm
rsp, rab, (13 * 4)
msp,msp, (16 * 4)
im
O,O,rab,msp
lr1,rfb,0
pO,CI_Vect,O
lrO,CI_Rout
cps,Ox13

mtsrim
sub
mtsrim
loadm
add
mtsrim
loadm
add
mtsr

cps,Ox73
rab, rsp, (13 *
CR, (16 - 1)
O,O,rab,msp
msp,msp, (16 *
CR, «64 - 16)
0,0,gr64,msp
msp,msp, «64 ops,st2

ithe C routine
ithe vector
istack check value
iframe size (shared)
isave specials temps

;PD,PI,SM,DI,DA
iallocate space for globals
icheck for overflow
isimulate Prot (no return on fail)
iflush for CPU bug
CR, « 64 - 16) - 1)
isave the globals
imove down the frame
ibeneath rsp
iset rsp in 16 reg frame
isave the frame
CR, (16 - 1)

4)

irequire remaining locals
ivector is output parm 0
;call the handler
iwith prot and no ints (no good
i for more complex TLB schemes)
iready to reload
ireload locals in frame

4)
- 1)

ireload globals

16)

*

4)
irestore specials

mtsr
mtsr
mtsr
mtsr
mtsr
mtsr
mtsr
mtsr
mtsr
mtsr
add
iret

cha,st3
chd,st4
chc,st5
pcO,st6
pc1,st7
ipc,stS
ipa,st9
ipb,st10
q,st11
alu,st12
rsp,st13,0
; return from int

        .eject
        .sbttl  "Multiply and Divide Handlers"

MultiplyHandler:
This trap handler performs the (signed) operation:
DEST//Q <- SRCA * SRCB.
IPC, IPA, and IPB are set by the MULTIPLY instruction prior to the invocation of
this trap handler.
In:     IPC     DEST
        IPA     SRCA
        IPB     SRCB
Out:    DEST//Q
        IPB = IPC       (unimportant side effect)
Temp:   (see below)
        .reg    MH_IP,%%(SYS_TEMP + 0)  ;temp for move operation
        mtsr    q,gr0  ;SRCB (multiplier) to Q
        mfsr    MH_IP,ipc  ;use a system temp to set
        mtsr    ipb,MH_IP  ;ipb = ipc
        mul     gr0,gr0,0  ;step 1. (no initial prod)
        mul     gr0,gr0,gr0  ;step 2.
        mul     gr0,gr0,gr0  ;step 3.
        mul     gr0,gr0,gr0  ;step 4.
        mul     gr0,gr0,gr0  ;step 5.
        mul     gr0,gr0,gr0  ;step 6.
        mul     gr0,gr0,gr0  ;step 7.
        mul     gr0,gr0,gr0  ;step 8.
        mul     gr0,gr0,gr0  ;step 9.
        mul     gr0,gr0,gr0  ;step 10.
        mul     gr0,gr0,gr0  ;step 11.
        mul     gr0,gr0,gr0  ;step 12.
        mul     gr0,gr0,gr0  ;step 13.
        mul     gr0,gr0,gr0  ;step 14.
        mul     gr0,gr0,gr0  ;step 15.
        mul     gr0,gr0,gr0  ;step 16.
        mul     gr0,gr0,gr0  ;step 17.
        mul     gr0,gr0,gr0  ;step 18.
        mul     gr0,gr0,gr0  ;step 19.
        mul     gr0,gr0,gr0  ;step 20.
        mul     gr0,gr0,gr0  ;step 21.
        mul     gr0,gr0,gr0  ;step 22.
        mul     gr0,gr0,gr0  ;step 23.
        mul     gr0,gr0,gr0  ;step 24.
        mul     gr0,gr0,gr0  ;step 25.
        mul     gr0,gr0,gr0  ;step 26.
        mul     gr0,gr0,gr0  ;step 27.
        mul     gr0,gr0,gr0  ;step 28.
        mul     gr0,gr0,gr0  ;step 29.
        mul     gr0,gr0,gr0  ;step 30.
        mul     gr0,gr0,gr0  ;step 31.
        mull    gr0,gr0,gr0  ;step 32.
        iret  ;done


This trap handler performs the (unsigned) operation
DEST//Q <- SRCA * SRCB.
IPC, IPA, and IPB are set by the MULTIPLU instruction prior to
the invocation of this trap handler.
In:     IPC     DEST
        IPA     SRCA
        IPB     SRCB
Out:    DEST//Q
        IPB = IPC       (unimportant side effect)
Temp:   (see below)
        .reg    MU_IP,%%(SYS_TEMP + 0)  ;temp for move operation
        mtsr    q,gr0  ;SRCB (multiplier) to Q
        mfsr    MU_IP,ipc  ;use a system temp to set
        mtsr    ipb,MU_IP  ;ipb = ipc
        mulu    gr0,gr0,0  ;step 1. (no initial prod)
        mulu    gr0,gr0,gr0  ;step 2.
        mulu    gr0,gr0,gr0  ;step 3.
        mulu    gr0,gr0,gr0  ;step 4.
        mulu    gr0,gr0,gr0  ;step 5.
        mulu    gr0,gr0,gr0  ;step 6.
        mulu    gr0,gr0,gr0  ;step 7.
        mulu    gr0,gr0,gr0  ;step 8.
        mulu    gr0,gr0,gr0  ;step 9.
        mulu    gr0,gr0,gr0  ;step 10.
        mulu    gr0,gr0,gr0  ;step 11.
        mulu    gr0,gr0,gr0  ;step 12.
        mulu    gr0,gr0,gr0  ;step 13.
        mulu    gr0,gr0,gr0  ;step 14.
        mulu    gr0,gr0,gr0  ;step 15.
        mulu    gr0,gr0,gr0  ;step 16.
        mulu    gr0,gr0,gr0  ;step 17.
        mulu    gr0,gr0,gr0  ;step 18.
        mulu    gr0,gr0,gr0  ;step 19.
        mulu    gr0,gr0,gr0  ;step 20.
        mulu    gr0,gr0,gr0  ;step 21.
        mulu    gr0,gr0,gr0  ;step 22.
        mulu    gr0,gr0,gr0  ;step 23.
        mulu    gr0,gr0,gr0  ;step 24.
        mulu    gr0,gr0,gr0  ;step 25.
        mulu    gr0,gr0,gr0  ;step 26.
        mulu    gr0,gr0,gr0  ;step 27.
        mulu    gr0,gr0,gr0  ;step 28.
        mulu    gr0,gr0,gr0  ;step 29.
        mulu    gr0,gr0,gr0  ;step 30.
        mulu    gr0,gr0,gr0  ;step 31.
        mulu    gr0,gr0,gr0  ;step 32.
        iret  ;done
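
The 32 multiply steps follow the general shape of a shift-and-add multiplication, with the multiplier in Q and the partial product in the destination register. The Python sketch below illustrates that general technique for the unsigned case; it is not a bit-exact model of the MULU instruction, and the names mulu_steps and MASK32 are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def mulu_steps(src_a, src_b):
    """Unsigned 32 x 32 -> 64 multiply in 32 shift-and-add steps.

    Returns (high, low), corresponding to DEST//Q in the handler's notation.
    """
    high = 0                         # no initial partial product
    q = src_b & MASK32               # multiplier starts in Q
    for _ in range(32):
        if q & 1:                    # low multiplier bit selects an add
            high += src_a & MASK32   # may briefly need 33 bits
        # shift the (high, q) pair right by one bit
        q = ((high & 1) << 31) | (q >> 1)
        high >>= 1
    return high & MASK32, q


if __name__ == "__main__":
    hi, lo = mulu_steps(0x12345678, 0x9ABCDEF0)
    print(hex(hi), hex(lo))
```

After the last step the multiplier has been shifted out of Q entirely, so Q holds the low half of the product and the destination holds the high half.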


DivideHandler:
This trap handler performs the (signed) operation:
DEST <- (SRCA//Q) / SRCB
IPC, IPA, and IPB are set by the DIVIDE instruction prior to
the invocation of this trap handler.
In:     IPC     DEST
        IPA     SRCA
        IPB     SRCB
        Q
Out:    DEST
Temp:   (see below)
        .reg    D_Rmdr,%%(SYS_TEMP + 0)  ;shift area and remainder
        .reg    D_Dvsr,%%(SYS_TEMP + 1)  ;divisor
        .reg    D_Sign,%%(SYS_TEMP + 2)  ;0 for positive
        .reg    D_DvdHi,%%(SYS_TEMP + 3)  ;dividend high
        .reg    D_DvdLo,%%(SYS_TEMP + 4)  ;dividend low
        .reg    D_Quot,%%(SYS_TEMP + 5)
        .reg    D_Ovfl,%%(SYS_TEMP + 6)
        .reg    D_MnNg,%%(SYS_TEMP + 7)  ;most negative integer
        add     D_DvdHi,gr0,0  ;SRCA is dividend high
        mfsr    D_DvdLo,q  ;Q is dividend low
        sub     D_Dvsr,D_Dvsr,0  ;divisor is in SRCB
        add     D_Dvsr,D_Dvsr,gr0  ;any easier access?
        asneq   V_DIVO,D_Dvsr,0  ;check for divisor zero
DividendCheck:
        jmpf    D_DvdHi,DivisorCheck
        const   D_Sign,FALSE
        cpeq    D_Sign,D_Sign,0  ;toggle flag
        subr    D_DvdLo,D_DvdLo,0  ;negate dividend
        subrc   D_DvdHi,D_DvdHi,0
DivisorCheck:
        jmpf    D_Dvsr,DivideOp
        nop
        cpeq    D_Sign,D_Sign,0  ;toggle flag
        subr    D_Dvsr,D_Dvsr,0  ;negate divisor
DivideOp:
        mtsr    q,D_DvdLo  ;dividend low to q
        div0    D_Rmdr,D_DvdHi  ;D_Rmdr becomes shift high
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 1.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 2.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 3.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 4.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 5.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 6.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 7.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 8.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 9.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 10.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 11.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 12.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 13.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 14.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 15.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 16.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 17.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 18.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 19.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 20.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 21.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 22.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 23.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 24.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 25.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 26.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 27.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 28.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 29.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 30.
        div     D_Rmdr,D_Rmdr,D_Dvsr  ;step 31.
        divrem  D_Rmdr,D_Rmdr,D_Dvsr  ;don't need remainder
        mfsr    D_Quot,q  ;get quotient out of q
        cplt    D_Ovfl,D_Quot,0  ;check overflow
        jmpf    D_Sign,DivideCorrect
        cpeq    D_MnNg,D_MnNg,D_MnNg  ;set most neg
        cpeq    D_Ovfl,D_MnNg,D_Quot  ;check for most neg
        cpneq   D_Ovfl,D_Ovfl,D_Sign  ;allow if to be neg
DivideCorrect:
        jmpf    D_Sign,DivideExit  ;done if positive
        aseq    V_DIVOV,D_Ovfl,0  ;trap on overflow
        subr    D_Quot,D_Quot,0  ;negate quotient
        subr    D_Rmdr,D_Rmdr,0  ;don't need remainder
DivideExit:
        add     gr0,D_Quot,0  ;set DEST
        iret  ;done
DividuHandler:
This trap handler performs the (unsigned) operation:
DEST <- (SRCA//Q) / SRCB
IPC, IPA, and IPB are set by the DIVIDU instruction prior to
the invocation of this trap handler.
In:     IPC     DEST
        IPA     SRCA
        IPB     SRCB
        Q
Out:    DEST
Temp:   (see below)
        .reg    DU_Rmdr,%%(SYS_TEMP + 0)  ;shift area and remainder
        add     DU_Rmdr,gr0,0  ;SRCA to DU_Rmdr
        div0    DU_Rmdr,DU_Rmdr  ;DU_Rmdr becomes shift high
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 1.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 2.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 3.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 4.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 5.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 6.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 7.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 8.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 9.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 10.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 11.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 12.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 13.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 14.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 15.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 16.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 17.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 18.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 19.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 20.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 21.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 22.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 23.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 24.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 25.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 26.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 27.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 28.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 29.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 30.
        div     DU_Rmdr,DU_Rmdr,gr0  ;step 31.
        divrem  DU_Rmdr,DU_Rmdr,gr0  ;don't need remainder
        mfsr    gr0,q  ;quotient to (ipc)
        iret  ;done

        .eject
        .sbttl  "Spill and Fill Handlers"


The routines below handle the allocation and free assertions
in subroutine prologues and epilogues. The temps they use
are given below.
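
The bookkeeping behind a spill or a fill is small and can be modeled in C. The sketch below is an assumption-laden illustration, not the handler itself: the struct and function names are hypothetical, the bounds are treated as plain byte addresses, and the only invariant taken from the handlers is that the frame bound sits 512 bytes above the allocate bound. The actual memory transfer and the stack overflow and underflow checks are left out.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW_BYTES 512u   /* rfb == rab + 512 */

/* Hypothetical snapshot of the register-stack pointers. */
typedef struct {
    uint32_t rsp;   /* register stack pointer */
    uint32_t rab;   /* register allocate bound */
    uint32_t rfb;   /* register frame bound */
    uint32_t lr1;   /* caller's frame pointer */
} RegStack;

/* Spill: entered when rsp has dropped below rab.
 * Returns the number of words that must be written to memory. */
uint32_t spill(RegStack *s)
{
    uint32_t bytes = s->rab - s->rsp;   /* amount over-allocated */
    s->rfb -= bytes;                    /* move down the frame bound */
    s->rab = s->rsp;                    /* move down the allocate bound */
    return bytes >> 2;
}

/* Fill: entered when lr1 has risen above rfb.
 * Returns the number of words that must be read back from memory. */
uint32_t fill(RegStack *s)
{
    uint32_t bytes = s->lr1 - s->rfb;   /* amount to restore */
    s->rab += bytes;                    /* move up the allocate bound */
    s->rfb = s->lr1;                    /* move up the frame bound */
    return bytes >> 2;
}
```

In both directions the two bounds move by the same amount, so the 512-byte window is preserved.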
        .reg    R_Cnt,%%(SYS_TEMP + 0)  ;temp for count (shared)
        .reg    R_Bnd,%%(SYS_TEMP + 0)  ;temp for boundary
        .reg    R_TmpPC0,%%(SYS_TEMP + 1) ;temp for PC0
        .reg    R_TmpPC1,%%(SYS_TEMP + 2) ;temp for PC1

SpillHandler:
This routine handles a false assertion in the standard prologue.
In:     rab > rsp               (requiring an allocation)
        lr1 <= rfb
        rfb == rab + 512
Out:    rab == rsp              (just enough allocated)
        lr1 <= rfb
        rfb == rab + 512

        mfsr    R_TmpPC0,pc0            ;save the PCs
        mfsr    R_TmpPC1,pc1
        mtsrim  cps,0x73                ;PD,PI,SM,DI,DA
        sub     R_Cnt,rab,rsp           ;R_Cnt = # of bytes to spill
        sub     rfb,rfb,R_Cnt           ;move down the frame bound
        store   0,0,lr0,rfb             ;flush for storem bug
        srl     R_Cnt,R_Cnt,2           ;R_Cnt = count of words to spill
        sub     R_Cnt,R_Cnt,1           ;correct for storem
        mtsr    cr,R_Cnt                ;set up count for storem
        storem  0,0,lr0,rfb             ;spill from the allocated area
        add     rab,rsp,0               ;move down the allocate bound
        const   R_Bnd,RStkBase          ;check for possible overflow
        consth  R_Bnd,RStkBase          ;simulate TLB prot
        asge    V_DataTLBProt,rab,R_Bnd ;NOTE: no return on fail
        mtsrim  cps,0x473               ;FZ,PD,PI,SM,DI,DA
        mtsr    pc0,R_TmpPC0            ;restore the PCs
        mtsr    pc1,R_TmpPC1
        iret

FillHandler:
This routine handles a false assertion in the standard epilogue.
In:     lr1 > rfb               (requiring deallocation)
        rsp >= rab
        rfb == rab + 512
Out:    lr1 == rfb              (just enough freed)
        rsp >= rab
        rfb == rab + 512

        mfsr    R_TmpPC0,pc0            ;save the PCs
        mfsr    R_TmpPC1,pc1
        mtsrim  cps,0x73                ;PD,PI,SM,DI,DA
        const   R_Bnd,RStkTop           ;check for possible underflow
        consth  R_Bnd,RStkTop           ;simulate TLB prot
        asle    V_DataTLBProt,rfb,R_Bnd ;NOTE: no return on fail
        const   R_Cnt,512               ;make local reg ip
        or      R_Cnt,R_Cnt,rfb         ;  from rfb
        mtsr    ipa,R_Cnt               ;set up indirect ptr for loadm
        sub     R_Cnt,lr1,rfb           ;R_Cnt = # of bytes to fill
        add     rab,rab,R_Cnt           ;move up the allocate bound
        srl     R_Cnt,R_Cnt,2           ;R_Cnt = number of words to fill
        sub     R_Cnt,R_Cnt,1           ;correct for loadm
        mtsr    cr,R_Cnt                ;set up count for loadm
        loadm   0,0,gr0,rfb             ;fill area freed
        add     rfb,lr1,0               ;move up frame bound
        mtsrim  cps,0x473               ;FZ,PD,PI,SM,DI,DA
        mtsr    pc0,R_TmpPC0            ;restore the PCs
        mtsr    pc1,R_TmpPC1
        iret

        .eject
        .sbttl  "TLB Miss Handler"

The routines below provide one-for-one TLBs, i.e., the virtual address is set equal
to the physical address. A central routine is used to do the actual TLB update.
Some enhancement would be appropriate to allow I/O access as data, i.e.,
memory-mapped I/O. Speed improvements could be realized (four instructions) by the
allocation and initialization of system registers for the bounds.
The temp registers used are indicated below.
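
A one-for-one mapping means the two words of a TLB entry are both derived from the same miss address. The C sketch below shows the masking involved under stated assumptions: the page size is 8K bytes, the page-number mask is 0xFFFFE000 (an assumed value for illustration), the tag mask is that value shifted left by five bits, and the access bits are simply ORed into word 0. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define RPN_MASK_8K 0xFFFFE000u  /* assumption: page-number bits, 8K pages */

/* Hypothetical pair of TLB register values for one entry. */
typedef struct {
    uint32_t word0;  /* virtual tag plus access bits */
    uint32_t word1;  /* real page number */
} TlbEntry;

/* Build an identity (virtual == physical) entry for the page holding addr. */
TlbEntry make_identity_entry(uint32_t addr, uint32_t access)
{
    TlbEntry e;
    uint32_t tag_mask = RPN_MASK_8K << 5;    /* shift for the tag field */

    e.word1 = addr & RPN_MASK_8K;            /* physical page = virtual page */
    e.word0 = (addr & tag_mask) | access;    /* tag, then access bits */
    return e;
}
```

For example, an access value of 0x7000 would grant the entry valid, supervisor-read, and supervisor-write status, matching the VE,SR,SW combination used for data misses.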

        .reg    TH_Ad,%%(SYS_TEMP + 0)  ;the miss address
        .reg    TH_Ac,%%(SYS_TEMP + 1)  ;the required privileges
        .reg    TH_Bnd,%%(SYS_TEMP + 2) ;access bound
        .reg    TH_Reg,%%(SYS_TEMP + 3) ;TLB register number
        .reg    TH_Wd0,%%(SYS_TEMP + 4) ;TLB word 0 value
        .reg    TH_Wd1,%%(SYS_TEMP + 5) ;TLB word 1 value

SupInstTLBHandler:
This routine handles supervisor instruction TLB misses.
An attempted access out of range is treated as an instruction
TLB protection violation.
        mfsr    TH_Ad,pc1
        const   TH_Bnd,TextBase
        consth  TH_Bnd,TextBase
        asge    V_InstTLBProt,TH_Ad,TH_Bnd ;NOTE: no return on fail
        const   TH_Bnd,EndBase
        consth  TH_Bnd,EndBase
        aslt    V_InstTLBProt,TH_Ad,TH_Bnd ;NOTE: no return on fail
        jmp     TLBHandler
        const   TH_Ac,0x4800            ;VE,SE

SupDataTLBHandler:
This routine handles the supervisor data TLB misses. It should
be enhanced to allow I/O access as well as data access.
        mfsr    TH_Ad,cha
        const   TH_Ac,0x7000            ;VE,SR,SW
        const   TH_Bnd,MStkBase
        consth  TH_Bnd,MStkBase
        asge    V_DataTLBProt,TH_Ad,TH_Bnd ;NOTE: no return on fail
        const   TH_Bnd,TextBase
        consth  TH_Bnd,TextBase
        aslt    V_InstTLBProt,TH_Ad,TH_Bnd ;NOTE: no return on fail
                                        ;(drop through to TLB handler)

TLBHandler:
This routine handles TLB updates once it has been determined
that the update is appropriate.
NOTE: This routine presumes an 8K-byte page size.
In:     TH_Ad   the address where access is required
        TH_Ac   the access that is required
        lru     the recommended TLB for replacement
Out:    (lru)   provides access to TH_Ad

        constn  TH_Wd1,RPN_MASK
        sll     TH_Wd0,TH_Wd1,5         ;shift for vtag
        and     TH_Wd1,TH_Wd1,TH_Ad     ;establish addr fields
        and     TH_Wd0,TH_Wd0,TH_Ad
        or      TH_Wd0,TH_Wd0,TH_Ac     ;establish access
        mfsr    TH_Reg,lru              ;set the TLB entry
        mttlb   TH_Reg,TH_Wd0
        add     TH_Reg,TH_Reg,1
        mttlb   TH_Reg,TH_Wd1
        iret

        .eject
        .sbttl  "TLB Initialization"

        LEAF    TLBInit,0
This routine is used to initialize the TLBs.
It clears all the TLB registers, thus marking all entries invalid.
In:     (nothing)
Out:    (nothing)
Temps:  (see below)

        .reg    TI_Reg,%%(TEMP_REG + 0) ;the TLB register number
        .reg    TI_Val,%%(TEMP_REG + 1) ;the TLB value (0)
        .reg    TI_Cnt,%%(TEMP_REG + 2) ;the TLB register count
        const   TI_Reg,0
        const   TI_Val,0
        const   TI_Cnt,(TLB_CNT - 2)    ;for jmpfdec
TI_Loop:
        mttlb   TI_Reg,TI_Val
        jmpfdec TI_Cnt,TI_Loop
        add     TI_Reg,TI_Reg,1
        EPILOGUE

        .eject
        .sbttl  "Vector Initialization"

In order that the vector initialization code might be compact
and that the set of vectors initialized might be easily expanded,
a table in .data is used.
Each entry in the table has two words.
The first word is the number of the vector to be initialized.
The second word is the address of the handler.
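
The table-driven approach described above can be sketched in C as follows. This is an illustration under assumptions, not a translation of the routine: the type and function names are hypothetical, the vector area is modeled as an array of 32-bit words indexed by vector number, and handler addresses are treated as plain integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One table entry: two words, as in the description above. */
typedef struct {
    uint32_t vector;   /* number of the vector to be initialized */
    uint32_t handler;  /* address of the handler */
} VectInitEntry;

/* Store each handler address in the vector-area slot chosen by its number.
 * In byte terms the slot address is: vector area base + 4 * vector number. */
void init_vectors(uint32_t *vector_area,
                  const VectInitEntry *table, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        vector_area[table[i].vector] = table[i].handler;
    }
}
```

Supporting one more trap then means adding one entry to the table; the loop itself does not change.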
        .data                           ;switch to .data for table
VectInitTable:
        .word   V_SupInstTLB,SupInstTLBHandler
        .word   V_SupDataTLB,SupDataTLBHandler
        .word   V_MULTIPLY,MultiplyHandler
        .word   V_DIVIDE,DivideHandler
        .word   V_MULTIPLU,MultipluHandler
        .word   V_DIVIDU,DividuHandler
        .word   V_SPILL,SpillHandler
        .word   V_FILL,FillHandler
        .word   V_Timer,TimerHandler
        .equ    VINIT_CNT,((. - VectInitTable) / 8)
        .text                           ;switch back to .text for code

VectInit:
This routine initializes the vectors for which handlers exist.
In:     vab     vector area base
Out:    (vectors initialized)
Temp:   (see below)

        .reg    VI_Vect,%%(TEMP_REG + 0) ;vector value
        .reg    VI_St,%%(TEMP_REG + 1)  ;vector storage address
        .reg    VI_Cnt,%%(TEMP_REG + 2) ;vector count
        .reg    VI_Base,%%(TEMP_REG + 3) ;vector base
        .reg    VI_TbPt,%%(TEMP_REG + 4) ;vector base
        mfsr    VI_Base,vab
        const   VI_Cnt,(VINIT_CNT - 2)  ;for jmpfdec
        const   VI_TbPt,VectInitTable
        consth  VI_TbPt,VectInitTable
VI_Loop:
        load    0,0,VI_St,VI_TbPt       ;get the vector
        add     VI_TbPt,VI_TbPt,4
        sll     VI_St,VI_St,2           ;convert to address (fixed v1.3)
        add     VI_St,VI_St,VI_Base
        load    0,0,VI_Vect,VI_TbPt     ;get the handler
        add     VI_TbPt,VI_TbPt,4
        jmpfdec VI_Cnt,VI_Loop
        store   0,0,VI_Vect,VI_St
        jmpi    lr0
        nop

        .eject
        .sbttl  "ADAPT29K Initializations"

AdaptInit:
This routine is for use in situations where the bootstrap process
has not occurred. Instead, the ADAPT29K has been used to load
the program. Initializations of the vectors, etc., will be
required.
As an aid to fault identification, the vector table is initialized
with pointers to the words immediately following the vectors. These
words are initialized with HALT instructions. When one of these
halts executes, the ADAPT29K will report the event and the address
of the halt. This will allow the invalid trap that has occurred
to be identified.
CAUTION! This requires that the vector pad be at least 1024.
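
The fault-identification scheme described above can be sketched in C. The sketch assumes 256 four-byte vectors followed directly by 256 four-byte pad words, and uses the HALT encoding 0x89000000 that the routine loads; the function names are hypothetical, and the two areas are modeled as separate arrays for clarity.

```c
#include <assert.h>
#include <stdint.h>

#define VECTOR_COUNT 256u
#define HALT_OPCODE  0x89000000u   /* HALT instruction word */

/* Point every vector at its own HALT word in the pad that follows the
 * vector table. base is the byte address of the vector area. */
void init_halt_vectors(uint32_t base, uint32_t *vectors, uint32_t *pad)
{
    for (uint32_t i = 0; i < VECTOR_COUNT; i++) {
        vectors[i] = base + 4u * VECTOR_COUNT + 4u * i;
        pad[i] = HALT_OPCODE;
    }
}

/* Recover the trap number from the address of the HALT that executed. */
uint32_t vector_from_halt_address(uint32_t base, uint32_t halt_address)
{
    return (halt_address - base - 4u * VECTOR_COUNT) / 4u;
}
```

With a vector area at address 0, a halt reported at address 1304 would therefore correspond to vector (1304 - 1024) / 4 = 70.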
        .reg    AI_Vect,%%(TEMP_REG + 0) ;vector value
        .reg    AI_St,%%(TEMP_REG + 1)  ;vector storage address
        .reg    AI_Cnt,%%(TEMP_REG + 2) ;vector count register
        .reg    AI_Halt,%%(TEMP_REG + 3) ;halt instruction register
        mtsrim  cps,0x73                ;PD,PI,SM,DI,DA
        mtsrim  vab,0
        mfsr    AI_St,vab
        const   AI_Vect,1024            ;just beyond vectors
        const   AI_Halt,0x89000000
        consth  AI_Halt,0x89000000
        const   AI_Cnt,(256 - 2)        ;for jmpfdec
AI_Loop:
        store   0,0,AI_St,AI_Vect       ;store the vector
        add     AI_St,AI_St,4
        store   0,0,AI_Vect,AI_Halt     ;store the HALT
        jmpfdec AI_Cnt,AI_Loop
        add     AI_Vect,AI_Vect,4
        jmp     Start
        nop

        .eject
        .sbttl  "Start"

Start:
This routine receives control after any required bootstrap processes. It will
initialize the vectors which are actually handled, clear the BSS area, initialize
the TLBs, and establish initial stack pointers and an initial register frame.
It will then invoke _main.
In the event that _main returns, this routine will perform a warm start.
In:     vab     indicates vector area
Out:    (nothing)

        mtsrim  cps,0x73                ;PD,PI,SM,DI,DA
        mtsrim  mmu,MMU_PS              ;order It ~ 0
        mtsrim  cfg,0x10                ;VF
        const   rfb,RStkTop             ;set up stack pointers
        consth  rfb,RStkTop
        const   rab,(RStkTop - 512)
        consth  rab,(RStkTop - 512)
        add     lr1,rfb,0
        sub     rsp,rfb,16              ;lr0,lr1,argc,argv
        const   msp,MStkTop
        consth  msp,MStkTop
        call    lr0,VectInit            ;install handled vectors
        nop
        call    lr0,TLBInit             ;establish TLBs invalid
        nop
        call    lr0,ClrTm32             ;clear and enable timer
        nop                             ; (leave to _main ???)
        mtsrim  cps,0x10                ;SM
        const   lr2,0                   ;argc = 0
        const   lr3,0                   ;argv = 0
        call    lr0,_main
        nop
        mtsrim  cps,0x473               ;FZ,PD,PI,SM,DI,DA
        mtsrim  ops,0x173               ;RE,PD,PI,SM,DI,DA
        mtsrim  cfg,1                   ;cache disabled
        mtsrim  chc,0                   ;contents invalid
        mtsrim  pc1,0                   ;cold start address
        mtsrim  pc0,4
        iretinv

; end of start.s

APPENDIX C: test.s
        .title  "Test of Assembly-language Utilities"

Copyright 1988, Advanced Micro Devices, Inc.
Written by Gibbons and Associates, Inc.

        .include "romdcl.h"
        .extern _GetTm32
        .data
        .word   0xDEADBEEF              ;just to test
        .bss
        .block  1024                    ;verify zeros
        .text
        .eject
        .sbttl  "Multiply/Divide Test"

        LEAF    _MultDiv,0
This routine gives a test of the multiply and divide trap
handlers by the simple expedient of performing one of each.
Using the debugger, it can be forced to loop, etc.
In:     (nothing)
Out:    (nothing)
Temp:   (see below)

        .reg    MD_Mpd,%%(TEMP_REG + 0) ;multiplicand
        .reg    MD_Mpr,%%(TEMP_REG + 1) ;multiplier
        .reg    MD_PrLo,%%(TEMP_REG + 2) ;product low
        .reg    MD_PrHi,%%(TEMP_REG + 3) ;product high
        .reg    MD_Mlp,%%(TEMP_REG + 4) ;BOOLEAN for looping
        .reg    MD_DvdHi,%%(TEMP_REG + 0) ;dividend high
        .reg    MD_DvdLo,%%(TEMP_REG + 1) ;dividend low
        .reg    MD_Dvsr,%%(TEMP_REG + 2) ;divisor
        .reg    MD_Quot,%%(TEMP_REG + 3) ;quotient
        .reg    MD_Dlp,%%(TEMP_REG + 4) ;BOOLEAN for looping
        const   MD_Mlp,0                ;FALSE
        const   MD_Mpd,3                ;(full 32-bit for patching)
        consth  MD_Mpd,3
        const   MD_Mpr,5
        consth  MD_Mpr,5
M_Loop:
        multiply MD_PrHi,MD_Mpd,MD_Mpr
        mfsr    MD_PrLo,q
        jmpt    MD_Mlp,M_Loop
        nop
        const   MD_Dlp,0                ;FALSE
        const   MD_DvdHi,0              ;(full setting for patch)
        consth  MD_DvdHi,0
        const   MD_DvdLo,15
        consth  MD_DvdLo,15
        const   MD_Dvsr,3
        consth  MD_Dvsr,3
D_Loop:
        mtsr    q,MD_DvdLo
        divide  MD_Quot,MD_DvdHi,MD_Dvsr
        jmpt    MD_Dlp,D_Loop
        nop
        EPILOGUE

        .eject
        .sbttl  "Spill/Fill Test"

        FUNCTION _Recurse,1,29,1
This routine is a simple recursive do-nothing that is used to test
spill/fill.
It accepts a count as its input, decrements that count, and, if the
result is zero or greater, calls itself with the now decremented
count.
Each instance of the routine allocates 32 new registers.
Thus the total register requirement is 32 * (InCnt + 1) where InCnt
is the input count.
In:     (see below)
Out:    (nothing in final return)
Temp:   (allocated but not used)

        .reg    R_InCnt,%%(IN_PRM + 0)
        .reg    R_OutCnt,%%(OUT_PRM + 0)
        sub     R_OutCnt,R_InCnt,1
        jmpt    R_OutCnt,R_Exit
        nop
        call    lr0,_Recurse
        nop
R_Exit:
        EPILOGUE

        .eject
        .sbttl  "C Interrupt Interface Test"
        .extern CIntf

        LEAF    _Trap70,1
This "C" routine handles trap 70. It increments the value of a global
system register so that its effect may easily be seen.
In:     (see below)
Out:    st0     incremented
        st1     set to input parameter value

        .reg    T70_V,%%(IN_PRM + 0)    ;the vector
        add     st0,st0,1
        add     st1,T70_V,0
        EPILOGUE

Trap70:
; This is the assembly-language routine that should get control on
; trap 70. It invokes CIntf in such a way as to give control to
; _Trap70, the "C" routine above. Note that control never returns
; to this routine. CIntf performs the iret.
In:     (nothing)
Out:    (nothing)

        .reg    T70_Rout,%%(SYS_TEMP + 0)
        .reg    T70_Vect,%%(SYS_TEMP + 1)
        const   T70_Rout,_Trap70
        consth  T70_Rout,_Trap70
        jmp     CIntf
        const   T70_Vect,70

        .eject
        .sbttl
        .global
        FUNCTION _main,2,2,1
This routine plays the role of a C main routine. It
is coded in assembly language to ease testing with
an absolute debugger.

        .reg    argc,%%(IN_PRM + 0)     ;argc (= 0)
        .reg    argv,%%(IN_PRM + 1)     ;argv (= NULL)
        .reg    StTm,%%(LOC_REG + 0)    ;start time
        .reg    EndTm,%%(LOC_REG + 1)   ;end time
        call    lr0,_GetTm32            ;should return start time
        nop
        add     StTm,v0,0               ;save the result
        call    lr0,_MultDiv            ;test multiply/divide
        nop
        call    lr0,_Recurse            ;test spill/fill
        const   p0,15                   ;require 1024 registers
        asneq   70,gr1,gr1              ;force trap 70
        call    lr0,_GetTm32            ;should return end time
        nop
        add     EndTm,v0,0              ;save the result
        EPILOGUE

; end of test.s

APPENDIX D: romdcl.h
        .eject
        .sbttl  "Register, Constant, and Macro Declarations"

Copyright 1988, Advanced Micro Devices
Written by Gibbons and Associates, Inc.

; Global registers
        .reg    rsp,gr1                 ;local reg. var. stack pointer
        .equ    SYS_TEMP,64             ;system temp registers
        .reg    st0,gr64
        .reg    st1,gr65
        .reg    st2,gr66
        .reg    st3,gr67
        .reg    st4,gr68
        .reg    st5,gr69
        .reg    st6,gr70
        .reg    st7,gr71
        .reg    st8,gr72
        .reg    st9,gr73
        .reg    st10,gr74
        .reg    st11,gr75
        .reg    st12,gr76
        .reg    st13,gr77
        .reg    st14,gr78
        .reg    st15,gr79
        .equ    SYS_STAT,80             ;system static registers
        .reg    ss0,gr80
        .reg    ss1,gr81
        .reg    ss2,gr82
        .reg    ss3,gr83
        .reg    ss4,gr84
        .reg    ss5,gr85
        .reg    ss6,gr86
        .reg    ss7,gr87
        .reg    ss8,gr88
        .reg    ss9,gr89
        .reg    ss10,gr90
        .reg    ss11,gr91
        .reg    ss12,gr92
        .reg    ss13,gr93
        .reg    ss14,gr94
        .reg    ss15,gr95
        .equ    RET_VAL,96              ;return registers
        .reg    v0,gr96
        .reg    v1,gr97
        .reg    v2,gr98
        .reg    v3,gr99
        .reg    v4,gr100
        .reg    v5,gr101
        .reg    v6,gr102
        .reg    v7,gr103
        .reg    v8,gr104
        .reg    v9,gr105
        .reg    v10,gr106
        .reg    v11,gr107
        .reg    v12,gr108
        .reg    v13,gr109
        .reg    v14,gr110
        .reg    v15,gr111
        .equ    TEMP_REG,96             ;temp registers
        .reg    t0,gr96
        .reg    t1,gr97
        .reg    t2,gr98
        .reg    t3,gr99
        .reg    t4,gr100
        .reg    t5,gr101
        .reg    t6,gr102
        .reg    t7,gr103
        .reg    t8,gr104
        .reg    t9,gr105
        .reg    t10,gr106
        .reg    t11,gr107
        .reg    t12,gr108
        .reg    t13,gr109
        .reg    t14,gr110
        .reg    t15,gr111
        .equ    RES_REG,112             ;reserved (for user)
        .reg    r0,gr112
        .reg    r1,gr113
        .reg    r2,gr114
        .reg    r3,gr115
        .equ    TEMP_EXT,116            ;temp extension (and shared)
        .reg    x0,gr116
        .reg    x1,gr117
        .reg    x2,gr118
        .reg    x3,gr119
        .reg    x4,gr120
        .reg    x5,gr121
        .reg    x6,gr122
        .reg    x7,gr123
        .reg    x8,gr124

; Global registers with special calling convention uses
        .reg    tav,gr121               ;trap handler argument (also x6)
        .reg    tpc,gr122               ;trap handler return (also x7)
        .reg    lrp,gr123               ;large return pointer (also x8)
        .reg    slp,gr124               ;static link pointer (also x9)
        .reg    msp,gr125               ;memory stack pointer
        .reg    rab,gr126               ;register alloc bound
        .reg    rfb,gr127               ;register frame bound

; Local compiler registers - output parameters, etc.
; (only valid if frame has been established)
        .reg    p15,lr17                ;parameter registers
        .reg    p14,lr16
        .reg    p13,lr15
        .reg    p12,lr14
        .reg    p11,lr13
        .reg    p10,lr12
        .reg    p9,lr11
        .reg    p8,lr10
        .reg    p7,lr9
        .reg    p6,lr8
        .reg    p5,lr7
        .reg    p4,lr6
        .reg    p3,lr5
        .reg    p2,lr4
        .reg    p1,lr3
        .reg    p0,lr2

; TLB register count

.equ
.eject

; constants for general use
        .equ    WRD_SIZ,4               ;word size
        .equ    TRUE,0x80000000         ;logical true -- bit 31
        .equ    FALSE,0x00000000        ;logical false -- 0
        .equ    CHKPAT_a5,0xa5a5a5a5    ;check pattern

; constants for data access control
        .equ    CE,0b1                  ;co-processor enable
        .equ    CD,0b0                  ;co-processor disable
        .equ    AS,0b1000000            ;set for I/O
        .equ    PA,0b0100000            ;set for physical ad
        .equ    SB,0b0010000            ;set for set BP
        .equ    UA,0b0001000            ;set for user access
        .equ    ROM_OPT,0b100           ;OPT values for acc
        .equ    DATA_OPT,0b000
        .equ    INST_OPT,0b000
        .equ    ROM_CTL,(PA + ROM_OPT)  ;control field
        .equ    DATA_CTL,(PA + DATA_OPT)
        .equ    INST_CTL,(PA + INST_OPT)
        .equ    IO_CTL,(AS + PA + DATA_OPT)

        .eject

;-----------------------------------------------------------------------
;defined vectors
;-----------------------------------------------------------------------
        .equ    V_IllegalOp,0
        .equ    V_Unaligned,1
        .equ    V_OutOfRange,2
        .equ    V_NoCoProc,3
        .equ    V_CoProcExcept,4
        .equ    V_ProtViol,5
        .equ    V_InstAccExcept,6
        .equ    V_DataAccExcept,7
        .equ    V_UserInstTLB,8
        .equ    V_UserDataTLB,9
        .equ    V_SupInstTLB,10
        .equ    V_SupDataTLB,11
        .equ    V_InstTLBProt,12
        .equ    V_DataTLBProt,13
        .equ    V_Timer,14
        .equ    V_Trace,15
        .equ    V_INTR0,16
        .equ    V_INTR1,17
        .equ    V_INTR2,18
        .equ    V_INTR3,19
        .equ    V_TRAP0,20
        .equ    V_TRAP1,21
; 22 - 31 reserved
        .equ    V_MULTIPLY,32
        .equ    V_DIVIDE,33
        .equ    V_MULTIPLU,34
        .equ    V_DIVIDU,35
        .equ    V_CONVERT,36
; 37 - 41 reserved
        .equ    V_FEQ,42
        .equ    V_DEQ,43
        .equ    V_FGT,44
        .equ    V_DGT,45
        .equ    V_FGE,46
        .equ    V_DGE,47
        .equ    V_FADD,48
        .equ    V_DADD,49
        .equ    V_FSUB,50
        .equ    V_DSUB,51
        .equ    V_FMUL,52
        .equ    V_DMUL,53
        .equ    V_FDIV,54
        .equ    V_DDIV,55
; 56 - 63 reserved
        .equ    V_SPILL,64
        .equ    V_FILL,65
        .equ    V_BSDCALL,66
        .equ    V_SYSVCALL,67
        .equ    V_BRKPNT,68
        .equ    V_EPI_OS,69
        .eject

        .macro  R_LEFT,REGVAR
Rotate left
Parameters:     REGVAR  register to rotate
        add     REGVAR,REGVAR,REGVAR    ;shift left by 1 bit, C = MSB
        addc    REGVAR,REGVAR,0         ;add C to LSB
        .endm

        .macro  FUNCTION,NAME,INCNT,LOCCNT,OUTCNT

Introduces a non-leaf routine.
This macro defines the standard tag word before the function,
then establishes the statement label with the function's name
and finally allocates a register stack frame. It may not be used
if a memory stack frame is required.
Note also that the size of the register stack frame is limited.
Neither this nor the lack of a memory frame is considered to be
a severe restriction in an assembly-language environment. The
assembler will report errors if the requested frame is too large
for this macro.
It may be good practice to allocate an even number of both output
registers and local registers. This will help in maintaining
double word alignment within these groups. The macro will assure
double word alignment of the stack frame as a whole, as required
for correct linkage.
Parameters:     NAME    the function name
                INCNT   input parameter count
                LOCCNT  local register count
                OUTCNT  output parameter count

        .set    ALLOC_CNT,((2 + OUTCNT + LOCCNT) << 2)
        .set    PAD_CNT,(ALLOC_CNT & 4)
        .set    ALLOC_CNT,(ALLOC_CNT + PAD_CNT)
        .if     (INCNT)
        .set    IN_PRM,(4 + OUTCNT + PAD_CNT + LOCCNT + 0x80)
        .endif
        .if     (LOCCNT)
        .set    LOC_REG,(2 + OUTCNT + PAD_CNT + 0x80)
        .endif
        .if     (OUTCNT)
        .set    OUT_PRM,(2 + 0x80)
        .endif
        .word   ((2 + OUTCNT + LOCCNT) << 16)
NAME:
        sub     rsp,rsp,ALLOC_CNT
        asgeu   V_SPILL,rsp,rab
        add     lr1,rsp,((4 + OUTCNT + LOCCNT + INCNT) << 2)
        .endm

        .macro  LEAF,NAME,INCNT

Introduces a leaf routine.
This macro defines the standard tag word before the function,
then establishes the statement label with the function's name.
Parameters:     NAME    the function name
                INCNT   input parameter count

        .if     (INCNT)
        .set    IN_PRM,(2 + 0x80)
        .endif
        .set
        .word
NAME:
        .endm

        .macro  EPILOGUE
Deallocates register stack frame (only and only if necessary).
        .if     (ALLOC_CNT)
        add     rsp,rsp,ALLOC_CNT
        nop
        jmpi    lr0
        asleu   V_FILL,lr1,rfb
        .else
        jmpi    lr0
        nop
        .endif
        .set    IN_PRM,(1024)           ;illegal, to cause err on ref
        .set    LOC_REG,(1024)          ;illegal, to cause err on ref
        .set    OUT_PRM,(1024)          ;illegal, to cause err on ref
        .set    ALLOC_CNT,(1024)        ;illegal, to cause err on ref
        .endm

Initial values for macro set variables to guard against misuse
        .set    IN_PRM,(1024)           ;illegal, to cause err on ref
        .set    LOC_REG,(1024)          ;illegal, to cause err on ref
        .set    OUT_PRM,(1024)          ;illegal, to cause err on ref
        .set    ALLOC_CNT,(1024)        ;illegal, to cause err on ref

; end of romdcl.h

APPENDIX E: test.ld

Linker Directives

see test.s and start.s for descriptions of sections
load test.o,start.o
order vectors=0,rstack,mstack,.bss,.data,.text,endsect

Host Interface (HIF) v1.0 Specification
Application Note
by E. M. Greenawalt

PREFACE
This document describes HIF (v1.0), the Am29000 Architectural Host Interface, and explains how to use it.
HIF is the software standard that defines the interface
between the user's high-level language program and
the Am29000 processor. The document is written for
experienced programmers and assumes a working
knowledge of the Am29000 microprocessor.

INTRODUCTION
Advanced Micro Devices is developing a complete line
of Am29000™ simulators, hardware target execution
vehicles, and high-level language development tools for
the Am29000 32-bit Streamlined Instruction Processor.
These products are designed to support end-users who
are building embedded system applications based on
the Am29000 processor. For these users, often there is
no existing operating system or kernel for their hardware
design.
Before AMD could create development tools for the
Am29000, a standard set of kernel services had to be
defined that would interface a user-application program,

written in a high-level language, to a host operating system or an Am29000 processor.
HIF, the host interface, is the software specification that
defines this standard set of kernel services. Figure 1
shows the level where HIF resides. As implied
by the figure, HIF does not describe any particular implementation; rather, each simulator, hardware vehicle, and high-level language implements HIF in its own
way. The kernel services provide the minimum functionality needed to interface high-level language library
functions to the user's operating system code.
Using HIF, program modules written in any of the languages available for the Am29000 can be combined
and the resulting program can run, without change, on
any Am29000 simulator or hardware execution vehicle.
Future AMD products will also use HIF, and AMD is
actively encouraging third-party vendor support.
AMD is indebted to Embedded Performance, Incorporated (EPI), who originally developed the HIF concepts
and then graciously placed them in the public domain.

User's application program
High-level language library
Host interface (HIF)
Operating system kernel

Figure 1. HIF Interface

Publication 11014, Rev. A, Amendment 10, Issue Date: 11/89. © 1989 Advanced Micro Devices, Inc.

HIF APPLICATIONS
The HIF specification has broad applications; currently it
provides the interface between the user's high-level
language program and the following hardware and
software products:
• Am29000 Architectural Simulator. This software product provides the means to simulate the operation of
the Am29000 in a specified system environment. It
provides detailed performance statistics by modeling
the internal architecture of the Am29000, as well as
system memory configurations and timing. The HIF
specification is implemented to provide the interface
between the user's program and the host operating
system.
• PC Execution Board (PCEB29K™). This hardware/software
product contains an Am29000 processor
and memory and is an add-in board to IBM®
PC-based systems. Part of the HIF specification is
implemented on the board with another part implemented on the PC, to interface with the DOS operating
system.
• Standalone Execution Board (STEB). This hardware
product from STEP Engineering is intended to be an
evaluation vehicle for the Am29000 and, optionally,
Am29027™ Arithmetic Accelerator devices. The entire HIF specification is implemented on this board,
which contains a resident monitor to implement the
necessary kernel services.
Because HIF is a general-purpose standard, it can be
used to interface any high-level language to the
Am29000. User programs need not be written entirely in
a high-level language; they may incorporate assembly-language functions when maximized performance is the
primary concern.
HIF USERS
There are three categories of end-users who need to
know the details of the host interface:
• Those using AMD-supplied hardware execution vehicles or simulators. This document defines the low-level mechanisms of HIF. With this information and
the design concepts presented herein, end-users can
extend the HIF environment to meet the needed
degree of software functionality and sophistication.
• Those developing a custom kernel operating system
for an Am29000 design. These users need access to
AMD's high-level and assembly-language development tools. This document provides the information
required to build a HIF-conforming kernel that uses
the high-level language development tools directly.
With this information, end-users can extend and
customize the operating system code without interfering with the basic capabilities of the HIF.
• Those who are using the AMD-supplied high-level
language development tools, but who must conform
to another kernel operating system interface. There is
sufficient information in this document to enable users
to modify the development tools to properly interface
with the target kernel's specifications.
HIF CONCEPTS
Programmers developing software in a high-level
language do not work directly with the processor.
Instead, they think in terms of a virtual machine ideally
suited to the computational paradigm of the language.
For instance, the C-language virtual machine has
operations such as fprintf() and strcpy(), and the
FORTRAN machine has operations such as alog and
sqrt.
In actual practice, these virtual machines are implemented by libraries of object code that perform
language-specific operations. As long as programmers
use only the functions of the language's implied virtual
machine, the programs will be portable across a broad
range of implementations of the language.
However, computer systems generally provide another
virtual machine to the world: one that is defined by the
operating system software. This virtual machine
requires system calls to perform the services that are
implemented within the operating system code. Typical
services are: process management, file system
management, device management, and memory
management.
The high-level language virtual machine usually
consists of: (1) functions that can be implemented
entirely within library routines, and (2) functions that
require the services of the operating system. The functions of the first group (usually defined as the standard
library for that language) are independent of the operating system virtual machine on which they are implemented. The functions of the second group must be
coded in terms of the operating system virtual machine.
In other words, they must make system calls.
It is often useful for end-users to also make system calls,
even though this practice makes their programs less
portable. This requirement can be accommodated by
augmenting the language library with glue routines that
specifically invoke the system calls, while providing the
end user with suitable high-level syntax and semantics.
(For detailed information on the glue routines for the
C compiler, see the HighC29K Reference Manual,
"Appendix A, Host Interface Definition.")

Given the above discussion, the required task is to create high-level language development tools that can be
used easily and efficiently on a variety of execution vehicles. This task can be broken down into the following
steps:
• Define an operating system virtual machine that
provides sufficient functionality to support the fundamental requirements of each high-level language, but
not so much as to require a massive development
effort to create.
• Add appropriate glue routines to the standard libraries
of the language so that the libraries are defined in
terms of the operating system virtual machine.
• Implement the operating system's virtual machine
services on the various execution vehicles. For
hardware vehicles, the virtual machine is implemented by a kernel, typically contained in a resident
monitor software program. For simulation vehicles,
the virtual machine is implemented by code internal to
the simulator and by code simulated by the simulator.
For the Am29000 hardware and software support products, HIF consists of the following operating system
virtual machine definitions:
• A carefully defined, efficient system call mechanism.
Accessing an HIF kernel service requires a transition
from user mode to supervisor mode on the processor.
This requires a specific mechanism, such as a trap
handler, to be invoked.
• A set of services that support the primitive requirements of C, FORTRAN, and Pascal. Most of the
services are defined according to UNIX® operating
system interface specifications.
• A specification of the environment created by the
kernel. This involves the definition of storage allocation and register initializations implemented by the
kernel.
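
The relationship between a library glue routine and a kernel service can be sketched in C. Everything in the sketch is hypothetical: the service numbers, the result structure, the error codes, and the page-size value are invented for illustration and are not taken from the HIF specification. The kernel side is modeled as an ordinary function; on real hardware it would be reached through a trap that switches the processor to supervisor mode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical service numbers, for illustration only. */
enum { SVC_GETPSIZE = 1, SVC_CLOSE = 2 };

/* Hypothetical result of a service request. */
typedef struct {
    int32_t value;   /* service result */
    int32_t error;   /* 0 on success, otherwise an error code */
} SvcResult;

/* Stand-in for the kernel: selects a service by number. */
static SvcResult kernel_service(int32_t service, int32_t arg)
{
    SvcResult r = { 0, 0 };
    switch (service) {
    case SVC_GETPSIZE: r.value = 8192; break;          /* made-up size */
    case SVC_CLOSE:    if (arg < 0) r.error = 9; break; /* made-up code */
    default:           r.error = 1; break;              /* unknown service */
    }
    return r;
}

int32_t glue_errno;  /* last error seen by the glue layer */

/* Glue routines: give the caller ordinary C semantics
 * (a return value, with -1 plus an error variable on failure). */
int32_t glue_getpsize(void)
{
    return kernel_service(SVC_GETPSIZE, 0).value;
}

int32_t glue_close(int32_t fd)
{
    SvcResult r = kernel_service(SVC_CLOSE, fd);
    if (r.error != 0) {
        glue_errno = r.error;
        return -1;
    }
    return 0;
}
```

The design point is that only the glue layer knows how a service is requested; library code and user programs above it see ordinary function calls, which is what lets the same program run on different HIF implementations.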

IMPLEMENTATION TYPES
Implementations of the HIF specification take two fundamental forms: self-hosted and embedded. Examples
of each of these are provided in the Standalone Execution Board (STEB) manufactured by STEP Engineering
and AMD's PC Execution Board (PCEB29K).
The STEB is a single-board computer that incorporates
an Am29000 processor. an optional Am29027 arithmetic accelerator. program and data memory, serial ports,
and timer-counter resources. The HIF implementation
for this board consists of a resident monitor program that
is downloaded into low-memory locations. and which
implements the kernel services described in the "HIF
Service Routine" section of this document. This is a selfhosted implementation.
In contrast to the STEB. the PCEB29K is an add-in
board for IBM PC-compatible computers that incorporates an Am29000 processor. program and data memory. serial ports. and timer-counter resources. The HIF
implementation forthis board consists of two portions of
code. One performs some of the kernel services on the
board and the other performs some of the kernel services through the auspices of the DOS operating system.
In the sense that the HIF is grafted onto the existing host
operating system. it is called an embedded implementation. The architectural and instruction simulators are
also embedded implementations because they share
the HIF implementation between custom code and
existing host-computer operating-system code.
There is no preference for either type of implementation
as long as the services and features of the H IF specification are fully implemented in the target environment.
With the standard interfaces that a HIF implementation
presents, application programs written for one environment will run equally well in another.
HIF SERVICES PREVIEW
Table 1 lists the services defined by the HIF interface.
Most are similar or identical to equivalent UNIX operating system calls. The titles given in column one are not
the names that actually exist in a particular library but,
instead, are the generic names of the services, for the
purpose of this overview.


29K Family Application Notes
Table 1. HIF Services

Name        Description
clock       Returns the elapsed processor time, in milliseconds
close       Closes a file
cycles      Returns processor cycle counts
exit        Terminates a program
getargs     Returns an argument address
getenv      Gets the environment
getpsize    Returns the memory page size
lseek       Sets a file position
open        Opens a file
read        Reads a buffer of data from a file
remove      Removes (deletes) a file
rename      Renames a file
sysalloc    Allocates memory space
sysfree     Frees allocated memory space
setvec      Sets user trap addresses
time        Returns number of seconds since Jan. 1, 1970
tmpname     Returns a temporary file name
write       Writes a buffer of data to a file

INTENDED AUDIENCE
This document is intended for systems designers and
programmers who have a working knowledge of the
Am29000 and its supporting peripheral hardware. It
does not cover CPU design, the Am29000 instruction
set, or any other hardware detail. Those topics are
adequately covered in the reference documents listed
below.


ABOUT THIS DOCUMENT
The contents of each section and appendix of this
document are described below:

Section 1:   Introduction - discusses the important concepts underlying
             the host interface definition and previews the services that
             form the basis of the HIF specification.
Section 2:   System Call Mechanism - describes the mechanism used to make
             calls on the HIF services, and includes information on
             register usage for passing parameters and receiving results.
Section 3:   Service Routine Descriptions - describes each of the services
             defined in HIF and shows details of the code sequences,
             including examples, for invoking the services.
Section 4:   Process Environment - describes the standard memory
             allocation and register initializations performed by the
             HIF-conforming kernel prior to execution of a user program.
Appendix A:  HIF Quick Reference - lists all of the services and service
             parameters used in this document, in quick reference form.
Appendix B:  Error Messages - lists the error codes that HIF-conforming
             services may return.

REFERENCE DOCUMENTS
The user should have access to the following AMD
documents:

• Am29000 Streamlined Instruction Processor User's Manual, order #10620
• ADAPT29K User's Manual
• MON29K User's Manual
• MON29K Installation and Customization Manual
• Am29000 Execution Board and Monitor User's Manual
• ASM29K Utilities Manual from the ASM29K documentation set
• HighC29K Reference Manual from the HighC29K documentation set

Host Interface v1.0 Specification
DOCUMENTATION CONVENTIONS
This specification assumes some familiarity with the
UNIX operating system and the C language. In the following sections, the conventions presented in the subsections below are assumed.
Numeric Values
All numeric values are presumed to be expressed in
decimal notation, unless otherwise stated. Hexadecimal
values are prefaced by the characters "0x." Any value
not prefaced by "0x" is defined to be a decimal number.
For example:

100092       Decimal number
0x100092     Hexadecimal number

The first number, above, is a decimal value by implication, because it has not been prefaced by "0x." The
second constant includes the explicit "0x" prefix, designating it as a hexadecimal value.
Character Strings
In the documentation, frequent mention is made of character strings that hold file names, path names, and environment variable names. In all cases, the HIF
Specification requires that strings be constructed as a
sequence of ASCII characters terminated by a NULL
byte (an 8-bit character composed of all zero bits). This
is the form in which strings are represented in the C
language. Thus, the space reserved for a string must be
one byte longer than the length of the string, to accommodate the NULL byte.
Languages such as Pascal, which require "counted"
strings (that is, a single 8-bit byte in the first character of
the string that specifies the number of bytes that follow),
are required to convert these to NULL-terminated form
before calling the HIF kernel services. In addition,
languages other than C may need to convert strings
passed back from the HIF kernel services to a compatible internal form. All returned strings are in NULL-terminated form.
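
The conversion from a counted string to the NULL-terminated form can be sketched in C as follows. This is only an illustration of the rule stated above: the function name counted_to_cstring and its return convention are hypothetical and are not part of the HIF specification.

```c
#include <stddef.h>
#include <string.h>

/* Convert a counted ("Pascal-style") string, whose first byte holds the
 * number of characters that follow, into a NULL-terminated string.
 * The function name and return convention are illustrative assumptions.
 * Returns 0 on success, or -1 if dst is too small: the destination must be
 * one byte longer than the string to hold the terminating NULL byte. */
int counted_to_cstring(const unsigned char *counted, char *dst, size_t dst_size)
{
    size_t len = counted[0];          /* length byte precedes the text */

    if (dst_size < len + 1)
        return -1;                    /* no room for text plus NULL byte */

    memcpy(dst, counted + 1, len);    /* copy the characters themselves */
    dst[len] = '\0';                  /* append the NULL terminator */
    return 0;
}
```

A caller would reserve at least one byte more than the counted length, perform the conversion, and then pass the address of the converted buffer to the HIF service.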

SYSTEM CALL MECHANISM
System calls on Am29000-based systems are accomplished through
invocation of a specific software trap. The Am29000 traps are roughly
equivalent to software interrupts on other CPUs. System call traps are
invoked through execution of an appropriate assert instruction whose
assertion is FALSE at the time the instruction is executed.
Execution of an ASEQ, ASGE, ASGEU, ASGT, ASGTU, ASLE, ASLEU, ASLT,
ASLTU, or ASNEQ instruction, where the result of the assertion is FALSE,
will cause the trap specified in the instruction to be taken.
Once the trap is invoked, the Am29000 accesses a trap vector containing
up to 256 separate trap handler addresses; or it may directly invoke a
trap handler routine, depending on the implementation of the operating
system trapping mechanism and the state of the Vector Fetch (VF) bit in
the processor's Configuration Register. In most implementations, a table
of vectors is used. However, the operating system software may implement
direct trap execution for the increased efficiency it offers: it bypasses
the need for vector table lookup, although it requires the reservation of
a much greater amount of system memory.
When a trap is taken, the normal program execution sequence is
interrupted and the trap handler is invoked. At this point, the current
program's context is contained in Am29000 CPU registers. No saving or
restoring of registers is performed by the processor when a trap occurs.
HIF services are required to preserve the following registers and restore
their contents before returning to the application program:
• All local registers
• Global registers gr1, gr112, gr115, and gr125
• Global registers gr126 and gr127 should be preserved according to AMD
  calling conventions. Their values may differ upon return from a HIF
  service, but the span between their values will remain the same.
The HIF services may modify the contents of certain registers without
first saving their values, namely gr121, gr96, and gr97. However, the
application program should not count on gr96 through gr111 being left
untouched by current and future HIF kernel services.
HIF SERVICE INVOCATION
Before invoking a HIF service, the service number and any input
parameters to be passed must be loaded into Am29000 general registers.
Both local and global registers are used for various HIF services, as
shown in the HIF Quick Reference table in Appendix A of this document.
Details for invoking specific services are contained in the Service
Routine Descriptions section.
Service Number
Every HIF system service is identified by a unique number. Service
numbers 0-127 and 256-383 are reserved for use by AMD and should not be
used for user-supplied extensions.

The service number must be loaded into global register gr121, the
trap-handler argument register. Gr121 is a temporary register and its
value is not preserved over a system call, nor will its value be
preserved over any trap invoked by the running program.
Input Parameters
Any input parameters to be passed must be placed in local registers lr2
through lr17. Input parameters are passed to HIF services using the
parameter passing mechanism specified in the Am29000 calling conventions
documentation (Am29000 Streamlined Instruction Processor User's Manual,
order #10620).
Invoking a HIF Service
The HIF services are accessed by forcing trap 69 to occur, after the
service number and parameters (if any) are loaded in the designated
registers. Trap handler 69 executes the service in supervisor mode.
Returned Values
Most services return values, usually a single integer value (number of
bytes read or written, number of clock ticks, size of a memory block,
etc.), or a pointer (address of a file descriptor, address of a memory
block, etc.). These values are returned in register gr96, per standard
high-level language calling conventions.
If a service returns multiple values, the additional values are returned
in gr97, gr98, and so forth. If the service fails to perform the
requested task, the values contained in gr96 and succeeding registers are
not guaranteed to be valid.
See the documentation that accompanies your language processor for
additional details on Am29000 high-level language calling conventions.
Status Reporting
In all cases, upon return from a HIF service, global register gr121
contains either a TRUE value (0x80000000), or a positive non-zero integer
error code indicating the reason for failure. Pre-defined error codes are
listed in Appendix B of this document for existing HIF implementations.
HIF does not specify these error codes. They may be completely defined by
an implementation, except for cases in which there is a corresponding,
existing, UNIX error code. In these cases, the UNIX error code is
expected to be used.
Example Assembly Code
The following code fragment shows how the definitions are implemented in
Am29000 assembly language to invoke the open HIF service to open a file:

        const   lr2,input_file     ; set input file
        consth  lr2,input_file     ; pathname address
        const   lr3,O_RDONLY       ; set open mode
        const   gr121,17           ; service number = 17 (open)
        asneq   69,gr1,gr1         ; force trap 69 (system call)

In this example, local register lr2 is loaded with the address of the
filename constant; local register lr3 contains the code O_RDONLY,
indicating that the file is to be opened for read-only access. The
service number (17) is loaded into global register gr121 and the service
is executed by asserting that register gr1 is not equal to itself. Since
this is FALSE, the trap is invoked.

USER-MODE TRAPS
When a trap is invoked, the Am29000 switches from user mode to supervisor
mode to execute the trap handler code. Most traps are properly executed
in this mode, including the kernel services that implement the HIF
specification. However, a few traps, such as the spill/fill handlers, are
intended to execute in user mode. In these cases, the trap handler code
is not part of the kernel, but is supplied by the particular high-level
language product library and is linked with the user's application
program.
In order to use a consistent trap handling mechanism, and to support the
individual language products' methodologies for user-mode traps, a HIF
service called setvec is called with the address of the user-mode trap
handler code for each of the traps handled in this way.
Once the user-mode handler addresses have been supplied, and the
corresponding trap is invoked, the operating-system kernel receives
control in supervisor mode. It then reinstates user mode and invokes the
appropriate language library trap handler to complete the required
operation. This bouncing from user mode to supervisor mode and back to
user mode is referred to as a "trampoline" effect. When the trap
handler's execution is complete, it returns directly to the user's
application program, rather than back through the kernel.
The register stack spill/fill handlers are appropriate examples of code
that is intended to execute in user mode. When a user's application
program calls a function that requires a large number of local registers
to execute, some currently unused registers may have to be written to
main memory to free enough of the on-chip registers. In this case, the
registers are spilled to memory via the spill-trap handler. When the
function completes execution and intends to return to its caller, the
spilled registers may have to be restored by calling the fill-trap
handler. Since register stack management is unique for each application
environment, individual spill/fill handlers are provided with each of the
high-level language products.

HIF SERVICE ROUTINES
The HIF service routine calls currently defined are listed
by decimal service number in Table 2 below and
described in detail in the following pages.
Service numbers 0 through 127 and 256 through 383
are reserved by AMD and should not be used for user-supplied extensions. Table 3 describes the parameter
names used in the service descriptions.
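
An implementer adding kernel extensions may want to check a proposed service number against the reserved ranges. The following C fragment is a minimal sketch of that check; the function name hif_service_is_reserved is hypothetical and does not appear in the HIF specification.

```c
#include <stdbool.h>

/* Returns true if a service number lies in a range the specification
 * reserves for AMD: 0 through 127, or 256 through 383.
 * The function name is an illustrative assumption. */
bool hif_service_is_reserved(unsigned int service)
{
    return service <= 127u || (service >= 256u && service <= 383u);
}
```

Under this check, every service number listed in Table 2 falls in a reserved range, while numbers such as 128 or 384 remain available for user-supplied extensions.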

Table 2. HIF Service Calls

Number   Title      Description
1        exit       Terminate a program
17       open       Open a file
18       close      Close a file
19       read       Read a buffer of data from a file
20       write      Write a buffer of data to a file
21       lseek      Seek file byte
22       remove     Remove a file
23       rename     Rename a file
33       tmpnam     Return a temporary name
49       time       Return seconds
65       getenv     Get environment
257      sysalloc   Allocate memory space
258      sysfree    Free memory space
259      getpsize   Return memory page size
260      getargs    Return base address
273      clock      Return milliseconds
274      cycles     Return processor cycles
289      setvec     Set user trap address
Table 3. Service Call Parameters

Parameter   Description
addrptr     A pointer to an allocated memory area, command-line-argument
            array, pathname buffer, or NULL-terminated environment
            variable name string.
baseaddr    The base address of command-line-argument vector.
buffptr     A pointer to buffer area which data is to be read from or
            written to during the execution of I/O services.
count       The number of bytes actually read from a file or written to a
            file.
cycles      The number of processor cycles returned.
errcode     The error code returned by the service, usually the same as
            the codes returned in the UNIX variable errno. See Appendix B,
            Table 8, starting at page 35, for a list of HIF error codes.
exitcode    The exit code of the application program.
filename    A pointer to a NULL-terminated ASCII string containing the
            directory path of a temporary filename.
fileno      The file descriptor, a small integer number. Descriptors 0, 1,
            and 2 are guaranteed to exist and correspond to open files on
            program entry (0 is UNIX equivalent of stdin and is opened for
            input, 1 is UNIX stdout and is opened for output, 2 is UNIX
            stderr and is opened for output). The fileno is returned when
            an open call is successful.
funaddr     A pointer to the address of a service.
mode        A series of option flags whose values represent the operation
            to be performed.
msecs       Milliseconds.
name        A pointer to a NULL-terminated ASCII string that contains an
            environment variable name.
nbytes      The number of data bytes requested to be read from or written
            to a file, or number of bytes to allocate from the heap.
newfile     A pointer to a NULL-terminated ASCII string that contains the
            directory path of a new filename.
offset      The number of bytes from a specified position (orig) in a file.
oldfile     A pointer to NULL-terminated ASCII string that contains the
            directory path of the old filename.
orig        A value of 0, 1, or 2 that refers to the beginning, current
            position, or the position of the end of a file.
pagesize    The memory page size in bytes returned.
pathname    A pointer to a NULL-terminated ASCII string that contains the
            directory path of a filename.
pflag       The UNIX file access permission codes.
retval      The return value that indicates success or failure.
secs        The seconds count returned.
trapno      The trap number.
where       The current position in a specified file.

Each service description on the pages that follow
contains a concise explanation of the purpose of the
service, the input and result register contents, and
example assembly-language code to invoke the service. In all cases, operating system kernel services that
meet the HIF specifications are invoked by forcing the
software trap 69 to occur. The service number is always
contained in general register gr121 and parameters
are passed, if necessary, in local registers, beginning
with lr2.

HIF implementations are required to return an error
code when a requested operation is not possible. The
codes from 0 to 255 are reserved for compatibility with
current and future error return standards. The currently
assigned codes and their meanings are listed in Appendix B, Table 8, starting on page 35. If a HIF implementation returns an error code in the range of 0 to 255, it
must carry the identical meaning to the corresponding
error code in this table. Error code values larger than
255 are available for implementation-specific errors.

When the service returns, general register gr121 is
required to report the success or failure of the service. If
successful, gr121 is expected to contain a TRUE
boolean value (a 1 bit in the most significant bit position).
If the service is not successful, a positive non-zero error
code is returned in gr121. If the service returns results,
the first result is held in gr96, the second in gr97, and so
forth.
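
A glue routine written in C might interpret the value left in gr121 along the following lines. This is a sketch only: it assumes the glue code has already copied gr121 into a 32-bit variable, and the enumeration and function names are hypothetical. It tests the most significant bit for TRUE, and separates the error codes up to 255 that are reserved by this specification from the larger implementation-specific values.

```c
#include <stdint.h>

/* Possible interpretations of the gr121 value after a HIF call.
 * These names are illustrative assumptions, not part of the HIF. */
enum hif_status {
    HIF_OK,             /* gr121 held a TRUE value (MSB set)        */
    HIF_ERR_STANDARD,   /* error code in the reserved range, <= 255 */
    HIF_ERR_IMPL        /* implementation-specific error, > 255     */
};

#define HIF_TRUE_BIT 0x80000000u   /* a 1 in the most significant bit */

/* Classify the status value that a HIF service returned in gr121. */
enum hif_status hif_classify_status(uint32_t gr121)
{
    if (gr121 & HIF_TRUE_BIT)
        return HIF_OK;
    return (gr121 <= 255u) ? HIF_ERR_STANDARD : HIF_ERR_IMPL;
}
```

Testing only the most significant bit mirrors what the JMPF instruction does in the assembly-language examples, which branch to an error handler when gr121 is not TRUE.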

In the examples, references are made to error handlers
that are not part of the example code. These are
assumed to be contained in the larger part of the user's
code and are not supplied as part of the HIF specification. The JMPF instructions have been provided to show
that interface glue routines should incorporate this error
testing philosophy in order to be robust. In practice, error
handling may be relegated to a single routine, or may be
vested in individual sections of in-line code, or in
services callable by the glue routines.
Since HIF implementations may exist over a wide spectrum of systems, the capabilities of the HIF may vary
from one system to the next. In the simplest case, the
HIF implementation in an embedded Am29000 system,
such as a printer controller, may contain no external file
system. In this event, the input/output facilities specified
in the kernel service descriptions need not be implemented. In more common cases, where the HIF will exist on systems that have full operating system
capabilities, such as DOS or UNIX, it is assumed that all
of the features of the HIF will be implemented. The service descriptions in this document provide a set of standard interfaces for commonly implemented operating
system interfaces. If individual features are implemented, the interfaces are expected to follow the guidelines in this specification.
Descriptions of the individual services follow on the
remaining pages of this section. They are listed in
numeric sequence by service number. Appendix A, HIF
Quick Reference, allows easy location of a service by its
number.


Terminate a Program

Service 1-exit
Description
This service terminates the current program and returns
a value to the system kernel, indicating the reason for
termination. By convention, a zero passed in lr2
indicates normal termination, while any non-zero value
indicates an abnormal termination condition. There are
no returned values in registers gr96 and gr121 since this
service does not return.

Register Usage

Type       Regs    Contents     Description
Calling:   gr121   1 (0x1)      Service number
           lr2     exitcode     User-supplied exit code
Returns:   gr96    undefined    This service call does not return
           gr121   undefined    This service call does not return

Example Call

        const   lr2,1              ; exit code = 1
        const   gr121,1            ; service = 1
        asneq   69,gr1,gr1         ; call the operating system

In the above example, the operating system kernel is
being called with service code 1 and an exit code of 1,
which is interpreted according to the specifications of
the individual operating system. The value of the exit
code is not defined as part of the HIF specification.
In general, however, an exit code of zero (0) specifies a
normal program termination condition, while a non-zero
code specifies an abnormal termination resulting from
detection of an error condition within the program.
Programs can terminate normally by falling through the
curly brace at the end of the main function in a
C-language program. Other languages may require an
explicit call to the kernel's exit service.


Open a File

Service 17-open
Description
This service opens a named file in a requested mode.
Files must be explicitly opened before any read, write,
close, or other file positioning accesses can be accomplished. The open service, if successful, returns an
integer token that is used to refer to the file in all subsequent service requests. In many high-level languages,
the returned token is referred to as a "file descriptor."

Register Usage

Type       Regs    Contents      Description
Calling:   gr121   17 (0x11)     Service number
           lr2     pathname      A pointer to a filename
           lr3     mode          See parameter descriptions below
           lr4     pflag         See parameter descriptions below
Returns:   gr96    fileno        Success: >= 0 (file descriptor)
                                 Failure: < 0
           gr121   0x80000000    Logical TRUE, service successful
                   errcode       Error number, service not successful
                                 (implementation dependent)

Parameter Descriptions
Pathname is a pointer to a zero-terminated string that
contains the full path and name of the file being
opened.* Individual operating systems have different
means to specify this information. With hierarchical file
systems, individual directory levels are separated with
special characters that cannot be part of a valid filename or directory name. In UNIX-compatible file
systems, directory names are separated by forward
slash characters "/" (e.g., "/usr/jack/files/myfile"), where
"usr," "jack," and "files" are succeedingly lower directory
levels, beginning at the root directory of the file system.
The name "myfile" is the filename to be opened at the
specified level. The individual characteristics of files and
pathnames are determined by the specifications of a
particular operating system implementation.
Mode is composed of a set of flags, whose mnemonics
and associated values are listed in Table 4.

Table 4. Open Service Parameters

Name        Value     Description
O_RDONLY    0x0000    Open for read only access
O_WRONLY    0x0001    Open for write only access
O_RDWR      0x0002    Open for read and write access
O_APPEND    0x0008    Always append when writing
O_NDELAY    0x0010    No delay
O_CREAT     0x0200    Create file if it does not exist
O_TRUNC     0x0400    If the file exists, truncate it to zero length
O_EXCL      0x0800    Fail if writing and the file exists
O_FORM      0x4000    Open in text format

The O_RDONLY mode provides the means to open a
file and guarantee that subsequent accesses to that file
will be limited to read operations. The operating system
implementation will determine how errors are reported
for unauthorized operations. The file pointer is
positioned at the beginning of the file, unless the
O_APPEND mode is also selected.

* The HIF specification intentionally refrains from defining the constituents of a legal path name, or any intrinsic characteristics of the implemented
file system. In this regard, the only requirement of a HIF-conforming kernel is that when the open service is successfully performed, the
routine returns a small integer value that can be used in subsequent input/output service calls to refer to the opened file.

The O_WRONLY mode provides the means to open a
file and guarantee that subsequent accesses to that file
will be limited to write operations. The operating system
implementation will determine how errors are reported
for unauthorized operations. The file pointer is
positioned at the beginning of the file, unless the
O_APPEND mode is also selected.
The O_RDWR mode provides the means to open a file
for subsequent read and write accesses. The file
pointer is positioned at the beginning of the file, unless
the O_APPEND mode is also selected.
If O_APPEND mode is selected, the file pointer is
positioned to the end of the file at the conclusion of a
successful open operation, so that data written to the
file is added following the existing file contents.
Ordinarily, a file must already exist in order to be
opened. If the O_CREAT mode is selected, files that do
not currently exist are created; otherwise, the open
function will return an error condition in gr121.
If a file being opened already exists and the O_TRUNC
mode is selected, the original contents of the file are discarded and the file pointer is placed at the beginning of
the (empty) file. If the file does not already exist, the HIF
service routine should return an error value in gr121,
unless O_CREAT mode is also selected.
The O_EXCL mode provides a method for refusing to
open the file if the O_WRONLY or O_RDWR modes are
selected and the file already exists. In this case, the
kernel service routine should return an error code in
gr121.
O_FORM mode indicates that the file is to be opened as
a text file, rather than a binary file. The nominal standard
input, output, and error files (file descriptors 0, 1, and 2)
are assumed to be open in text mode prior to commencing execution of the user's program.
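
The mode flags of Table 4 are combined with a bitwise OR. The C fragment below sketches this using the values from the table. The HIF_ prefix, the helper functions, and the treatment of the low two bits as the access-type field are assumptions made for this illustration; the specification itself only lists the flag values.

```c
#include <stdbool.h>

/* Mode flag values copied from Table 4. The HIF_ prefix is an
 * illustrative naming choice, used here to avoid clashing with the
 * host system's own O_* definitions. */
enum {
    HIF_O_RDONLY = 0x0000,
    HIF_O_WRONLY = 0x0001,
    HIF_O_RDWR   = 0x0002,
    HIF_O_APPEND = 0x0008,
    HIF_O_NDELAY = 0x0010,
    HIF_O_CREAT  = 0x0200,
    HIF_O_TRUNC  = 0x0400,
    HIF_O_EXCL   = 0x0800,
    HIF_O_FORM   = 0x4000
};

/* Extract the access type. O_RDONLY is zero, so it cannot be detected
 * with a bitwise AND against the flag; instead the low two bits are
 * compared as a field (an assumption based on the values 0, 1, and 2). */
int hif_access_mode(int mode)
{
    return mode & 0x3;
}

/* Test whether a non-zero option flag is present in a mode word. */
bool hif_mode_has(int mode, int flag)
{
    return (mode & flag) != 0;
}
```

For instance, read/write access to a text file that is created if absent is HIF_O_RDWR | HIF_O_CREAT | HIF_O_FORM, which evaluates to 0x4202.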


When opening a FIFO (interprocess communication
file) with O_RDONLY or O_WRONLY set, the following
conditions apply:
• If O_NDELAY is set (i.e., equal to 0x0010):
- A read-only open will return without delay.
- A write-only open will return an error if no process
currently has the file open for reading.
• If O_NDELAY is clear (i.e., equal to 0x0000):
- A read-only open will block until a process opens a
file for writing.
- A write-only open will block until a process opens a
file for reading.
When opening a file associated with a communication
line (e.g., a remote modem or terminal connection), the
following conditions apply:
• If O_NDELAY is set, the open will return without
waiting for the carrier detect condition to be TRUE.
• If O_NDELAY is clear, the open will block until the
carrier is found to be present.
The optional pflag parameter specifies the access
permissions associated with a file; it is only required
when O_CREAT is also specified (i.e., create a new file
if it does not already exist). If the file already exists, pflag
is ignored. This parameter specifies UNIX-style file
access permission codes (r, w, and x for read, write, and
execute, respectively) for the file's owner, the work
group, and other users. If the parameter is missing, pflag
will be set to -1 (all accesses allowed). See the UNIX
operating system documentation for additional
information on this topic.
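
To make the permission codes concrete, the following C sketch renders the nine low-order bits of a pflag value in the familiar rwxrwxrwx notation. It assumes the conventional UNIX bit layout (owner bits highest, then group, then other); the function name hif_pflag_to_string is hypothetical and not part of the HIF.

```c
/* Render the nine low-order permission bits of pflag as text in the
 * order owner, group, other, for example "rw-r--r--".
 * Assumes the conventional UNIX layout, in which 0x100 is owner-read
 * and 0x001 is other-execute. out must hold at least 10 bytes.
 * The function name is an illustrative assumption. */
void hif_pflag_to_string(unsigned int pflag, char out[10])
{
    static const char letters[] = "rwxrwxrwx";
    int i;

    for (i = 0; i < 9; i++)
        out[i] = (pflag & (0x100u >> i)) ? letters[i] : '-';
    out[9] = '\0';
}
```

Under this layout a pflag of 0x180 gives the owner read and write access and denies everything else, while a missing parameter (-1) has every bit set and so allows all accesses.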


Example Call

path:   .ascii  "/usr/jack/files/myfile\0"
        .set    mode,O_RDWR | O_CREAT | O_FORM
        .set    permit,0x180
fd:     .word   0

        const   lr2,path           ; address of pathname
        consth  lr2,path
        const   lr3,mode           ; open mode settings
        const   lr4,permit         ; permissions
        const   gr121,17           ; service = 17 (open)
        asneq   69,gr1,gr1         ; perform OS call
        jmpf    gr121,open_err     ; jump if error on open
        const   gr120,fd           ; set address of
        consth  gr120,fd           ; file descriptor
        store   0,0,gr96,gr120     ; store file descriptor

In the above example, the file is being opened in read/write
text mode. The UNIX permissions of the owner are
set to allow reading and writing, but not execution, and
all other permissions are denied. As indicated above in
the parameter descriptions, the file permissions are only
used if the file does not already exist. When the open
service returns, the program jumps to the open_err
error handler if the open was not successful; otherwise,
the file descriptor returned by the service is stored for
future use in read, write, lseek, remove, rename, or
close service calls.
As described in the introduction to these services, the
HIF can be implemented to several degrees of elaboration, depending on the underlying system hardware,
and whether the operating system is able to provide the
full set of kernel services. In the least capable instance
(i.e., a standalone board with a serial port), it is likely that
only the O_RDONLY, O_WRONLY, and O_RDWR
modes will be supported. In more capable systems, the
additional modes should be implemented, if possible.


Service 18-close

Close a File

Description
This service closes the open file associated with the file
descriptor passed in lr2. Closing all files is automatic on
program exit (see exit), but since there is an implementation-defined limit on the number of open files per process, an explicit close service call is necessary for
programs that deal with many files.

Register Usage

Type       Regs    Contents      Description
Calling:   gr121   18 (0x12)     Service number
           lr2     fileno        File descriptor
Returns:   gr96    retval        Success: = 0
                                 Failure: < 0
           gr121   0x80000000    Logical TRUE, service successful
                   errcode       Error number, service not successful
                                 (implementation dependent)

Example Call

fd:     .word   0

        const   gr96,fd            ; set address of
        consth  gr96,fd            ; file descriptor
        load    0,0,lr2,gr96       ; get file descriptor
        const   gr121,18           ; service = 18
        asneq   69,gr1,gr1         ; and call the OS
        jmpf    gr121,clos_err     ; handle close error
        nop

The above example illustrates loading a previously
stored file descriptor (fd, in this case) and calling the
kernel's close service to close the file associated with
that descriptor. If an error occurs when attempting to
close the file, the kernel will return an error code in gr121
(the content of that register will not be TRUE) and the
program will jump to an error handler; otherwise,
program execution will continue.


Read a Buffer of Data from a File

Service 19-read
Description
This service reads a number of bytes from a previously
opened file (identified by a small integer file descriptor in
lr2 that was returned by the open service) into memory
starting at the address given by the buffer pointer in lr3.
Lr4 contains the number of bytes to be read. The number of bytes actually read is returned in gr96. Zero is
returned in gr96 if the file is already positioned at its end-of-file. If an error is detected, a small positive integer is
returned in gr121, indicating the nature of the error.

Register Usage

Type       Regs    Contents      Description
Calling:   gr121   19 (0x13)     Service number
           lr2     fileno        File descriptor
           lr3     buffptr       A pointer to buffer area
           lr4     nbytes        Number of bytes to be read
Returns:   gr96    count         Success: > 0 (number of bytes actually read)
                                 EOF:     = 0
                                 Failure: < 0
           gr121   0x80000000    Logical TRUE, service successful
                   errcode       Error number, service not successful
                                 (implementation dependent)

Example Call

fd:     .word   0
        .set    BUFSIZE,256
buf:    .block  BUFSIZE
num:    .word   0

        const   gr96,fd            ; set address of
        consth  gr96,fd            ; file descriptor
        load    0,0,lr2,gr96       ; get file descriptor
        const   lr3,buf            ; set buffer address
        consth  lr3,buf
        const   lr4,BUFSIZE        ; specify buffer size
        const   gr121,19           ; service = 19
        asneq   69,gr1,gr1         ; call the OS
        jmpf    gr121,rd_err       ; handle read errors
        const   gr120,num          ; set address of
        consth  gr120,num          ; 'num' argument
        store   0,0,gr96,gr120     ; store bytes read

The above example requests the HIF to return BUFSIZE
bytes from the file descriptor contained in the variable fd.
If the call is successful, gr121 will contain a TRUE value
and gr96 will contain the number of bytes actually read.
If the service fails, gr121 will contain the error code.


Write a Buffer of Data to a File

Service 20-write
Description
This service writes a number of bytes from memory
(starting at the address given by the pointer in lr3) into
the file specified by the small positive integer file
descriptor that was returned by the open service when
the file was opened for writing. Lr4 contains the number
of bytes to be written. The number of bytes actually
written is returned in gr96. If an error is detected, gr121
will contain a small positive integer on return from the
service, indicating the nature of the error.

Register Usage
Type       Regs    Contents      Description
Calling:   gr121   20 (0x14)     Service number
           lr2     fileno        File descriptor
           lr3     buffptr       A pointer to the buffer area
           lr4     nbytes        Number of bytes to be written
Returns:   gr96    count         Success: = lr4
                                 Failure: 0 ≤ gr96 < lr4
                                 Extreme: < 0
           gr121   0x80000000    Logical TRUE, service successful
                   errcode       Error number, service not successful
                                 (implementation dependent)

Example Call
fd:     .word   0
        .set    BUFSIZE,256
buf:    .block  BUFSIZE
num:    .word   0

        const   gr96,fd          ; set address of
        consth  gr96,fd          ; file descriptor
        load    0,0,lr2,gr96     ; get file descriptor
        const   lr3,buf          ; set buffer address
        consth  lr3,buf
        const   lr4,BUFSIZE      ; specify buffer size
        const   gr121,20         ; service = 20
        asneq   69,gr1,gr1       ; call the OS
        jmpf    gr121,wr_err     ; handle write errors
        const   gr120,num        ; set address of
        consth  gr120,num        ; "num" variable
        store   0,0,gr96,gr120   ; store bytes written

The example above writes BUFSIZE bytes from the buffer located at buf to the
file associated with the descriptor stored in fd. If errors are detected
during execution of the service, the value returned in gr121 will be FALSE.
In this case, the wr_err error handler will be invoked. The number of bytes
actually written is stored in the variable num.

Host Interface v1.0 Specification

Seek a File Byte

Service 21-lseek
Description
This service positions the file associated with the file descriptor in lr2,
"offset" number of bytes from the position of the file referred to by the
orig parameter. Lr3 contains the number of bytes offset and lr4 contains the
value for orig. The parameter orig is defined as:

0 = Beginning of the file
1 = Current position of the file
2 = End of the file

The lseek service can be used to reposition the file pointer anywhere in a
file. The offset parameter may either be positive or negative. However, it is
considered an error to attempt to seek in front of the beginning of the file.
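The effect of the orig and offset parameters can be illustrated with a small C model. It is a sketch under stated assumptions: the function seek_target and the ORIG_* names are hypothetical, the current position and file size are passed in explicitly, and -1 stands in for whatever error indication a real implementation returns.

```c
#include <stdint.h>

/* Values of the orig parameter as defined in the text. */
enum { ORIG_BEGIN = 0, ORIG_CURRENT = 1, ORIG_END = 2 };

/* Hypothetical model of lseek positioning: returns the new byte position
 * measured from the beginning of the file, or -1 when orig is not one of
 * the three defined codes or the target lies before the beginning. */
int64_t seek_target(int orig, int64_t offset, int64_t current, int64_t size)
{
    int64_t base;

    switch (orig) {
    case ORIG_BEGIN:   base = 0;       break;
    case ORIG_CURRENT: base = current; break;
    case ORIG_END:     base = size;    break;
    default:           return -1;      /* unknown orig code */
    }

    int64_t target = base + offset;    /* offset may be negative */
    return target < 0 ? -1 : target;   /* seeking before byte 0 is an error */
}
```

For instance, orig = 0 with offset = 23 yields position 23 regardless of the current position.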

Register Usage
Type       Regs    Contents      Description
Calling:   gr121   21 (0x15)     Service number
           lr2     fileno        File descriptor
           lr3     offset        Number of bytes offset from orig
           lr4     orig          A code number indicating the point within
                                 the file from which the offset is counted
Returns:   gr96    where         Success: ≥ 0 (current position in the file)
                                 Failure: < 0
           gr121   0x80000000    Logical TRUE, service successful
                   errcode       Error number, service not successful
                                 (implementation dependent)

Example Call
fd:     .word   6                ; file descriptor = 6
orig:   .word   0                ; origin = start of file
off:    .word   23               ; offset = 23 bytes

        const   gr96,fd          ; set address of
        consth  gr96,fd          ; file descriptor
        load    0,0,lr2,gr96     ; get file descriptor
        const   gr96,off         ; set address of
        consth  gr96,off         ; offset argument
        load    0,0,lr3,gr96     ; get offset
        const   gr96,orig        ; set address of
        consth  gr96,orig        ; origin argument
        load    0,0,lr4,gr96     ; get origin
        const   gr121,21         ; service = 21
        asneq   69,gr1,gr1       ; call the OS
        jmpf    gr121,seek_err   ; seek error if false
        nop

The above example shows how a file can be positioned to a particular byte
address by specifying the orig, which is the starting point from which the
file position is adjusted, and the offset, which is the number of bytes from
the orig to move the file pointer. In this case, the file identified by file
descriptor 6 is being repositioned to byte 23, measured from the beginning of
the file (orig = 0).

The file descriptor, offset, and orig values are loaded from preset constants
and lseek is called to perform the file positioning operation. If an error
occurs when attempting to reposition the file, the value returned in gr121 is
not TRUE, containing an error code that indicates the reason for the error.
Upon return, gr96 also contains the file position measured from the beginning
of the file. In this case, this value is not stored.


Remove a File

Service 22-remove
Description
This service deletes a file from the file system. Lr2 contains a pointer to
the pathname of the file. The path must point to an existing file, and the
referenced file should not be currently open. The behavior of the remove
service is undefined if the file is open.

Register Usage
Type       Regs    Contents      Description
Calling:   gr121   22 (0x16)     Service number
           lr2     pathname      A pointer to string that contains the
                                 pathname of the file
Returns:   gr96    retval        Success: = 0
                                 Failure: < 0
           gr121   0x80000000    Logical TRUE, service successful
                   errcode       Error number, service not successful
                                 (implementation dependent)

Example Call
path:   .ascii  "/usr/jack/files/myfile\0"

        const   lr2,path         ; set address of file
        consth  lr2,path         ; pathname.
        const   gr121,22         ; service = 22
        asneq   69,gr1,gr1       ; call the OS
        jmpf    gr121,rem_err    ; jump if error
        nop

In the above example, a file with a UNIX-style pathname stored in the string
named path is being removed. The address (pointer) to the string is put into
lr2 and the kernel service 22 is called to remove the file. If the file does
not exist, or if it has not previously been closed, an error code will be
returned in gr121; otherwise, the value in gr121 will be TRUE.


Rename a File

Service 23-rename
Description
This service moves a file to a new location within the file system. Lr2
contains a pointer to the file's old pathname and lr3 contains a pointer to
the file's new pathname. When all components of the old and new pathnames
are the same, except for the filename, the file is said to have been renamed.
The file identified by the old pathname must already exist, or an error code
will be returned and the rename operation will not be performed.

Register Usage
Type       Regs    Contents      Description
Calling:   gr121   23 (0x17)     Service number
           lr2     oldfile       A pointer to string containing the old
                                 pathname of the file
           lr3     newfile       A pointer to string containing the new
                                 pathname of the file
Returns:   gr96    retval        Success: = 0
                                 Failure: < 0
           gr121   0x80000000    Logical TRUE, service successful
                   errcode       Error number, service not successful
                                 (implementation dependent)

Example Call
old:    .ascii  "/usr/fred/payroll/report\0"
new:    .ascii  "/usr/fred/history/june89\0"

        const   lr2,old          ; set address of old pathname
        consth  lr2,old
        const   lr3,new          ; set address of new pathname
        consth  lr3,new
        const   gr121,23         ; service = 23 (rename)
        asneq   69,gr1,gr1       ; call the OS
        jmpf    gr121,ren_err    ; jump if rename error
        nop

The above example moves a file from its old path (renaming it in the process)
to its new pathname location. The file will no longer be found at the old
location.


Return Temporary Name

Service 33-tmpnam
Description
This service generates a string that can be used as a temporary file
pathname. A different name is generated each time it is called. Generally,
the name is guaranteed not to duplicate any existing filename. The argument
passed in lr2 should be a valid pointer to a buffer that is large enough to
contain the constructed file name. HIF implementations are required to
allocate a minimum of 128 bytes for this purpose.

If the argument in lr2 contains a NULL pointer, the HIF service routine
should treat this as an error condition and return a non-zero error number in
global register gr121.

The HIF specification sets no standards for the format or content of legal
pathnames; these are determined by individual operating system requirements.
However, each implementation should undertake to construct a valid filename
that is also unique.
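The behavior described for tmpnam (a different name on every call, and an error when the buffer pointer is NULL) can be sketched in C. Everything specific here is an assumption made for illustration: the function tmpnam_model, the "tmp%06u" name format, the use of error number 22 (EINVAL in Table 8) for a NULL pointer, and the premise that the caller's buffer holds at least 128 bytes. The sketch also does not check the generated name against existing files, which a real implementation would need to do.

```c
#include <stddef.h>
#include <stdio.h>

#define TMPNAM_BUF_MIN 128      /* minimum buffer size named in the text */

unsigned tmpnam_counter;        /* advances on every successful call */

/* Hypothetical model of the tmpnam service. Returns buf on success
 * (mirroring gr96 == lr2) and NULL on failure, with *errcode nonzero. */
char *tmpnam_model(char *buf, unsigned *errcode)
{
    if (buf == NULL) {
        *errcode = 22;          /* assumed: EINVAL for a NULL buffer */
        return NULL;
    }
    /* Assumed name format; the HIF specification leaves this open. */
    snprintf(buf, TMPNAM_BUF_MIN, "tmp%06u", tmpnam_counter++);
    *errcode = 0;
    return buf;
}
```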

Register Usage
Type       Regs    Contents      Description
Calling:   gr121   33 (0x21)     Service number
           lr2     addrptr       A pointer to buffer into which the filename
                                 is to be stored
Returns:   gr96    filename      Success: pointer to the temporary filename
                                 string. This will be the same as lr2 on
                                 entry unless an error occurred
                                 Failure: = 0 (NULL pointer)
           gr121   0x80000000    Logical TRUE, service successful
                   errcode       Error number, service not successful
                                 (implementation dependent)

Example Call
fbuf:   .block  21               ; buffer size = 21 bytes

        const   lr2,fbuf         ; set buffer pointer
        consth  lr2,fbuf
        const   gr121,33         ; service = 33
        asneq   69,gr1,gr1       ; call the OS
        jmpf    gr121,tmp_err    ; jump if error
        nop

In the above example, the tmpnam service is called with a pointer to fbuf,
which has been allocated to hold a name that is up to 21 bytes in length. If
the service is able to construct a valid name, the filename will be stored in
fbuf when the service returns. If the content of gr121 on return is not TRUE,
the program fragment jumps to tmp_err to handle the error condition.


Return Seconds Since 1970

Service 49-time
Description
This service returns, in register gr96, the number of seconds elapsed since
midnight, January 1, 1970, as a 32-bit integer value. It is assumed that the
kernel service will have access to a preloadable counter that measures time
with at least one-second resolution for this purpose.
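On a host whose C library also counts time_t in seconds since January 1, 1970, the value returned by this service can be turned into a calendar date with standard library calls. The sketch below assumes that host convention and assumes the count is in UTC, which the text does not state; the function name secs_to_utc is hypothetical.

```c
#include <stdint.h>
#include <time.h>

/* Convert a 32-bit seconds count (as returned in gr96) into broken-down
 * calendar time. Assumes the host time_t uses the same 1970 epoch and that
 * the count is in UTC. Returns 0 on success, -1 if conversion fails. */
int secs_to_utc(uint32_t secs, struct tm *out)
{
    time_t t = (time_t)secs;
    struct tm *utc = gmtime(&t);   /* standard C; result is in static storage */

    if (utc == NULL)
        return -1;
    *out = *utc;                   /* copy out of the static buffer */
    return 0;
}
```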

Register Usage
Type       Regs    Contents      Description
Calling:   gr121   49 (0x31)     Service number
Returns:   gr96    secs          Success: ≠ 0 (time in seconds)
                                 Failure: = 0
           gr121   0x80000000    Logical TRUE, service successful
                   errcode       Error number, service not successful
                                 (implementation dependent)

Example Call
secs:   .word   0

        const   gr121,49         ; service = 49
        asneq   69,gr1,gr1       ; call the OS
        jmpf    gr121,tim_err    ; jump if error
        const   gr120,secs       ; set the address
        consth  gr120,secs       ; for storing 'secs'
        store   0,0,gr96,gr120   ; store the seconds

In the above example, the kernel service time is being called. If the value
returned in gr121 is TRUE, the number of seconds returned in gr96 is stored
in the secs variable; otherwise, the program jumps to tim_err to determine
the cause of the error.


Get Environment

Service 65-getenv
Description
This service searches the system environment for a string associated with a
specified symbol. Lr2 contains a pointer to the symbol name. If the symbol
name is found, a pointer to the string associated with it is returned in
gr96; otherwise, a NULL pointer is returned.

In UNIX-hosted systems, the setenv command allows a user to associate a
symbol with an arbitrary string. For example, the command

    setenv TERM vt100

defines the string "vt100" to be associated with the symbol named TERM.
Application programs can use this association to determine the type of
terminal connected to the system, and, therefore, use the correct set of
codes when outputting information to the user's screen.

To access the string, getenv should be called with lr2 pointing to a string
containing the TERM symbol name. The address returned in gr96 will point to
the corresponding "vt100" string if TERM is found. In UNIX-hosted systems,
entering a different setenv command lets the user select a different terminal
name without requiring recompilation of the application program.

Operating system implementations that do not include provisions for
environment variables should always return a NULL value in gr96 when this
service is requested.
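The lookup that getenv performs can be modeled as a search over name/string pairs. This is a minimal sketch, not the HIF implementation: the env_entry structure and env_lookup function are hypothetical, and an empty table models an operating system without environment variables, which always yields NULL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical environment entry: a symbol and its associated string. */
struct env_entry {
    const char *name;
    const char *value;
};

/* Returns the string associated with name, or NULL when the symbol is not
 * present (the value a caller would see in gr96). */
const char *env_lookup(const struct env_entry *env, size_t count,
                       const char *name)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(env[i].name, name) == 0)
            return env[i].value;
    }
    return NULL;    /* symbol not found, or no environment at all */
}
```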

Register Usage
Type       Regs    Contents      Description
Calling:   gr121   65 (0x41)     Service number
           lr2     name          A pointer to the symbol name
Returns:   gr96    addrptr       Success: pointer to the symbol name string
                                 Failure: = 0 (NULL pointer)
           gr121   0x80000000    Logical TRUE, service successful
                   errcode       Error number, service not successful
                                 (implementation dependent)

Example Call
mysym:  .ascii  "MYSYMBOL\0"
strptr: .word   0

        const   lr2,mysym        ; set address of symbol
        consth  lr2,mysym        ; to be located in environment
        const   gr121,65         ; service = 65
        asneq   69,gr1,gr1       ; call the OS
        jmpf    gr121,env_err    ; jump if error
        const   gr120,strptr     ; set address of
        consth  gr120,strptr     ; string pointer
        store   0,0,gr96,gr120   ; store string pointer

The above example program calls the operating system getenv service to access
a string associated with the environment variable MYSYMBOL. If the symbol is
found, a pointer to the string associated with the symbol is returned in
gr96. If the call is not successful (i.e., gr121 holds a FALSE boolean value
upon return), the program jumps to env_err to handle the error condition.


Allocate Memory Space

Service 257-sysalloc
Description
This service allocates a specified number of contiguous bytes from the
operating-system-maintained heap and returns a pointer to the base of the
allocated block. Lr2 contains the number of bytes requested. If the storage
is successfully allocated, gr96 contains a pointer to the block; otherwise,
gr121 contains an error code indicating the reason for failure of the call.

Register Usage
Type       Regs    Contents      Description
Calling:   gr121   257 (0x101)   Service number
           lr2     nbytes        Number of bytes requested
Returns:   gr96    addrptr       Success: pointer to allocated bytes
                                 Failure: = 0 (NULL pointer)
           gr121   0x80000000    Logical TRUE, service successful
                   errcode       Error number, service not successful
                                 (implementation dependent)

Example Call
blkptr: .word   0

        const   lr2,1200         ; request 1200 bytes
        const   gr121,257        ; service = 257
        asneq   69,gr1,gr1       ; call the OS
        jmpf    gr121,alloc_err  ; jump if error
        const   gr120,blkptr     ; set address to store
        consth  gr120,blkptr     ; pointer
        store   0,0,gr96,gr120   ; store the pointer

The above example requests a block of 1200 contiguous bytes from the system
heap. If the call is successful, the program stores the pointer returned in
gr96 into a local variable called blkptr. If gr121 contains a boolean FALSE
value when the service returns, the program jumps to alloc_err to handle the
error condition.


Free Memory Space

Service 258-sysfree
Description
This service returns memory to the system starting at the address specified
in lr2. Lr3 contains the number of bytes to be released. The pointer passed
to the sysfree service in lr2 and the byte count passed in lr3 must match the
address returned by a previous sysalloc service request for the identical
number of bytes. No dynamic memory allocation structure is implied by this
service. High-level language library functions such as malloc() and free()
for the C language are required to manage random dynamic memory block
allocation and deallocation, using the sysalloc and sysfree kernel functions
as their basis.
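Because sysfree must be given the same address and byte count as the matching sysalloc request, a library layer has to remember each block's size. One common way to do that, sketched below, is to store the size in a small header in front of the memory handed to the caller. This is an illustration only: sysalloc_model and sysfree_model are hypothetical stand-ins backed by the host C library, and lib_alloc and lib_free are not the actual 29K library routines.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

size_t last_freed_size;   /* records the byte count seen by sysfree_model */

/* Stand-ins for the kernel services (assumed, backed by the host library). */
void *sysalloc_model(size_t nbytes) { return malloc(nbytes); }

int sysfree_model(void *addr, size_t nbytes)
{
    last_freed_size = nbytes;
    free(addr);
    return 0;
}

/* Header kept in front of every block; the union keeps the user area
 * suitably aligned for any object type. */
typedef union {
    size_t total;         /* bytes requested from sysalloc, header included */
    max_align_t align;
} block_header;

void *lib_alloc(size_t n)
{
    if (n > SIZE_MAX - sizeof(block_header))
        return NULL;                       /* request would overflow */
    size_t total = sizeof(block_header) + n;
    block_header *h = sysalloc_model(total);
    if (h == NULL)
        return NULL;
    h->total = total;                      /* remember size for the free */
    return h + 1;                          /* user area follows the header */
}

int lib_free(void *p)
{
    if (p == NULL)
        return 0;
    block_header *h = (block_header *)p - 1;
    return sysfree_model(h, h->total);     /* same address, same byte count */
}
```

A production allocator would normally request larger regions from sysalloc and subdivide them, rather than issuing one kernel call per block.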

Register Usage
Type       Regs    Contents      Description
Calling:   gr121   258 (0x102)   Service number
           lr2     addrptr       Starting address of area returned
           lr3     nbytes        Number of bytes to release
Returns:   gr96    retval        Success: = 0
                                 Failure: < 0
           gr121   0x80000000    Logical TRUE, service successful
                   errcode       Error number, service not successful
                                 (implementation dependent)

Example Call
blkptr: .word   0

        const   gr120,blkptr     ; set address of previously allocated
        consth  gr120,blkptr     ; block pointer
        load    0,0,lr2,gr120    ; fetch pointer to block
        const   lr3,1200         ; set number of bytes to release
        const   gr121,258        ; service = 258
        asneq   69,gr1,gr1       ; call the OS
        jmpf    gr121,free_err   ; jump if error
        nop

The above example calls sysfree to deallocate 1200 bytes of contiguous
memory, beginning at the address stored in the blkptr variable. If the call
is successful, the program continues; otherwise, if the return value in gr121
is FALSE, the program jumps to free_err to handle the error condition.


Return Memory Page Size

Service 259-getpsize
Description
This service returns, in register gr96, the page size, in bytes, used by the
memory system of the HIF implementation.

Register Usage
Type       Regs    Contents      Description
Calling:   gr121   259 (0x103)   Service number
Returns:   gr96    pagesize      Success: memory page size, one of the
                                 following: 1024, 2048, 4096, and 8192
                                 Failure: < 0
           gr121   0x80000000    Logical TRUE, service successful
                   errcode       Error number, service not successful
                                 (implementation dependent)

Example Call
pagsiz: .word   0

        const   gr121,259        ; service = 259
        asneq   69,gr1,gr1       ; call the OS
        jmpf    gr121,pag_err    ; jump if error
        const   gr120,pagsiz     ; set address to
        consth  gr120,pagsiz     ; store the page size
        store   0,0,gr96,gr120   ; store it!

The above example calls the operating system kernel to return the page size
used by the virtual memory system. If the call was successful, gr121 will
contain a boolean TRUE result and the program will store the value in gr96
into the pagsiz variable; otherwise, a boolean FALSE is returned in gr121. In
this case, the program will jump to pag_err to handle the error condition.
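A typical use of the returned page size is to round allocation requests up to a whole number of pages. The following C sketch shows one way to do that, relying on the fact that all four sizes listed for getpsize are powers of two. The function names are hypothetical, and the rounding helper assumes the sum does not overflow 32 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the value is one of the page sizes listed for getpsize. */
bool pagesize_valid(uint32_t pagesize)
{
    return pagesize == 1024 || pagesize == 2048 ||
           pagesize == 4096 || pagesize == 8192;
}

/* Round nbytes up to a multiple of pagesize. Valid only for power-of-two
 * page sizes, and assumes nbytes + pagesize - 1 fits in 32 bits. */
uint32_t round_up_to_page(uint32_t nbytes, uint32_t pagesize)
{
    return (nbytes + pagesize - 1) & ~(pagesize - 1);
}
```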


Return Base Address

Service 260-getargs
Description
This service returns the base address of the command-line-argument vector
argv in register gr96, as constructed by the operating system kernel when an
application program is invoked.

Arguments are stored by the operating system as a series of NULL-terminated
character strings. A pointer containing the address of each string is stored
in an array whose base address (referred to as argv) is returned by the
getargs HIF service. The last entry in the array contains a NULL pointer (an
address consisting of all zero bits). The number of arguments can be computed
by counting the number of pointers in the array, using the fact that the NULL
pointer terminates the list.
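Counting the arguments from the NULL-terminated pointer array can be written in C as below. This is a small sketch; the function name count_args is hypothetical, and a NULL base address (the failure value of getargs) is treated as an empty argument list.

```c
#include <stddef.h>

/* Count the entries in a NULL-terminated argument vector such as the one
 * whose base address getargs returns. The terminating NULL is not counted. */
int count_args(char *const *argv)
{
    int argc = 0;

    if (argv == NULL)
        return 0;               /* no vector available */
    while (argv[argc] != NULL)
        argc++;
    return argc;
}
```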

Register Usage
Type       Regs    Contents      Description
Calling:   gr121   260 (0x104)   Service number
Returns:   gr96    baseaddr      Success: base address of argv
                                 Failure: = 0 (NULL pointer)
           gr121   0x80000000    Logical TRUE, service successful
                   errcode       Error number, service not successful
                                 (implementation dependent)

Example Call
argptr: .word   0

        const   gr121,260        ; service = 260
        asneq   69,gr1,gr1       ; call the OS
        jmpf    gr121,bas_err    ; jump if error
        const   gr120,argptr     ; set address where base
        consth  gr120,argptr     ; pointer is to be stored
        store   0,0,gr96,gr120   ; store the pointer

The above example calls operating system service 260 to access the
command-line-argument vector address. If the service executes without error,
the program continues by storing the argument vector address in the variable
argptr. If gr121 contains a boolean FALSE value upon return, the program
jumps to bas_err to handle the error condition.


Return Time in Milliseconds

Service 273-clock
Description
This service returns the elapsed processor time in milliseconds. Operating
system initialization procedures set this value to zero on startup.
Successive calls to this service return times that can be arithmetically
subtracted to accurately measure time intervals.
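Measuring an interval by subtracting two clock readings can be sketched in C as follows. The function name elapsed_ms is hypothetical. The sketch assumes the 32-bit millisecond count in gr96 wraps modulo 2^32, in which case unsigned subtraction still gives the right answer across a single wraparound.

```c
#include <stdint.h>

/* Interval between two readings of the clock service, in milliseconds.
 * Unsigned arithmetic wraps modulo 2^32, so the difference is correct even
 * when the counter has rolled over once between the two readings. */
uint32_t elapsed_ms(uint32_t start, uint32_t end)
{
    return end - start;
}
```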

Register Usage
Type       Regs    Contents      Description
Calling:   gr121   273 (0x111)   Service number
Returns:   gr96    msecs         Success: ≠ 0 (time in milliseconds)
                                 Failure: = 0
           gr121   0x80000000    Logical TRUE, service successful
                   errcode       Error number, service not successful
                                 (implementation dependent)

Example Call
time:   .word   0

        const   gr121,273        ; service = 273
        asneq   69,gr1,gr1       ; call the OS
        jmpf    gr121,clk_err    ; jump if error
        const   gr120,time       ; set the address where
        consth  gr120,time       ; time is to be stored
        store   0,0,gr96,gr120   ; store the time in ms.

The above example calls the operating system kernel to get the current value
of the system clock in milliseconds. On return, if gr121 contains a boolean
FALSE value, the program jumps to clk_err to handle the error; otherwise, the
time in milliseconds is stored in the variable time.


Return Processor Cycles

Service 274-cycles
Description
This service returns an ascending positive number in registers gr96 and gr97
that is the number of processor cycles that have elapsed since the last
hardware RESET was applied to the CPU. It provides a mechanism for user
programs to access the contents of the internal Am29000 timer counter
register. The cycle count can be multiplied by the period of the processor
clock (equivalently, divided by the clock frequency) to convert it to a time
value. Gr97 will contain the most significant bits of the cycle count, while
gr96 will contain the least significant bits. HIF implementations of this
service are required to provide a cycle count with a minimum of 56 bits of
precision.
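Combining the two registers into a single count and converting it to elapsed time can be sketched in C. The function names are hypothetical, and the clock frequency is supplied by the caller because the service itself does not report it. The conversion divides the count by the frequency in hertz, which is the same as multiplying by the clock period.

```c
#include <stdint.h>

/* Join the halves returned by the cycles service: gr97 holds the most
 * significant bits and gr96 the least significant 32 bits. */
uint64_t cycles_combine(uint32_t gr97, uint32_t gr96)
{
    return ((uint64_t)gr97 << 32) | gr96;
}

/* Convert a cycle count to seconds for a processor clocked at clock_hz. */
double cycles_to_seconds(uint64_t cycles, double clock_hz)
{
    return (double)cycles / clock_hz;
}
```

As an illustration, 25,000,000 cycles on a 25 MHz clock correspond to one second.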

Register Usage
Type       Regs    Contents      Description
Calling:   gr121   274 (0x112)   Service number
Returns:   gr96    cycles        Success: Bits 0-31 of processor cycles
                                 Failure: = 0 (in both gr96 and gr97)
           gr97    cycles        Success: Bits 32-55 of processor cycles
                                 Failure: = 0 (in both gr96 and gr97)
           gr121   0x80000000    Logical TRUE, service successful
                   errcode       Error number, service not successful
                                 (implementation dependent)

Example Call
cycles: .word   0                ; MSBs of cycles
        .word   0                ; LSBs of cycles

        const   gr121,274        ; service = 274
        asneq   69,gr1,gr1       ; call the OS
        jmpf    gr121,cyc_err    ; jump if error
        const   gr120,cycles     ; set the address where the
        consth  gr120,cycles     ; count is to be stored
        store   0,0,gr97,gr120   ; store the MSBs,
        add     gr120,gr120,4    ; increment the address,
        store   0,0,gr96,gr120   ; then store the LSBs of cycles.

The above example program fragment calls the operating system service 274 to
access the number of CPU cycles that have elapsed since the processor was
powered on. The cycle count (in gr96 and gr97) is stored in the two words
addressed by the variable cycles if the service call is successful. If gr121
contains a boolean FALSE value on exit, the program jumps to cyc_err to
handle the error condition.


Set User Trap Address

Service 289-setvec
Description
This service sets the address for user-level trap handler services that
implement the local register stack spill and fill traps. It returns an
indication of success or failure in register gr96. The method used to invoke
these traps in user mode is described on page 6 of this specification, in the
"User-Mode Traps" section.

Register Usage
Type       Regs    Contents      Description
Calling:   gr121   289 (0x121)   Service number
           lr2     trapno        Trap number
           lr3     funaddr       Address of user trap handler
Returns:   gr96    retval        Success: = 0
                                 Failure: < 0
           gr121   0x80000000    Logical TRUE, service successful
                   errcode       Error number, service not successful
                                 (implementation dependent)

Example Call
trpadr: .word   0

        const   lr2,64           ; trap number = 64
        const   lr3,t64_hnd      ; set address of
        consth  lr3,t64_hnd      ; trap-64 handler
        const   gr121,289        ; service = 289
        asneq   69,gr1,gr1       ; call the OS
        jmpf    gr121,vec_err    ; jump if error
        const   gr120,trpadr     ; set address where to
        consth  gr120,trpadr     ; store the trap address
        store   0,0,gr96,gr120   ; and store it!

The above example calls the setvec service to pass the address to be used for
the trap 64 trap handler routine. If the service returns with gr121
containing a boolean TRUE result, the program continues by storing the trap
address returned in gr96; otherwise, the program jumps to vec_err to handle
the error condition.


PROCESS ENVIRONMENT

There are standard memory and register initializations that must be performed
by a HIF-conforming kernel before entry to a user program. In C-language
programs, this is usually performed by the module crt0. This module receives
control when an application program is invoked, and executes prior to
invocation of the user's main function. Other high-level languages have
similar modules.

STARTUP INITIALIZATION

Initialization procedures must establish appropriate values for the general
registers mentioned below. In addition, file descriptors for the standard
input and output devices must be opened.

Register Stack Pointer (gr1)

The register stack pointer (RSP) register contains the main memory address in
which the local register lr0 will be saved, and from which it will be
restored. The content of RSP is compared to the content of RAB to determine
when it is necessary to spill part of the local register stack to memory. On
startup, the values in RAB, RSP and RFB should be initialized to prevent a
spill trap from occurring on entry to the crt0 code, as shown by the
following relation:

    (RAB + 256) ≤ RSP < RFB

This provides the crt0 code with at least 64 registers on entry, which should
be a sufficient number to accomplish its purpose.

Memory Stack Pointer (gr125)

The memory stack pointer (MSP) register points to the top of the memory
stack, or the lowest-addressed entry on the memory stack. This register must
be preserved (or, more conventionally, restored).

Register Allocate Bound (gr126)

The register allocate bound (RAB) register contains the register stack
address of the lowest-addressed word contained within the register file. RAB
is referenced in the prolog of most user program functions to determine
whether a register spill operation is necessary to accommodate the local
register requirements of the called function.

Register Free Bound (gr127)

The register free bound (RFB) register contains the register stack address of
the lowest-addressed word not contained within the register file (and greater
than RAB). RFB is referenced in the epilog of most user program functions to
determine whether a register fill operation is necessary to restore
previously spilled registers needed by the function's caller.

Open File Descriptors

File descriptor 0 (corresponding to the standard input device) must be opened
for text mode input. File descriptors 1 and 2 (corresponding to standard
output and standard error devices) must be opened for text mode output prior
to entry to the user's program.

PROGRAM TERMINATION

The only valid way for an application to terminate execution is by calling
the exit service. Most high-level languages provide this capability, even if
the programmer does not explicitly invoke a corresponding library function.

TRAP HANDLERS

The trap vector entries shown in Table 5 must be installed, and corresponding
handlers must be provided.

Table 5. Trap Handler Vectors

Trap    Description
32      MULTIPLY
33      DIVIDE
34      MULTIPLU
35      DIVIDU
36      CONVERT
42      FEQ
43      DEQ
44      FGT
45      DGT
46      FGE
47      DGE
48      FADD
49      DADD
50      FSUB
51      DSUB
52      FMUL
53      DMUL
54      FDIV
55      DDIV
64      Spill (Set up by the user's task through a setvec call)
65      Fill (Set up by the user's task through a setvec call)
69      HIF System Call

Note: The Spill (64) and Fill (65) traps are returned to the user's code to
perform the trap handling functions in user mode, as described in the "User
Mode Traps" section.


APPENDIX A: HIF QUICK REFERENCE

Table 6 lists the HIF service calls, calling parameters, and the returned
values. If a column entry is blank, it means the register is not used or is
undefined. Table 7 describes the parameters given in Table 6.

Table 6. HIF Service Calls

           Service  Calling Parameters             Returned Values
Title      GR121    LR2       LR3      LR4         GR96         GR97         GR121
exit       1        exitcode
open       17       pathname  mode     pflag       fileno                    errcode
close      18       fileno                         retval                    errcode
read       19       fileno    buffptr  nbytes      count                     errcode
write      20       fileno    buffptr  nbytes      count                     errcode
lseek      21       fileno    offset   orig        where                     errcode
remove     22       pathname                       retval                    errcode
rename     23       oldfile   newfile              retval                    errcode
tmpnam     33       addrptr                        filename                  errcode
time       49                                      secs                      errcode
getenv     65       name                           addrptr                   errcode
sysalloc   257      nbytes                         addrptr                   errcode
sysfree    258      addrptr   nbytes               retval                    errcode
getpsize   259                                     pagesize                  errcode
getargs    260                                     baseaddr                  errcode
clock      273                                     msecs                     errcode
cycles     274                                     LSBs cycles  MSBs cycles  errcode
setvec     289      trapno    funaddr              retval                    errcode

Table 7. Service Call Parameters

Parameter   Description
addrptr     A pointer to an allocated memory area, a command-line-argument
            array, a pathname buffer, or a NULL-terminated environment
            variable name string.
baseaddr    The base address of the command-line-argument vector.
buffptr     A pointer to the buffer area where data is to be read from or
            written to during the execution of I/O services.
count       The number of bytes actually read from file or written to a file.
cycles      The number of processor cycles (returned value).
errcode     The error code returned by the service. These are usually the
            same as the codes returned in the UNIX errno variable. See
            Appendix B, Table 8, for a list of HIF error codes.
exitcode    The exit code of the application program.
filename    A pointer to a NULL-terminated ASCII string that contains the
            directory path of a temporary filename.
fileno      The file descriptor which is a small integer number. File
            descriptors 0, 1, and 2 are guaranteed to exist and correspond to
            open files on program entry (0 refers to the UNIX equivalent of
            stdin and is opened for input; 1 refers to the UNIX stdout, and
            is opened for output; 2 refers to the UNIX stderr, and is opened
            for output).
funaddr     A pointer to the address of a function.
mode        A series of option flags whose values represent the operation to
            be performed.
msecs       Milliseconds.
name        A pointer to a NULL-terminated ASCII string that contains an
            environment variable name.
nbytes      The number of data bytes requested to be read from or written to
            a file, or the number of bytes to allocate from the heap.
newfile     A pointer to a NULL-terminated ASCII string that contains the
            directory path of a new filename.
offset      The number of bytes from a specified position (orig) in a file.
oldfile     A pointer to NULL-terminated ASCII string that contains the
            directory path of the old filename.
orig        A value of 0, 1, or 2 that refers to the beginning, the current
            position, or the position of the end of a file.
pagesize    The memory page size in bytes (returned value).
pathname    A pointer to a NULL-terminated ASCII string that contains the
            directory path of a filename.
pflag       The UNIX file access permission codes.
retval      The return value that indicates success or failure.
secs        The seconds count returned.
trapno      The trap number.
where       The current position in a specified file.


APPENDIX B: ERROR NUMBERS

HIF implementations are required to return error codes when a requested
operation is not possible. The codes from 0 to 255 are reserved for
compatibility with current and future error return standards. The currently
assigned codes and their meanings are shown in Table 8. If a HIF
implementation returns an error code in the range of 0 to 255, it must carry
the identical meaning to the corresponding error code in this table. Error
code values larger than 255 are available for implementation-specific errors.
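The split between reserved and implementation-specific error numbers can be expressed as a simple range check. The C sketch below is illustrative only: the HIF_ prefixed names and the function hif_error_is_reserved are hypothetical, and only a few of the codes from Table 8 are listed.

```c
#include <stdbool.h>
#include <stdint.h>

/* A few of the assigned codes from Table 8 (HIF_ prefix is hypothetical). */
enum {
    HIF_EPERM  = 1,    /* Not owner */
    HIF_ENOENT = 2,    /* No such file or directory */
    HIF_EBADF  = 9,    /* Bad file number */
    HIF_ENOMEM = 12,   /* Not enough memory */
    HIF_EINVAL = 22    /* Invalid argument */
};

/* Codes 0 to 255 are reserved and must carry the Table 8 meaning; larger
 * values are available for implementation-specific errors. */
bool hif_error_is_reserved(uint32_t code)
{
    return code <= 255;
}
```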

Table 8. HIF Error Numbers Assigned
Number

Error Name

Description
Not used.

0
EPERM

Not owner
This error indicates an attempt to modify a file in some way forbidden except to
its owner.

2

ENOENT

No such file or directory
This error occurs when a file name is specified and the file should exist but
does not, or when one of the directories in a path name does not exist.

3

ESRCH

No such process
The process or process group whose number was given does not exist, or any
such process is already dead.

4

EINTR

Interrupted system call
This error indicates that an asynchronous signal (such as interrupt or quit) that
the user has elected to catch occurred during a system call.

5

EIO

I/O error
Some physical I/O error occurred during a read or write. This error may in
some cases occur on a call following the one to which it actually applies.

6

ENXIO

No such device or address
I/O on a special file refers to a sub-device that does not exist or is beyond the
limits of the device.

7

E2BIG

Arg list is too long
An argument list longer th~n 5120 bytes is presented to execve.

8

ENOEXEC

Exec format error
A request is made to execute a file that, although it has the appropriate permissions, does not start with a valid magic number.

9

EBADF

Bad file number
Eithera file descriptor refers to noopenfile, or a read (write) request is made to
a file that is open only for writing (reading).

10

ECHILD

No children
Wait and the process has no living or unwaited-for children.

11

EAGAIN

No more processes
In a fork, the system's process table is full, or the user is not allowed to create
any more processes.

12

ENOMEM

Not enough memory
During an execve or break, a program asks for more memory than the system
is able to supply or else a process size limit would be exceeded.

13

EACCESS

Permission denied
An attempt was made to access a file in a way forbidden by the protection
system.

14

EFAULT

Bad address
The system encountered a hardware fault in attempting to access the arguments of a system call.

15

ENOTBLK

Block device required
A plain file was mentioned where a block device was required, such as in
mount.

16

EBUSY

Device busy
An attempt was made to mount a device that was already mounted, or an
attempt was made to dismount a device on which there is an active file (open
file, current directory, mounted-on file, or active text segment).

17

EEXIST

File exists
An existing file was mentioned in an inappropriate context, e.g., link.

18

EXDEV

Cross-device link
A hard link to a file on another device was attempted.

19

ENODEV

No such device
An attempt was made to apply an inappropriate system call to a device, e.g., to
read a write-only device, or the device is not configured by the system.

20

ENOTDIR

Not a directory
A non-directory was specified where a directory is required, for example, in a
path name or as an argument to chdir.

21

EISDIR

Is a directory
An attempt was made to write on a directory.

22

EINVAL

Invalid argument
This error occurs when some invalid argument for the call is specified, for
example, dismounting a non-mounted device, mentioning an unknown signal
in signal, or specifying some other argument that is inappropriate for the call.

23

ENFILE

File table overflow
The system's table of open files is full, and temporarily no more open requests
can be accepted.

24

EMFILE

Too many open files
The configuration limit on the number of simultaneously open files has been
exceeded.

25

ENOTTY

Not a typewriter
The file mentioned in stty or gtty is not a terminal or one of the other devices to
which these calls apply.

26

ETXTBSY

Text file busy
The referenced text file is busy and the current request cannot be honored.

27

EFBIG

File too large
The size of a file exceeded the maximum limit.

28

ENOSPC

No space left on device
A write to an ordinary file, the creation of a directory or symbolic link, or the
creation of a directory entry failed because no more disk blocks are available
on the file system.

29

ESPIPE

Illegal seek
A seek was issued to a socket or pipe. This error may also be issued for other
non-seekable devices.

30

EROFS

Read-only file system
An attempt to modify a file or directory was made on a device mounted read-only.

31

EMLINK

Too many links
An attempt was made to establish a new link to the requested file and the limit
of simultaneous links has been exceeded.

32

EPIPE

Broken pipe
A write on a pipe or socket was attempted for which there is no process to read
the data. This condition normally generates a signal; the error is returned if the
signal is caught or ignored.

33

EDOM

Argument too large
The argument of a function in the math package is out of the domain of the
function.

34

ERANGE

Result too large
The value of a function in the math package is unrepresentable within machine
precision.

35

EWOULDBLOCK

Operation would block
An operation that would cause a process to block was attempted on an object
in non-blocking mode.

36

EINPROGRESS

Operation now in progress
An operation that takes a long time to complete was attempted on a non-blocking object.

37

EALREADY

Operation already in progress
An operation was attempted on a non-blocking object that already had an
operation in progress.

38

ENOTSOCK

Socket-operation on non-socket
A socket-oriented operation was attempted on a non-socket device.

39

EDESTADDRREQ

Destination address required
A required address was omitted from an operation on a socket.

40

EMSGSIZE

Message too long
A message sent on a socket was larger than the internal message buffer or
some other network limit.

41

EPROTOTYPE

Protocol wrong type for socket
A protocol was specified that does not support the semantics of the socket type
requested.

42

ENOPROTOOPT

Option not supported by protocol
A bad option or level was specified when accessing socket options.

43

EPROTONOSUPPORT

Protocol not supported
The protocol has not been configured into the system, or no implementation for
it exists.

44

ESOCKTNOSUPPORT

Socket type not supported
The support for the socket type has not been configured into the system, or no
implementation for it exists.

45

EOPNOTSUPP

Operation not supported on socket
For example, trying to accept a connection on a datagram socket.

46

EPFNOSUPPORT

Protocol family not supported
The protocol family has not been configured into the system or no implementation for it exists.

47

EAFNOSUPPORT

Address family not supported by protocol family
An address was used that is incompatible with the requested protocol.

48

EADDRINUSE

Address already in use
Only one usage of each address is normally permitted.

49

EADDRNOTAVAIL

Cannot assign requested address
This normally results from an attempt to create a socket with an address not on
this machine.

50

ENETDOWN

Network is down
A socket operation encountered a dead network.

51

ENETUNREACH

Network is unreachable
A socket operation was attempted to an unreachable network.

52

ENETRESET

Network dropped connection on reset
The host you were connected to crashed and rebooted.

53

ECONNABORTED

Software caused connection abort
A connection abort was caused internal to your host machine.

54

ECONNRESET

Connection reset by peer
A connection was forcibly closed by a peer. This normally results from a loss of
the connection on the remote socket due to a timeout or a reboot.

55

ENOBUFS

No buffer space available
An operation on a socket or pipe was not performed because the system
lacked sufficient buffer space or because a queue was full.

56

EISCONN

Socket is already connected
A connect request was made on an already connected socket; or a sendto or
sendmsg request on a connected socket specified a destination when already
connected.

57

ENOTCONN

Socket is not connected
A request to send or receive data was disallowed because the socket was not
connected and (when sending on a datagram socket) no address was
supplied.

58

ESHUTDOWN

Cannot send after socket shutdown
A request to send data was disallowed because the socket had already been
shut down with a previous shutdown call.

59

ETOOMANYREFS

Too many references; cannot splice.

60

ETIMEDOUT

Connection timed out
A connect or send request failed because the connected party did not properly
respond after a period of time. (The timeout period is dependent on the
communication protocol.)

61

ECONNREFUSED

Connection refused
No connection could be made because the target machine actively refused it.
This usually results from trying to connect to a service that is inactive on the
foreign host.

62

ELOOP

Too many levels of symbolic links
A pathname lookup involved more than the maximum limit of symbolic links.

63

ENAMETOOLONG

File name too long
A component of a pathname exceeded the maximum name length, or an entire
path name exceeded the maximum path length.

64

EHOSTDOWN

Host is down
A socket operation failed because the destination host was down.

65

EHOSTUNREACH

Host is unreachable
A socket operation was attempted to an unreachable host.

66

ENOTEMPTY

Directory not empty
A non-empty directory was supplied to a remove directory or rename call.

67

EPROCLIM

Too many processes
The limit of the total number of processes has been reached. No new
processes can be created.

68

EUSERS

Too many users
The limit of the total number of users has been reached. No new users may
access the system.

69

EDQUOT

Disk quota exceeded
A write to an ordinary file, the creation of a directory or symbolic link, or the
creation of a directory entry failed because the user's quota of disk blocks was
exhausted; or the allocation of an inode for a newly created file failed because
the user's quota of inodes was exhausted.

70

EVDBAD

RVD related disk error
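
As an illustration of how a program might use Table 8, the following C sketch maps a few of the assigned error numbers to their descriptions and separates out the implementation-specific range above 255. The numbers, names, and descriptions are taken from the table. The helper names (hif_error_text, hif_error_is_implementation_specific), the HIF_ prefix, and the fallback strings are hypothetical and are not defined by the HIF specification. Only a subset of the table is shown.

```c
#include <stddef.h>

/* A few entries from Table 8. The HIF_ prefix is an assumption made
   for this sketch; the specification lists the bare names (EPERM, ...). */
enum {
    HIF_EPERM  = 1,
    HIF_ENOENT = 2,
    HIF_EIO    = 5,
    HIF_EBADF  = 9,
    HIF_ENOMEM = 12,
    HIF_EINVAL = 22,
    HIF_ENOSPC = 28
};

struct hif_error_entry {
    int number;        /* error number from Table 8 */
    const char *name;  /* symbolic error name       */
    const char *text;  /* short description         */
};

static const struct hif_error_entry hif_error_table[] = {
    { HIF_EPERM,  "EPERM",  "Not owner" },
    { HIF_ENOENT, "ENOENT", "No such file or directory" },
    { HIF_EIO,    "EIO",    "I/O error" },
    { HIF_EBADF,  "EBADF",  "Bad file number" },
    { HIF_ENOMEM, "ENOMEM", "Not enough memory" },
    { HIF_EINVAL, "EINVAL", "Invalid argument" },
    { HIF_ENOSPC, "ENOSPC", "No space left on device" }
};

/* Error code values larger than 255 are available for
   implementation-specific errors. */
int hif_error_is_implementation_specific(int number)
{
    return number > 255;
}

/* Return the description for an error number listed in the table above.
   The two fallback strings are placeholders chosen for this sketch. */
const char *hif_error_text(int number)
{
    size_t i;
    size_t count = sizeof hif_error_table / sizeof hif_error_table[0];

    for (i = 0; i < count; i++) {
        if (hif_error_table[i].number == number)
            return hif_error_table[i].text;
    }
    if (hif_error_is_implementation_specific(number))
        return "Implementation-specific error";
    return "Unknown error";
}
```

A caller that receives error number 2 from a HIF service could pass it to hif_error_text to obtain "No such file or directory". A complete version would list all entries from 1 through 70.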

Table of Contents

CHAPTER 4
General Information

Related Literature
Package Outlines

CHAPTER 4
RELATED LITERATURE

Additional Support Literature
The following is a list of AMD 29K Family literature that can be ordered from your local AMD Sales Representative
or the Literature Distribution Center at (800) 222-9323, extension 5000; inside California, call (408) 749-5000.
Technical and marketing information concerning the 29K Family also can be obtained by calling the 29K Hotline at
(800) 2929-AMD.

Order No.    Title
09548        Am29000 Article Reprint Brochure
10344        Am29000 Family Overview Brochure
10345        29K Support Products Brochure
10620        Am29000 User's Manual
10621        Am29000 Performance Analysis Brochure
10623        Am29000 Memory Design Handbook
11426        Fusion 29K Catalog
11852        Am29027 Handbook

PACKAGE OUTLINES*
CGX169

[Figure: CGX169 package outline, bottom view]

*For reference only. All dimensions are measured in inches. BSC is an ANSI standard for Basic Space Centering.

CQ164

[Figure: CQ164 package outline, top view]

North American
ALABAMA ........................................ (205) 882-9122
ARIZONA ........................................ (602) 242-4400
CALIFORNIA,
Culver City .................................... (213) 645-1524
Newport Beach .................................. (714) 752-6262
Roseville ...................................... (916) 786-6700

~~~ ~~;~~.:::.:::::::::::::::::::::::::::::::::::::::::::::::::::::: (~6~) ~~g:bg~g

Woodland Hills ................................................. (818) 992-4155
CANADA, Ontario,

~}W~!:d'aie .:::::::::: :::: ::::::::::: ::::::::::::::: :::::::::::::::: !~ ~ ~~ ~~tg~~g

COLORADO ....................................... (303) 741-2900
CONNECTICUT .................................... (203) 264-7800
FLORIDA,
Clearwater ..................................... (813) 530-9971
Ft. Lauderdale ................................. (305) 776-2001
Orlando (Casselberry) .......................... (407) 830-8100
GEORGIA ........................................ (404) 449-7920
ILLINOIS,
Chicago (Itasca) ............................... (312) 773-4422
Naperville ..................................... (312) 505-9517

~1~~t~ N'D'::::::::::::::::::::::::::::::::::::::::::::::::::::.::::::: ~~6 ~l ~~~:~j ~ g
MASSACHUSETTS .............................................. (617) 273-3970
MICHIGAN ............................................................. (313) 347-1522
MINNESOTA ......................................................... (612) 938-0001
NEW JERSEY,

~~~;r~p~~~·::::::::::::::::::::::::::::::::::::::::::::::::::::::: !~gil ~~~:~6gg
NEW YORK,

~~~~~!~:/~:i:~::::::::::::::::::::::::::::::::::::::::::::::::::: !~~ ~l i~r:~!~g

NORTH CAROLINA ................................. (919) 878-8111
OHIO,
Columbus (Westerville) ......................... (614) 891-6455
Dayton ......................................... (513) 439-0470
OREGON ......................................... (503) 245-0080
PENNSYLVANIA ................................... (215) 398-8006
SOUTH CAROLINA ................................. (803) 772-6760
TEXAS,
Austin ......................................... (512) 346-7830
Dallas ......................................... (214) 934-9099
Houston ........................................ (713) 785-9001

International
BELGIUM, Bruxelles ....... TEL ............................. (02) 771-91-42
FAX ............................. (02) 762-37-12
TLX ..................................... 846-61028
FRANCE, Paris ................ TEL ............................ (1) 49-75-10-10
FAX ............................ (1) 49-75-10-13
TLX ........................................ 263282F
WEST GERMANY,
Hannover area ............ TEL .............................. (0511) 736085
FAX .............................. (0511) 721254
TLX ........................................... 922850
München ...................... TEL ................................. (089) 4114-0
FAX ................................ (089) 406490
TLX ........................................... 523883
Stuttgart ....................... TEL ........................... (0711) 62 3377
FAX .............................. (0711) 625187
TLX ........................................... 721882
HONG KONG, .................. TEL ............................. 852-5-8654525
Wanchai
FAX ............................. 852-5-8654335
TLX .......................... 67955AMDAPHX
ITALY, Milan .................... TEL ................................ (02) 3390541
................................ (02) 3533241
FAX ................................ (02) 3498000
TLX ................................... 843-315286
JAPAN,
Kanagawa .................... ~~~ ::::::::::::::::::::::::::::::::: ~~~:~ ~:~n~
Tokyo ........................... TEL ............................... (03) 345-8241
FAX ............................... (03) 342-5196
TLX ........................ J24064AMDTKOJ
Osaka ........................... TEL ................................. 06-243-3250
FAX ................................. 06-243-3253

KOREA, Seoul ................. TEL ............................... 822-784-0030
FAX ............................... 822-784-8014
LATIN AMERICA,
Ft. Lauderdale ............. TEL ............................. (305) 484-8600
FAX ............................ (305) 485-9736
TLX ................. 5109554261 AMDFTL
NORWAY, Hovik .............. TEL .................................. (03) 010156
FAX .................................. (02) 591959
TLX .................................. 19079HBCN
SINGAPORE .................... TEL ................................... 65-3481188
FAX .................................. 65-3480161
TLX .......................... 55650 AMDMMI
SWEDEN,
Stockholm .................... TEL .............................. (08) 733 03 50
(Sundbyberg)
FAX .............................. (08) 733 22 85
TLX ............................................. 11602
TAIWAN ............................ TEL ............................. 886-2-7213393
FAX ............................. 886-2-7723422
TLX ............................. 886-2-7122066
UNITED KINGDOM,
Manchester area ......... TEL .............................. (0925) 828008
(Warrington)
FAX .............................. (0925) 827693
TLX ................................... 851-628524
London area ................ TEL .............................. (0483) 740440
(Woking)
FAX .............................. (0483) 756196
TLX ................................... 851-859103

North American Representatives
CANADA
Burnaby, B.C.
DAVETEK MARKETING .............................. (604) 430-3680
Calgary, Alberta
DAVETEK MARKETING .............................. (403) 291-4984
Kanata, Ontario
VITEL ELECTRONICS .............................. (613) 592-0060
Mississauga, Ontario
VITEL ELECTRONICS .............................. (416) 676-9720
Lachine, Quebec
VITEL ELECTRONICS .............................. (514) 636-5951
IDAHO
INTERMOUNTAIN TECH MKTG, INC ................... (208) 888-6071
ILLINOIS
HEARTLAND TECH MKTG, INC ....................... (312) 577-9222
INDIANA
Huntington - ELECTRONIC MARKETING
CONSULTANTS, INC ............................... (317) 921-3450
Indianapolis - ELECTRONIC MARKETING
CONSULTANTS, INC ............................... (317) 921-3450
IOWA
LORENZ SALES ................................... (319) 377-4666
KANSAS
Merriam - LORENZ SALES ......................... (913) 384-6556
Wichita - LORENZ SALES ......................... (316) 721-0500
KENTUCKY
ELECTRONIC MARKETING
CONSULTANTS, INC ............................... (317) 921-3452
MICHIGAN
Birmingham - MIKE RAICK ASSOCIATES ............. (313) 644-5040
Holland - COM-TEK SALES, INC ................... (616) 399-7273
Novi - COM-TEK SALES, INC ...................... (313) 344-1409
MISSOURI
LORENZ SALES ................................... (314) 997-4558
NEBRASKA
LORENZ SALES ................................... (402) 475-4660
NEW MEXICO
THORSON DESERT STATES .......................... (505) 293-8555
NEW YORK
East Syracuse - NYCOM, INC ..................... (315) 437-8343
Woodbury - COMPONENT
CONSULTANTS, INC ............................... (516) 364-8020
OHIO
Centerville - DOLFUSS ROOT & CO ................ (513) 433-6776
Columbus - DOLFUSS ROOT & CO ................... (614) 885-4844
Strongsville - DOLFUSS ROOT & CO ............... (216) 238-0300
PENNSYLVANIA
DOLFUSS ROOT & CO .............................. (412) 221-4420
PUERTO RICO
COMP REP ASSOC, INC ............................ (809) 746-6550
UTAH, R2 MARKETING ............................. (801) 595-0631
WASHINGTON
ELECTRA TECHNICAL SALES ........................ (206) 821-7442
WISCONSIN
HEARTLAND TECH MKTG, INC ....................... (414) 792-0920

Advanced Micro Devices reserves the right to make changes in its product without notice in order to improve design or performance characteristics. The performance
characteristics listed in this document are guaranteed by specific tests, guard banding, design and other practices common to the industry. For specific testing details,
contact your local AMD sales representative. The company assumes no responsibility for the use of any circuits described herein.


Advanced Micro Devices, Inc. 901 Thompson Place, P.O. Box 3453, Sunnyvale, CA 94088, USA
Tel: (408) 732-2400 • TWX: 910-339-9280 • TELEX: 34-6306 • TOLL FREE: (800) 538-8450
APPLICATIONS HOTLINE TOLL FREE: (800) 222-9323 • (408) 749-5703

© 1989 Advanced Micro Devices, Inc.
8/9189
Printed In USA
