231917 001 80387 Programmers Reference Manual 1987
80387 PROGRAMMER'S REFERENCE MANUAL
1987

Intel Corporation makes no warranty for the use of its products and assumes no responsibility for any errors which may appear in this document nor does it make a commitment to update the information contained herein. Intel retains the right to make changes to these specifications at any time, without notice. Contact your local sales office to obtain the latest specifications before placing your order.
The following are trademarks of Intel Corporation and may only be used to identify Intel Products: Above, BITBUS, COMMputer, CREDIT, Data Pipeline, FASTPATH, Genius, i, ICE, iCEL, iCS, iDBP, iDIS, I²ICE, iLBX, im, iMDDX, iMMX, Inboard, Insite, Intel, intel, intelBOS, Intel Certified, Intelevision, inteligent Identifier, inteligent Programming, Intellec, Intellink, iOSP, iPDS, iPSC, iRMK, iRMX, iSBC, iSBX, iSDM, iSXM, KEPROM, Library Manager, MAPNET, MCS, Megachassis, MICROMAINFRAME, MULTIBUS, MULTICHANNEL, MULTIMODULE, MultiSERVER, ONCE, OpenNET, OTP, PC BUBBLE, Plug-A-Bubble, PROMPT, Promware, QUEST, QueX, Quick-Pulse Programming, Ripplemode, RMX/80, RUPI, Seamless, SLD, SugarCube, SupportNET, UPI, and VLSiCEL, and the combination of ICE, iCS, iRMX, iSBC, iSBX, iSXM, MCS, or UPI and a numerical suffix, 4-SITE.

MDS is an ordering code only and is not used as a product name or trademark. MDS is a registered trademark of Mohawk Data Sciences Corporation. MULTIBUS is a patented Intel bus.

Unix is a trademark of AT&T Bell Labs. MS-DOS, XENIX, and Multiplan are trademarks of Microsoft Corporation. Lotus and 1-2-3 are registered trademarks of Lotus Development Corporation. SuperCalc is a registered trademark of Computer Associates International. Framework is a trademark of Ashton-Tate. System 370 is a trademark of IBM Corporation. AT is a registered trademark of IBM Corporation.

Additional copies of this manual or other Intel literature may be obtained from:

Intel Corporation
Literature Distribution
Mail Stop SC6-59
3065 Bowers Avenue
Santa Clara, CA 95051

© INTEL CORPORATION 1987    CG-5/26/87

PREFACE

This manual describes the 80387 Numeric Processor Extension (NPX) for the 80386 microprocessor. Understanding the 80387 requires an understanding of the 80386; therefore, a brief overview of 80386 concepts is presented first. A detailed discussion of the 80386 microprocessor can be found in the 80386 Programmer's Reference Manual.
THE 80386 MICROSYSTEM

The 80386 is the basis of a new VLSI microprocessor system with exceptional capabilities for supporting large-system applications. This powerful microsystem is designed to support multiuser reprogrammable and real-time multitasking applications. Its dedicated system support circuits simplify system hardware; sophisticated hardware and software tools reduce both the time and the cost of product development.

The 80386 microsystem offers a total-solution approach, enabling you to develop high-speed, interactive, multiuser, multitasking (even multiprocessor) systems more rapidly and at higher performance than ever before.

Reliability and system up-time are becoming increasingly important in all applications. Information must be protected from misuse or accidental loss. The 80386 includes a sophisticated and flexible four-level protection mechanism that can isolate layers of operating system programs from application programs to maintain a high degree of system integrity.

The 80386 addresses up to 4 gigabytes of physical memory to support today's application requirements. This large physical memory enables the 80386 to keep many large programs and data structures simultaneously in memory for high-speed access.

For applications with dynamically changing memory requirements, such as multiuser business systems, the 80386 CPU provides on-chip memory management and virtual memory support. On an 80386-based system, each user can have up to 64 terabytes of virtual-address space. This large address space virtually eliminates restrictions on the size of programs that may be part of the system. The memory management features are subject to control of systems software; therefore, systems software designers can choose among a variety of memory-organization models. Systems designers can choose to view memory in terms of fixed-length pages, in terms of variable-length segments, or as a combination of pages and segments.
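As a rough check on the address-space figures quoted above, the following C sketch recomputes them. The 32-bit physical address width follows from the 4-gigabyte figure in the text. The 14-bit segment-selection width (a 13-bit descriptor index plus one table-indicator bit) and the 32-bit segment offset are background from the 80386 architecture that this preface does not state, and the function names are illustrative only.

```c
#include <stdint.h>

/* Number of bytes addressable with the given number of address bits. */
uint64_t bytes_for_bits(unsigned bits)
{
    return (uint64_t)1 << bits;
}

/* Physical address space: 32 address bits give 4 gigabytes. */
uint64_t physical_space_bytes(void)
{
    return bytes_for_bits(32);
}

/*
 * Virtual address space available to one task.
 * Assumption (not stated in the preface): segment selection contributes
 * 14 bits (13-bit index + 1 table-indicator bit), and each segment can be
 * as large as a 32-bit offset allows.
 */
uint64_t virtual_space_bytes(void)
{
    uint64_t segments = bytes_for_bits(14);     /* 16,384 segments */
    uint64_t max_segment = bytes_for_bits(32);  /* 4 gigabytes each */
    return segments * max_segment;              /* 2^46 bytes */
}
```

Under these assumptions, `physical_space_bytes() >> 30` evaluates to 4 (gigabytes) and `virtual_space_bytes() >> 40` evaluates to 64 (terabytes), matching the two figures given in the text.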
The sizes of segments can range from one byte to 4 gigabytes. Virtual memory can be implemented either at the level of segments or at the level of pages.

Large multiuser or real-time multitasking systems are easily supported by the 80386. High-performance features, such as a very high-speed task switch, fast interrupt-response time, intertask protection, page-oriented virtual memory, and a quick and direct operating system interface, make the 80386 highly suited to multiuser/multitasking applications.

The 80386 has two primary operating modes: real-address mode and protected mode.

• In real-address mode, the 80386/80387 is fully upward compatible from the 8086, 8088, 80186, and 80188 microprocessors and from the 80286 real-address mode; all of the extensive libraries of 8086 and 8088 software execute 15 to 20 times faster on the 80386, without any modification.
• In protected-address mode, the advanced memory management and protection features of the 80386 become available, without any reduction in performance. Upgrading 8086 and 8088 application programs to use these new memory management and protection features usually requires only reassembly or recompilation (some programs may require minor modification). Entire 80286 protected-mode applications can run in this mode without modification.
• The virtual-8086 mode of the 80386 is available when the primary mode is protected mode. Virtual-8086 mode enables direct execution of multiple 8086/8088 programs within a protected-mode environment. Most 8086 and 8088 application programs can be executed in this environment without alteration (refer to the 80386 Programmer's Reference Manual for differences from 8086).

This high degree of compatibility between 80386 and earlier members of the 8086 processor family reduces both the time and the cost of software development.

THE ORGANIZATION OF THIS MANUAL

This manual describes the 80387 Numeric Processor Extension (NPX) for the 80386 microprocessor.
The material in this manual is presented from the perspective of software designers, at both the applications and the systems software level.

• Chapter 1, "Introduction to the 80387 Numerics Processor Extension," gives an overview of the 80387 NPX and reviews the concepts of numeric computation using the 80387.
• Chapter 2, "80387 Numerics Processor Architecture," presents the registers and data types of the 80387 to both applications and systems programmers.
• Chapter 3, "Special Computational Situations," discusses the special values that can be represented in the 80387's real formats (denormal numbers, zeros, infinities, and NaNs, short for "not a number") as well as numerics exceptions. This chapter should be read thoroughly by systems programmers, but may be skimmed by applications programmers. Many of these special values and exceptions may never occur in applications programs.
• Chapter 4, "80387 Instruction Set," provides functional information for software designers generating applications for systems containing an 80386 CPU with an 80387 NPX. The 80386/80387 instruction set mnemonics are explained in detail.
• Chapter 5, "Programming Numeric Applications," provides a description of programming facilities for 80386/80387 systems. A comparative 80387 programming example is given.
• Chapter 6, "System-Level Numeric Programming," provides information of interest to systems software writers, including details of the 80387 architecture and operational characteristics.
• Chapter 7, "Numeric Programming Examples," provides several detailed programming examples for the 80387, including conditional branching, the conversion between floating-point values and their ASCII representations, and the use of trigonometric functions. These examples illustrate assembly-language programming on the 80387 NPX.
• Appendix A, "Machine Instruction Encoding and Decoding," gives reference information on the encoding of NPX instructions. This information is useful to writers of debuggers, exception handlers, and compilers.
• Appendix B, "Exception Summary," provides a list of the exceptions that each instruction can cause. This list is valuable to both applications and systems programmers.
• Appendix C, "Compatibility between the 80387 and the 80287/8087," describes the differences from the 80387 that are common to the 80287 and the 8087.
• Appendix D, "Compatibility between the 80387 and the 8087," describes the additional differences between the 80387 and the 8087 that are of concern when porting 8086/8087 programs directly to the 80386/80387.
• Appendix E, "80387 80-Bit CHMOS III Numeric Processor Extension," reproduces a data sheet of 80387 specifications that is separately available. The table of instruction timings in this appendix will be of interest to many readers of this manual. (The AC specifications have been deliberately left out.) The specifications in data sheets are subject to change; consult the most recent data sheet for design-in information.
• Appendix F, "PC/AT-Compatible 80387 Connection," documents a nonstandard method of connecting an 80387 to an 80386 to achieve compatibility with the IBM PC/AT.
• The Glossary defines 80387 and floating-point terminology. Refer to it as needed.

RELATED PUBLICATIONS

To best use the material in this manual, readers should be familiar with the operation and architecture of 80386 systems.
The following manuals contain information related to the content of this manual and of interest to programmers of 80387 systems:

• Introduction to the 80386, order number 231252
• 80386 Data Sheet, order number 231630
• 80386 Hardware Reference Manual, order number 231732
• 80386 Programmer's Reference Manual, order number 230985
• 80387 Data Sheet, order number 231920

TABLE OF CONTENTS

CHAPTER 1 INTRODUCTION TO THE 80387 NUMERICS PROCESSOR EXTENSION
1.1 History
1.2 Performance
1.3 Ease of Use
1.4 Applications
1.5 Upgradability
1.6 Programming Interface

CHAPTER 2 80387 NUMERICS PROCESSOR ARCHITECTURE
2.1 80387 Registers
2.1.1 The NPX Register Stack
2.1.2 The NPX Status Word
2.1.3 Control Word
2.1.4 The NPX Tag Word
2.1.5 The NPX Instruction and Data Pointers
2.2 Computation Fundamentals
2.2.1 Number System
2.2.2 Data Types and Formats
2.2.2.1 Binary Integers
2.2.2.2 Decimal Integers
2.2.2.3 Real Numbers
2.2.3 Rounding Control
2.2.4 Precision Control

CHAPTER 3 SPECIAL COMPUTATIONAL SITUATIONS
3.1 Special Numeric Values
3.1.1 Denormal Real Numbers
3.1.1.1 Denormals and Gradual Underflow
3.1.2 Zeros
3.1.3 Infinity
3.1.4 NaN (Not-a-Number)
3.1.4.1 Signaling NaNs
3.1.4.2 Quiet NaNs
3.1.5 Indefinite
3.1.6 Encoding of Data Types
3.1.7 Unsupported Formats
3.2 Numeric Exceptions
3.2.1 Handling Numeric Exceptions
3.2.1.1 Automatic Exception Handling
3.2.1.2 Software Exception Handling
3.2.2 Invalid Operation
3.2.2.1 Stack Exception
3.2.2.2 Invalid Arithmetic Operation
3.2.3 Division by Zero
3.2.4 Denormal Operand
3.2.5 Numeric Overflow and Underflow
3.2.5.1 Overflow
3.2.5.2 Underflow
3.2.6 Inexact (Precision)
3.2.7 Exception Priority
3.2.8 Standard Underflow/Overflow Exception Handler

CHAPTER 4 THE 80387 INSTRUCTION SET
4.1 Compatibility with the 80287 and 8087
4.2 Numeric Operands
4.3 Data Transfer Instructions
4.3.1 FLD source
4.3.2 FST destination
4.3.3 FSTP destination
4.3.4 FXCH //destination
4.3.5 FILD source
4.3.6 FIST destination
4.3.7 FISTP destination
4.3.8 FBLD source
4.3.9 FBSTP destination
4.4 Nontranscendental Instructions
4.4.1 Addition
4.4.2 Normal Subtraction
4.4.3 Reversed Subtraction
4.4.4 Multiplication
4.4.5 Normal Division
4.4.6 Reversed Division
4.4.7 FSQRT
4.4.8 FSCALE
4.4.9 FPREM - Partial Remainder (80287/8087-Compatible)
4.4.10 FPREM1 - Partial Remainder (IEEE Std. 754-Compatible)
4.4.11 FRNDINT
4.4.12 FXTRACT
4.4.13 FABS
4.4.14 FCHS
4.5 Comparison Instructions
4.5.1 FCOM //source
4.5.2 FCOMP //source
4.5.3 FCOMPP
4.5.4 FICOM source
4.5.5 FICOMP source
4.5.6 FTST
4.5.7 FUCOM //source
4.5.8 FUCOMP //source
4.5.9 FUCOMPP
4.5.10 FXAM
4.6 Transcendental Instructions
4.6.1 FCOS
4.6.2 FSIN
4.6.3 FSINCOS
4.6.4 FPTAN
4.6.5 FPATAN
4.6.6 F2XM1
4.6.7 FYL2X
4.6.8 FYL2XP1
4.7 Constant Instructions
4.7.1 FLDZ
4.7.2 FLD1
4.7.3 FLDPI
4.7.4 FLDL2T
4.7.5 FLDL2E
4.7.6 FLDLG2
4.7.7 FLDLN2
4.8 Processor Control Instructions
4.8.1 FINIT/FNINIT
4.8.2 FLDCW source
4.8.3 FSTCW/FNSTCW destination
4.8.4 FSTSW/FNSTSW destination
4.8.5 FSTSW AX/FNSTSW AX
4.8.6 FCLEX/FNCLEX
4.8.7 FSAVE/FNSAVE destination
4.8.8 FRSTOR source
4.8.9 FSTENV/FNSTENV destination
4.8.10 FLDENV source
4.8.11 FINCSTP
4.8.12 FDECSTP
4.8.13 FFREE destination
4.8.14 FNOP
4.8.15 FWAIT (CPU Instruction)

CHAPTER 5 PROGRAMMING NUMERIC APPLICATIONS
5.1 Programming Facilities
5.1.1 High-Level Languages
5.1.2 C Programs
5.1.3 PL/M-386
5.1.4 ASM386
5.1.4.1 Defining Data
5.1.4.2 Records and Structures
5.1.4.3 Addressing Methods
5.1.5 Comparative Programming Example
5.1.6 80387 Emulation
5.2 Concurrent Processing with the 80387
5.2.1 Managing Concurrency
5.2.1.1 Incorrect Exception Synchronization
5.2.1.2 Proper Exception Synchronization

CHAPTER 6 SYSTEM-LEVEL NUMERIC PROGRAMMING
6.1 80386/80387 Architecture
6.1.1 Instruction and Operand Transfer
6.1.2 Independent of CPU Addressing Modes
6.1.3 Dedicated I/O Locations
6.2 Processor Initialization and Control
6.2.1 System Initialization
6.2.2 Hardware Recognition of the NPX
6.2.3 Software Recognition of the NPX
6.2.4 Configuring the Numerics Environment
6.2.5 Initializing the 80387
6.2.6 80387 Emulation
6.2.7 Handling Numerics Exceptions
6.2.8 Simultaneous Exception Response
6.2.9 Exception Recovery Examples

CHAPTER 7 NUMERIC PROGRAMMING EXAMPLES
7.1 Conditional Branching Example
7.2 Exception Handling Examples
7.3 Floating-Point to ASCII Conversion Examples
7.3.1 Function Partitioning
7.3.2 Exception Considerations
7.3.3 Special Instructions
7.3.4 Description of Operation
7.3.5 Scaling the Value
7.3.5.1 Inaccuracy in Scaling
7.3.5.2 Avoiding Underflow and Overflow
7.3.5.3 Final Adjustments
7.3.6 Output Format
7.4 Trigonometric Calculation Examples (Not Tested)

APPENDIX A MACHINE INSTRUCTION ENCODING AND DECODING
APPENDIX B EXCEPTION SUMMARY
APPENDIX C COMPATIBILITY BETWEEN THE 80387 AND THE 80287/8087
APPENDIX D COMPATIBILITY BETWEEN THE 80387 AND THE 8087
APPENDIX E 80387 80-BIT CHMOS III NUMERIC PROCESSOR EXTENSION
APPENDIX F PC/AT-COMPATIBLE 80387 CONNECTION
GLOSSARY OF 80387 AND FLOATING-POINT TERMINOLOGY

Figures

Figure 1-1 Evolution and Performance of Numeric Processors
Figure 2-1 80387 Register Set
Figure 2-2 80387 Status Word
Figure 2-3 80387 Control Word Format
Figure 2-4 80387 Tag Word Format
Figure 2-5 Protected Mode 80387 Instruction and Data Pointer Image in Memory, 32-Bit Format
Figure 2-6 Real Mode 80387 Instruction and Data Pointer Image in Memory, 32-Bit Format
Figure 2-7 Protected Mode 80387 Instruction and Data Pointer Image in Memory, 16-Bit Format
Figure 2-8 Real Mode 80387 Instruction and Data Pointer Image in Memory, 16-Bit Format
Figure 2-9 80387 Double-Precision Number System
Figure 2-10 80387 Data Formats
Figure 3-1 Floating-Point System with Denormals
Figure 3-2 Floating-Point System without Denormals
Figure 3-3 Arithmetic Example Using Infinity
Figure 4-1 FSAVE/FRSTOR Memory Layout (32-Bit)
Figure 4-2 FSAVE/FRSTOR Memory Layout (16-Bit)
Figure 4-3 Protected Mode 80387 Environment, 32-Bit Format
Figure 4-4 Real Mode 80387 Environment, 32-Bit Format
Figure 4-5 Protected Mode 80387 Environment, 16-Bit Format
Figure 4-6 Real Mode 80387 Environment, 16-Bit Format
Figure 5-1 Sample C-386 Program
Figure 5-2 Sample 80387 Constants
Figure 5-3 Status Word Record Definition
Figure 5-4 Structure Definition
Figure 5-5 Sample PL/M-386 Program
Figure 5-6 Sample ASM386 Program
Figure 5-7 Instructions and Register Stack
Figure 5-8 Exception Synchronization Examples
Figure 6-1 Software Routine to Recognize the 80287
Figure 7-1 Conditional Branching for Compares
Figure 7-2 Conditional Branching for FXAM
Figure 7-3 Full-State Exception Handler
Figure 7-4 Reduced-Latency Exception Handler
Figure 7-5 Reentrant Exception Handler
Figure 7-6 Floating-Point to ASCII Conversion Routine
Figure 7-7 Relationships between Adjacent Joints
Figure 7-8 Robot Arm Kinematics Example

Tables

Table 1-1 Numeric Processing Speed Comparisons
Table 1-2 Numeric Data Types
Table 1-3 Principal NPX Instructions
Table 2-1 Condition Code Interpretation
Table 2-2 Correspondence between 80387 and 80386 Flag Bits
Table 2-3 Summary of Format Parameters
Table 2-4 Real Number Notation
Table 2-5 Rounding Modes
Table 3-1 Arithmetic and Nonarithmetic Instructions
Table 3-2 Denormalization Process
Table 3-3 Zero Operands and Results
Table 3-4 Infinity Operands and Results
Table 3-5 Rules for Generating QNaNs
Tables 3-6 through 3-11, 4-1 through 4-12, 5-1 through 5-3, and 6-1 (titles continue):
Binary Integer Encodings ....................................................................... . Packed Decimal Encodings .................................................................. .. Single and Double Real Encodings ....................................................... .. Extended Real Encodings .................................................................... .. Masked Responses to Invalid Operations ............................................ .. Masked Overflow Results ..................................................................... .. Data Transfer Instructions ..................................................................... . Nontranscendental Instructions ............................................................. . Basic Nontranscendental Instructions and Operands ............................ . Condition Code Interpretation after FPREM and FPREM1 Instructions ......................................................................................... Comparison Instructions ....................................................................... .. Condition Code Resulting from Comparisons ....................................... .. Condition Code Resulting from FTST ................................................... .. Condition Code Defining Operand Class .............................................. .. Transcendental Instructions ................................................................... . Results of FPATAN ............................................................................... . Constant Instructions ............................................................................ . Processor Control Instructions .............................................................. . PL/M-386 Built-In Procedures ............................................................... . ASM386 Storage Allocation Directives .................................................. . 
Addressing Method Examples ............................................................... . NPX Processor State Following Initialization ........................................ .. 2-13 2-14 xiii 2-17 3-2 3-3 3-7 3-9 3-12 3-14 3-15 3-16 3-17 3-21 3-23 4-3 4-6 4-7 4-11 4-13 4-14 4-15 4-16 4-16 4-18 4-20 4-21 5-3 5-4 5-7 6-6 CUSTOMER SUPPORT CUSTOMER SUPPORT Customer Support is Intel's complete support service that provides Intel customers with hardware support, software support, customer training, and consulting services. For more information contact your local sales offices. After a customer purchases any system hardware or software product, service and support become major factors in determining whether that product will continue to meet a customer's expectations. Such support requires an international support organization and a breadth of programs to meet a variety of customer needs. As you might expect, Intel's customer support is quite extensive. It includes factory repair services and worldwide field service offices providing hardware repair services, software support services, customer training classes, and consulting services. HARDWARE SUPPORT SERVICES Intel is committed to providing an international service support package through a wide variety of service offerings available from Intel Hardware Support. SOFfWARE SUPPORT SERVICES Intel's software support consists of two levels of contracts. Standard support includes TIPS (Technical Information Phone Service), updates and subscription service (product-specific troubleshooting guides and COMMENTS Magazine). Basic support includes updates and the SUbscription service. Contracts are sold in environments which represent product groupings (i.e., iRMX environment). CONSULTING SERVICES Intel provides field systems engineering services for any phase of your development or support effort. 
You can use our systems engineers in a variety of ways ranging from assistance in using a new product, developing an application, personalizing training, and customizing or tailoring an Intel product to providing technical and management consulting. Systems Engineers are well versed in technical areas such as microcommunications, real-time applications, embedded microcontrollers, and network services. You know your application needs; we know our products. Working together we can help you get a successful product to market in the least possible time.

CUSTOMER TRAINING
Intel offers a wide range of instructional programs covering various aspects of system design and implementation. In just three to ten days a limited number of individuals learn more in a single workshop than in weeks of self-study. For optimum convenience, workshops are scheduled regularly at Training Centers worldwide or we can take our workshops to you for on-site instruction. Covering a wide variety of topics, Intel's major course categories include: architecture and assembly language, programming and operating systems, bitbus and LAN applications.

CHAPTER 1 INTRODUCTION TO THE 80387 NUMERICS PROCESSOR EXTENSION

The 80387 NPX is a high-performance numerics processing element that extends the 80386 architecture by adding significant numeric capabilities and direct support for floating-point, extended-integer, and BCD data types. The 80386 CPU with 80387 NPX easily supports powerful and accurate numeric applications through its implementation of the IEEE Standard 754 for Binary Floating-Point Arithmetic. The 80387 provides floating-point performance comparable to that of large minicomputers while offering compatibility with object code for 8087 and 80287.

1.1 HISTORY
The 80387 Numeric Processor Extension (NPX) is compatible with its predecessors, the earlier Intel 8087 NPX and 80287 NPX.
As the 80386 runs 8086 programs, so programs designed to use the 8087 and 80287 should run unchanged on the 80387.

The 8087 NPX was designed for use in 8086-family systems. The 8086 was the first microprocessor family to partition the processing unit to permit high-performance numeric capabilities. The 8087 NPX for this processor family implemented a complete numeric processing environment in compliance with an early proposal for the IEEE 754 Floating-Point Standard.

With the 80287 Numeric Processor Extension, high-speed numeric computations were extended to 80286 high-performance multitasking and multiuser systems. Multiple tasks using the numeric processor extension were afforded the full protection of the 80286 memory management and protection features.

The 80387 Numeric Processor Extension is Intel's third generation numerics processor. The 80387 implements the final IEEE standard, adds new trigonometric instructions, and uses a new design and CHMOS-III process to allow higher clock rates and require fewer clocks per instruction. Together, the 80387 with additional instructions and the improved standard bring even more convenience and reliability to numerics programming and make this convenience and reliability available to applications that need the high-speed and large memory capacity of the 32-bit environment of the 80386 CPU.

Figure 1-1 illustrates the relative performance of 5-MHz 8086/8087, 8-MHz 80286/80287, and 20-MHz 80386/80387 systems in executing numerics-oriented applications.

[Figure 1-1. Evolution and Performance of Numeric Processors. Vertical axis: relative performance (1 to 16). Horizontal axis: year introduced (1980, 1983, 1987). Systems plotted: 8086/8087 (5 MHz), 80286/80287 (8 MHz), 80386/80387 (20 MHz).]

1.2 PERFORMANCE
Table 1-1 compares the execution times of several 80387 instructions with the equivalent operations executed on an 8-MHz 80287. As indicated in the table, the 16-MHz 80387 NPX provides about 5 to 6 times the performance of an 8-MHz 80287 NPX. A 16-MHz 80387 multiplies 32-bit and 64-bit floating-point numbers in about 1.9 and 2.8 microseconds, respectively. Of course, the actual performance of the NPX in a given system depends on the characteristics of the individual application.

Table 1-1. Numeric Processing Speed Comparisons

                                                       Approximate Performance Ratio:
Floating-Point Instruction            Operation        16 MHz 80386/80387 ÷ 8 MHz 80286/80287
FADD   ST, ST(i)                      Addition         6.2
FDIV   dword_var                      Division         4.7
FYL2X  stack (0), (1) assumed         Logarithm        6.0
FPATAN stack (0) assumed              Arctangent       2.6*
F2XM1  stack (0) assumed              Exponentiation   2.7*

*The ratio is higher if the operand is not in range of the 80287 instruction.

Although the performance figures shown in Table 1-1 refer to operations on real (floating-point) numbers, the 80387 also manipulates fixed-point binary and decimal integers of up to 64 bits or 18 digits, respectively. The 80387 can improve the speed of multiple-precision software algorithms for integer operations by 10 to 100 times.

Because the 80387 NPX is an extension of the 80386 CPU, no software overhead is incurred in setting up the NPX for computation. The 80387 and 80386 processors coordinate their activities in a manner transparent to software. Moreover, built-in coordination facilities allow the 80386 CPU to proceed with other instructions while the 80387 NPX is simultaneously executing numeric instructions. Programs can exploit this concurrency of execution to further increase system performance and throughput.

1.3 EASE OF USE
The 80387 NPX offers more than raw execution speed for computation-intensive tasks. The 80387 brings the functionality and power of accurate numeric computation into the hands of the general user. These features are available in most high-level languages available for the 80386.
Like the 8087 and 80287 that preceded it, the 80387 is explicitly designed to deliver stable, accurate results when programmed using straightforward "pencil and paper" algorithms. The IEEE standard 754 specifically addresses this issue, recognizing the fundamental importance of making numeric computations both easy and safe to use.

For example, most computers can overflow when two single-precision floating-point numbers are multiplied together and then divided by a third, even if the final result is a perfectly valid 32-bit number. The 80387 delivers the correctly rounded result. Other typical examples of undesirable machine behavior in straightforward calculations occur when computing financial rate of return, which involves the expression $(1 + i)^n$, or when solving for roots of a quadratic equation:

$$\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

If a does not equal 0, the formula is numerically unstable when the roots are nearly coincident or when their magnitudes are wildly different. The formula is also vulnerable to spurious over/underflows when the coefficients a, b, and c are all very big or all very tiny. When single-precision (4-byte) floating-point coefficients are given as data and the formula is evaluated in the 80387's normal way, keeping all intermediate results in its stack, the 80387 produces impeccable single-precision roots. This happens because, by default and with no effort on the programmer's part, the 80387 evaluates all those subexpressions with so much extra precision and range as to overwhelm any threat to numerical integrity.

If double-precision data and results were at issue, a better formula would have to be used, and once again the 80387's default evaluation of that formula would provide substantially enhanced numerical integrity over mere double-precision evaluation.

On most machines, straightforward algorithms will not deliver consistently correct results (and will not indicate when they are incorrect).
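The multiply-then-divide overflow case described above can be illustrated with a minimal C sketch. It assumes a C99 compiler and uses `long double` as a stand-in for the 80387's extended format; where `long double` is only 64 bits wide, its range is still large enough for this case. The function names and operand values are illustrative assumptions, not taken from this manual.

```c
#include <math.h>

/* Computes (a * b) / c with the intermediate product rounded to single
 * precision. The volatile store forces the product into a 32-bit float
 * even on compilers that keep temporaries in wider registers. */
float muldiv_single(float a, float b, float c)
{
    volatile float product = a * b;
    return product / c;
}

/* Computes the same expression with the intermediate product held in a
 * wider format, in the spirit of keeping it on the NPX register stack.
 * Only the final result is rounded to single precision. */
float muldiv_extended(float a, float b, float c)
{
    long double product = (long double)a * (long double)b;
    return (float)(product / (long double)c);
}

/* Returns nonzero when the narrow evaluation overflows to infinity but
 * the wide evaluation still yields a finite single-precision result. */
int wide_intermediate_rescues(float a, float b, float c)
{
    return isinf(muldiv_single(a, b, c)) &&
           isfinite(muldiv_extended(a, b, c));
}
```

With a = b = c = 1.0e30f, the product is about 10^60, which exceeds the single-precision range, so `muldiv_single` returns infinity while `muldiv_extended` returns 1.0e30f.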
To obtain correct results on traditional machines under all conditions usually requires sophisticated numerical techniques that are foreign to most programmers. General application programmers using straightforward algorithms will produce much more reliable programs using the 80387. This simple fact greatly reduces the software investment required to develop safe, accurate computation-based products.

Beyond traditional numerics support for scientific applications, the 80387 has built-in facilities for commercial computing. It can process decimal numbers of up to 18 digits without round-off errors, performing exact arithmetic on integers as large as $2^{64}$ or $10^{18}$. Exact arithmetic is vital in accounting applications where rounding errors may introduce monetary losses that cannot be reconciled.

The NPX contains a number of optional facilities that can be invoked by sophisticated users. These advanced features include directed rounding, gradual underflow, and programmed exception-handling facilities.

These automatic exception-handling facilities permit a high degree of flexibility in numeric processing software, without burdening the programmer. While performing numeric calculations, the NPX automatically detects exception conditions that can potentially damage a calculation (for example, $X \div 0$ or $\sqrt{X}$ when $X < 0$). By default, on-chip exception logic handles these exceptions so that a reasonable result is produced and execution may proceed without program interruption. Alternatively, the NPX can signal the CPU, invoking a software exception handler to provide special results whenever various types of exceptions are detected.

1.4 APPLICATIONS
The 80386's versatility and performance make it appropriate to a broad array of numeric applications.
In general, applications that exhibit any of the following characteristics can benefit by implementing numeric processing on the 80387:
• Numeric data vary over a wide range of values, or include nonintegral values.
• Algorithms produce very large or very small intermediate results.
• Computations must be very precise; i.e., a large number of significant digits must be maintained.
• Performance requirements exceed the capacity of traditional microprocessors.
• Consistently safe, reliable results must be delivered using a programming staff that is not expert in numerical techniques.

Note also that the 80387 can reduce software development costs and improve the performance of systems that use not only real numbers, but operate on multiprecision binary or decimal integer values as well.

A few examples, which show how the 80387 might be used in specific numerics applications, are described below. In many cases, these types of systems have been implemented in the past with minicomputers or small mainframe computers. The advent of the 80387 brings the size and cost savings of microprocessor technology to these applications for the first time.
• Business data processing: The NPX's ability to accept decimal operands and produce exact decimal results of up to 18 digits greatly simplifies accounting programming. Financial calculations that use power functions can take advantage of the 80387's exponentiation and logarithmic instructions. Many business software packages can benefit from the speed and accuracy of the 80387; for example, Lotus 1-2-3, Multiplan, SuperCalc, and Framework.
• Simulation: The large (32-bit) memory space of the 80386 coupled with the raw speed of the 80386 and 80387 processors make 80386/80387 microsystems suitable for attacking large simulation problems, which heretofore could only be executed on expensive mini and mainframe computers.
For example, complex electronic circuit simulations using SPICE can now be performed on a microcomputer, the 80386/80387. Simulation of mechanical systems using finite element analysis can employ more elements, resulting in more detailed analysis or simulation of larger systems.
• Graphics transformations: The 80387 can be used in graphics terminals to locally perform many functions that normally demand the attention of a main computer; these include rotation, scaling, and interpolation. By also using an 82786 Graphics Display Controller to perform high-speed drawing and window management, very powerful and highly self-sufficient terminals can be built from a relatively small number of 80386 family parts.
• Process control: The 80387 solves dynamic range problems automatically, and its extended precision allows control functions to be fine-tuned for more accurate and efficient performance. Control algorithms implemented with the NPX also contribute to improved reliability and safety, while the 80387's speed can be exploited in real-time operations.
• Computer numerical control (CNC): The 80387 can move and position machine tool heads with accuracy in real-time. Axis positioning also benefits from the hardware trigonometric support provided by the 80387.
• Robotics: Coupling small size and modest power requirements with powerful computational abilities, the 80387 is ideal for on-board six-axis positioning.
• Navigation: Very small, lightweight, and accurate inertial guidance systems can be implemented with the 80387. Its built-in trigonometric functions can speed and simplify the calculation of position from bearing data.
• Data acquisition: The 80387 can be used to scan, scale, and reduce large quantities of data as it is collected, thereby lowering storage requirements and time required to process the data for analysis.

The preceding examples are oriented toward traditional numerics applications.
There are, in addition, many other types of systems that do not appear to the end user as computational, but can employ the 80387 to advantage. Indeed, the 80387 presents the imaginative system designer with an opportunity similar to that created by the introduction of the microprocessor itself. Many applications can be viewed as numerically-based if sufficient computational power is available to support this view (e.g., character generation for a laser printer). This is analogous to the thousands of successful products that have been built around "buried" microprocessors, even though the products themselves bear little resemblance to computers.

1.5 UPGRADABILITY
The architecture of the 80386 CPU is specifically adapted to allow easy upgradability to use an 80387, simply by plugging in the 80387 NPX. For this reason, designers of 80386 systems may wish to incorporate the 80387 NPX into their designs in order to offer two levels of price and performance at little additional cost.

Two features of the 80386 CPU make the design and support of upgradable 80386 systems particularly simple:
• The 80386 can be programmed to recognize the presence of an 80387 NPX; that is, software can recognize whether it is running on an 80386 with or without an 80387 NPX.
• After determining whether the 80387 NPX is available, the 80386 CPU can be instructed to let the NPX execute all numeric instructions. If an 80387 NPX is not available, the 80386 CPU can emulate all 80387 numeric instructions in software. This emulation is completely transparent to the application software; the same object code may be used by 80386 systems both with and without an 80387 NPX. No relinking or recompiling of application software is necessary; the same code will simply execute faster with the 80387 NPX than without.
To facilitate this design of upgradable 80386 systems, Intel provides a software emulator for the 80387 that provides the functional equivalent of the 80387 hardware, implemented in software on the 80386. Except for timing, the operation of this 80387 emulator (EMUL387) is the same as for the 80387 NPX hardware. When the emulator is combined as part of the systems software, the 80386 system with 80387 emulation and the 80386 with 80387 hardware are virtually indistinguishable to an application program. This capability makes it easy for software developers to maintain a single set of programs for both systems. System manufacturers can offer the NPX as a simple plug-in performance option without necessitating any changes in the user's software.

1.6 PROGRAMMING INTERFACE
The 80386/80387 pair is programmed as a single processor; all of the 80387 registers appear to a programmer as extensions of the basic 80386 register set. The 80386 has a class of instructions known as ESCAPE instructions, all having a common format. These ESC instructions are the numeric instructions for the 80387 NPX; they are simply encoded into the instruction stream along with 80386 instructions.

All of the CPU memory-addressing modes may be used in programming the NPX, allowing convenient access to record structures, numeric arrays, and other memory-based data structures. All of the memory management and protection features of the CPU (both paging and segmentation) are extended to the NPX as well.

Numeric processing in the 80387 centers around the NPX register stack. Programmers can treat these eight 80-bit registers either as a fixed register set, with instructions operating on explicitly-designated registers, or as a classical stack, with instructions operating on the top one or two stack elements.

Internally, the 80387 holds all numbers in a uniform 80-bit extended format.
Operands that may be represented in memory as 16-, 32-, or 64-bit integers, 32-, 64-, or 80-bit floating-point numbers, or 18-digit packed BCD numbers, are automatically converted into extended format as they are loaded into the NPX registers. Computation results are subsequently converted back into one of these destination data formats when they are stored into memory from the NPX registers.

Table 1-2 lists each of the seven data types supported by the 80387, showing the data format for each type. All operands are stored in memory with the least significant digits starting at the initial (lowest) memory address. Numeric instructions access and store memory operands using only this initial address. For maximum system performance, all operands should start at memory addresses divisible by four.

Table 1-3 lists the 80387 instructions by class. No special programming tools are necessary to use the 80387, because all of the NPX instructions and data types are directly supported by the ASM386 Assembler, by high-level languages from Intel, and by assemblers and compilers produced by many independent software vendors. Software routines for the 80387 may be written in ASM386 Assembler or any of the following higher-level languages from Intel:
• PL/M-386
• C-386

In addition, all of the development tools supporting the 8086/8087 and 80286/80287 can also be used to develop software for the 80386/80387.

All of these high-level languages provide programmers with access to the computational power and speed of the 80387 without requiring an understanding of the architecture of the 80386 and 80387 chips. Such architectural considerations as concurrency and synchronization are handled automatically by these high-level languages. For the ASM386 programmer, specific rules for handling these issues are discussed in a later section of this manual.
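The memory layout rule above (least significant digits at the lowest address) can be illustrated for the 18-digit packed BCD type with a minimal C sketch. The layout assumed here is ten bytes, with two decimal digits per byte in bytes 0 through 8 and the sign in the most significant bit of byte 9; Table 3-7 (Packed Decimal Encodings) is the authoritative description. The function names are hypothetical, and the routines only mimic in software the format that the NPX reads and writes.

```c
#include <stdint.h>
#include <string.h>

#define PACKED_BCD_BYTES 10

/* Encodes value into the 10-byte packed decimal layout assumed above.
 * Returns 0 on success, or -1 if the magnitude needs more than 18 digits. */
int packed_bcd_encode(int64_t value, uint8_t out[PACKED_BCD_BYTES])
{
    uint64_t magnitude = (value < 0) ? (uint64_t)0 - (uint64_t)value
                                     : (uint64_t)value;
    int i;

    if (magnitude > UINT64_C(999999999999999999))
        return -1;

    memset(out, 0, PACKED_BCD_BYTES);
    for (i = 0; i < 9; i++) {
        /* Low nibble holds the less significant digit of each pair. */
        uint8_t low = (uint8_t)(magnitude % 10);
        magnitude /= 10;
        uint8_t high = (uint8_t)(magnitude % 10);
        magnitude /= 10;
        out[i] = (uint8_t)((high << 4) | low);
    }
    out[9] = (value < 0) ? 0x80 : 0x00;   /* sign bit only */
    return 0;
}

/* Decodes the same layout back into a binary integer. */
int64_t packed_bcd_decode(const uint8_t in[PACKED_BCD_BYTES])
{
    int64_t magnitude = 0;
    int i;

    for (i = 8; i >= 0; i--)
        magnitude = magnitude * 100 + (in[i] >> 4) * 10 + (in[i] & 0x0F);

    return (in[9] & 0x80) ? -magnitude : magnitude;
}
```

Encoding 1234, for example, places 0x34 in byte 0 and 0x12 in byte 1, so the least significant digits sit at the operand's initial address.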
The following operating systems are known or expected to support the 80387: RMX-286/386, MS-DOS, Xenix-286/386, and Unix-286/386. Advanced in-circuit debugging support is provided by ICE-386.

Table 1-2. Numeric Data Types

Data Type        Bits   Significant Digits (Decimal)   Approximate Range (Decimal)
Word integer     16     4                              -32,768 ≤ X ≤ +32,767
Short integer    32     9                              -2 × 10^9 ≤ X ≤ +2 × 10^9
Long integer     64     18                             -9 × 10^18 ≤ X ≤ +9 × 10^18
Packed decimal   80     18                             -99...99 ≤ X ≤ +99...99 (18 digits)
Single real      32     6-7                            1.18 × 10^-38 ≤ |X| ≤ 3.40 × 10^38
Double real      64     15-16                          2.23 × 10^-308 ≤ |X| ≤ 1.80 × 10^308
Extended real*   80     19                             3.30 × 10^-4932 ≤ |X| ≤ 1.21 × 10^4932

*Equivalent to double extended format of IEEE Std 754

Table 1-3. Principal NPX Instructions

Class               Instruction Types
Data Transfer       Load (all data types), Store (all data types), Exchange
Arithmetic          Add, Subtract, Multiply, Divide, Subtract Reversed, Divide Reversed, Square Root, Scale, Remainder, Integer Part, Change Sign, Absolute Value, Extract
Comparison          Compare, Examine, Test
Transcendental      Tangent, Arctangent, Sine, Cosine, Sine and Cosine, 2^X - 1, Y * Log2(X), Y * Log2(X+1)
Constants           0, 1, π, Log10(2), Loge(2), Log2(10), Log2(e)
Processor Control   Load Control Word, Store Control Word, Store Status Word, Load Environment, Store Environment, Save, Restore, Clear Exceptions, Initialize

CHAPTER 2 80387 NUMERICS PROCESSOR ARCHITECTURE

To the programmer, the 80387 NPX appears as a set of additional registers, data types, and instructions, all of which complement those of the 80386. Refer to Chapter 4 for detailed explanations of the 80387 instruction set. This chapter explains the new registers and data types that the 80387 brings to the architecture of the 80386.
2.1 80387 REGISTERS
The additional registers consist of
• Eight individually-addressable 80-bit numeric registers, organized as a register stack
• Three sixteen-bit registers containing:
    the NPX status word
    the NPX control word
    the tag word
• Two 48-bit registers containing pointers to the current instruction and operand (these registers are actually located in the 80386)

All of the NPX numeric instructions focus on the contents of these NPX registers.

2.1.1 The NPX Register Stack
The 80387 register stack is shown in Figure 2-1. Each of the eight numeric registers in the 80387's register stack is 80 bits wide and is divided into fields corresponding to the NPX's extended real data type.

[Figure 2-1. 80387 Register Set. Eight 80-bit data registers R0-R7, each divided into sign (bit 79), exponent (bits 78-64), and significand (bits 63-0) fields, with a 2-bit tag field per register; 16-bit control register, status register, and tag word; 48-bit instruction pointer and data pointer.]

Numeric instructions address the data registers relative to the register on the top of the stack. At any point in time, this top-of-stack register is indicated by the TOP (stack TOP) field in the NPX status word. Load or push operations decrement TOP by one and load a value into the new top register. A store-and-pop operation stores the value from the current TOP register and then increments TOP by one. Like 80386 stacks in memory, the 80387 register stack grows down toward lower-addressed registers.

Many numeric instructions have several addressing modes that permit the programmer to implicitly operate on the top of the stack, or to explicitly operate on specific registers relative to the TOP. The ASM386 Assembler supports these register addressing modes, using the expression ST(0), or simply ST, to represent the current Stack Top and ST(i) to specify the ith register from TOP in the stack (0 ≤ i ≤ 7). For example, if TOP contains 011B (register 3 is the top of the stack), the following statement would add the contents of two registers in the stack (registers 3 and 5):

    FADD ST, ST(2)

The stack organization and top-relative addressing of the numeric registers simplify subroutine programming by allowing routines to pass parameters on the register stack. By using the stack to pass parameters rather than using "dedicated" registers, calling routines gain more flexibility in how they use the stack. As long as the stack is not full, each routine simply loads the parameters onto the stack before calling a particular subroutine to perform a numeric calculation. The subroutine then addresses its parameters as ST, ST(1), etc., even though TOP may, for example, refer to physical register 3 in one invocation and physical register 5 in another.

2.1.2 The NPX Status Word
The 16-bit status word shown in Figure 2-2 reflects the overall state of the 80387. This status word may be stored into memory using the FSTSW/FNSTSW, FSTENV/FNSTENV, and FSAVE/FNSAVE instructions, and can be transferred into the 80386 AX register with the FSTSW AX/FNSTSW AX instructions, allowing the NPX status to be inspected by the CPU.

The B-bit (bit 15) is included for 8087 compatibility only. It reflects the contents of the ES bit (bit 7 of the status word), not the status of the BUSY# output of the 80387.

[Figure 2-2. 80387 Status Word. Bit layout, from bit 15 down to bit 0:
    15      B    80387 busy
    14      C3   condition code
    13-11   TOP  top of stack pointer
    10      C2   condition code
    9       C1   condition code
    8       C0   condition code
    7       ES   error summary status
    6       SF   stack fault
    5       PE   precision
    4       UE   underflow
    3       OE   overflow
    2       ZE   zero divide
    1       DE   denormalized operand
    0       IE   invalid operation
ES is set if any unmasked exception bit is set; cleared otherwise. See Table 2-1 for interpretation of condition code. TOP values: 000 = register 0 is top of stack, 001 = register 1 is top of stack, ..., 111 = register 7 is top of stack. For definitions of exceptions, refer to Chapter 3.]

The four NPX condition code bits (C3-C0) are similar to the flags in a CPU: the 80387 updates these bits to reflect the outcome of arithmetic operations. The effect of these instructions on the condition code bits is summarized in Table 2-1. These condition code bits are used principally for conditional branching. The FSTSW AX instruction stores the NPX status word directly into the CPU AX register, allowing these condition codes to be inspected efficiently by 80386 code. The 80386 SAHF instruction can copy C3-C0 directly to 80386 flag bits to simplify conditional branching. Table 2-2 shows the mapping of these bits to the 80386 flag bits.

Bits 11-13 of the status word point to the 80387 register that is the current Top of Stack (TOP). The significance of the stack top has been described in the prior section on the register stack.

Figure 2-2 shows the six exception flags in bits 0-5 of the status word. Bit 7 is the exception summary status (ES) bit. ES is set if any unmasked exception bits are set, and is cleared otherwise. If this bit is set, the ERROR# signal is asserted. Bits 0-5 indicate whether the NPX has detected one of six possible exception conditions since these status bits were last cleared or reset. They are "sticky" bits, and can only be cleared by the instructions FINIT, FCLEX, FLDENV, FSAVE, and FRSTOR.

Bit 6 is the stack fault (SF) bit. This bit distinguishes invalid operations due to stack overflow or underflow from other kinds of invalid operations. When SF is set, bit 9 (C1) distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0).

2.1.3 Control Word
The NPX provides the programmer with several processing options, which are selected by loading a word from memory into the control word.
Figure 2-3 shows the format and encoding of the fields in the control word.

The low-order byte of this control word configures the 80387 exception masking. Bits 0-5 of the control word contain individual masks for each of the six exception conditions recognized by the 80387. The high-order byte of the control word configures the 80387 processing options, including

• Precision control
• Rounding control

The precision-control bits (bits 8-9) can be used to set the 80387 internal operating precision at less than the default precision (64-bit significand). These control bits can be used to provide compatibility with the earlier-generation arithmetic processors having less precision than the 80387. The precision-control bits affect the results of only the following five arithmetic instructions: ADD, SUB(R), MUL, DIV(R), and SQRT. No other operations are affected by PC.

Table 2-1. Condition Code Interpretation

FPREM, FPREM1
    C0 (S), C3 (Z), C1 (A): three least significant bits of quotient (C0 = Q2, C3 = Q1, C1 = Q0 or O/U#)
    C2 (C): Reduction (0 = complete, 1 = incomplete)

FCOM, FCOMP, FCOMPP, FTST, FUCOM, FUCOMP, FUCOMPP, FICOM, FICOMP
    C0 (S), C3 (Z): Result of comparison
    C1 (A): Zero or O/U#
    C2 (C): Operand is not comparable

FXAM
    C0 (S), C3 (Z): Operand class
    C1 (A): Sign or O/U#
    C2 (C): Operand class

FCHS, FABS, FXCH, FINCSTP, FDECSTP, constant loads, FXTRACT, FLD, FILD, FBLD, FSTP (ext real)
    C0 (S), C3 (Z): UNDEFINED
    C1 (A): Zero or O/U#
    C2 (C): UNDEFINED

FIST, FBSTP, FRNDINT, FST, FSTP, FADD, FMUL, FDIV, FDIVR, FSUB, FSUBR, FSCALE, FSQRT, FPATAN, F2XM1, FYL2X, FYL2XP1
    C0 (S), C3 (Z): UNDEFINED
    C1 (A): Roundup or O/U#
    C2 (C): UNDEFINED

FPTAN, FSIN, FCOS, FSINCOS
    C0 (S), C3 (Z): UNDEFINED
    C1 (A): Roundup or O/U#; undefined if C2 = 1
    C2 (C): Reduction (0 = complete, 1 = incomplete)

FLDENV, FRSTOR
    C0, C3, C1, C2: Each bit loaded from memory

FLDCW, FSTENV, FSTCW, FSTSW, FCLEX, FINIT, FSAVE
    C0, C3, C1, C2: UNDEFINED

Notes to Table 2-1:

O/U#       When both IE and SF bits of status word are set, indicating a stack exception, this bit distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0).
Reduction  If FPREM or FPREM1 produces a remainder that is less than the modulus, reduction is complete. When reduction is incomplete the value at the top of the stack is a partial remainder, which can be used as input to further reduction. For FPTAN, FSIN, FCOS, and FSINCOS, the reduction bit is set if the operand at the top of the stack is too large. In this case the original operand remains at the top of the stack.
Roundup    When the PE bit of the status word is set, this bit indicates whether the last rounding in the instruction was upward.
UNDEFINED  Do not rely on finding any specific value in these bits.

Table 2-2. Correspondence between 80387 and 80386 Flag Bits

80387 Flag   80386 Flag
C0           CF
C1           (none)
C2           PF
C3           ZF

Bit(s)   Field
15-13    Reserved
12       Infinity control (*)
11-10    Rounding control (RC)
9-8      Precision control (PC)
7-6      Reserved
5        Exception mask: precision
4        Exception mask: underflow
3        Exception mask: overflow
2        Exception mask: zero divide
1        Exception mask: denormalized operand
0        Exception mask: invalid operation

Precision control: 00 = 24 bits (single precision); 01 = (reserved); 10 = 53 bits (double precision); 11 = 64 bits (extended precision).
Rounding control: 00 = round to nearest or even; 01 = round down (toward -∞); 10 = round up (toward +∞); 11 = chop (truncate toward zero).
(*) This "infinity control" bit is not meaningful to the 80387. To maintain compatibility with the 80287, this bit can be programmed; however, regardless of its value, the 80387 treats infinity in the affine sense (-∞ < +∞).

Figure 2-3. 80387 Control Word Format

The rounding-control bits (bits 10-11) provide for the common round-to-nearest mode, as well as directed rounding and true chop. Rounding control affects only the arithmetic instructions (refer to Chapter 3 for lists of arithmetic and nonarithmetic instructions).
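As an illustration of the layouts in Figures 2-2 and 2-3, the following Python sketch splits a status word and a control word into their fields and applies the Table 2-2 mapping. It is only a model for reading stored images (for example, a word written by FSTSW or FSTCW); the function names and the sample values are hypothetical and chosen for this example.

```python
# Field positions follow Figure 2-2 (status word) and Figure 2-3 (control word).
# All names below are illustrative; they are not part of any Intel tool.
EXCEPTIONS = ["IE", "DE", "ZE", "OE", "UE", "PE"]  # bits 0..5, low to high

PRECISION = {0b00: "24 bits (single)", 0b01: "reserved",
             0b10: "53 bits (double)", 0b11: "64 bits (extended)"}
ROUNDING = {0b00: "nearest or even", 0b01: "down", 0b10: "up", 0b11: "chop"}


def decode_status_word(sw):
    """Split a 16-bit status word image into its fields."""
    return {
        "flags": [name for bit, name in enumerate(EXCEPTIONS) if (sw >> bit) & 1],
        "SF": (sw >> 6) & 1,
        "ES": (sw >> 7) & 1,
        "C0": (sw >> 8) & 1,
        "C1": (sw >> 9) & 1,
        "C2": (sw >> 10) & 1,
        "TOP": (sw >> 11) & 0b111,
        "C3": (sw >> 14) & 1,
        "B": (sw >> 15) & 1,
    }


def sahf_flags(sw):
    """80386 flags after FSTSW AX followed by SAHF, per Table 2-2."""
    fields = decode_status_word(sw)
    return {"CF": fields["C0"], "PF": fields["C2"], "ZF": fields["C3"]}


def decode_control_word(cw):
    """Split a 16-bit control word image into masks, PC, and RC."""
    return {
        "masked": [name for bit, name in enumerate(EXCEPTIONS) if (cw >> bit) & 1],
        "PC": PRECISION[(cw >> 8) & 0b11],
        "RC": ROUNDING[(cw >> 10) & 0b11],
    }


# Sample images (made up for the example).
print(decode_status_word(0x5984))
print(sahf_flags(0x5984))
print(decode_control_word(0x037F))
```

In the sample status word, TOP is 3, C3 and C0 are set, and the zero-divide flag is recorded; the sample control word masks all six exceptions and selects 64-bit precision with round to nearest.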
2.1.4 The NPX Tag Word

The tag word indicates the contents of each register in the register stack, as shown in Figure 2-4. The tag word is used by the NPX itself to distinguish between empty and nonempty register locations. Programmers of exception handlers may use this tag information to check the contents of a numeric register without performing complex decoding of the actual data in the register. The tag values from the tag word correspond to physical registers 0-7. Programmers must use the current top-of-stack (TOP) pointer stored in the NPX status word to associate these tag values with the relative stack registers ST(0) through ST(7).

The exact values of the tags are generated during execution of the FSTENV and FSAVE instructions according to the actual contents of the nonempty stack locations. During execution of other instructions, the 80387 updates the TW only to indicate whether a stack location is empty or nonempty.

2.1.5 The NPX Instruction and Data Pointers

The instruction and data pointers provide support for programmed exception-handlers. These registers are actually located in the 80386, but appear to be located in the 80387 because they are accessed by the ESC instructions FLDENV, FSTENV, FSAVE, and FRSTOR. Whenever the 80386 decodes an ESC instruction, it saves the instruction address, the operand address (if present), and the instruction opcode.

When stored in memory, the instruction and data pointers appear in one of four formats, depending on the operating mode of the 80386 (protected mode or real-address mode) and depending on the operand-size attribute in effect (32-bit operand or 16-bit operand). When the 80386 is in virtual-8086 mode, the real-address mode formats are used. Figures 2-5 through 2-8 show these pointers as they are stored following an FSTENV instruction.

Tag values: 00 = valid; 01 = zero; 10 = invalid or infinity; 11 = empty.

Figure 2-4.
80387 Tag Word Format

32-BIT PROTECTED MODE FORMAT

Offset   Contents (high-order field first)
0H       Reserved; control word
4H       Reserved; status word
8H       Reserved; tag word
CH       IP offset
10H      00000; opcode 10..0; CS selector
14H      Data operand offset
18H      Reserved; operand selector

Figure 2-5. Protected Mode 80387 Instruction and Data Pointer Image in Memory, 32-Bit Format

32-BIT REAL-ADDRESS MODE FORMAT

Offset   Contents (high-order field first)
0H       Reserved; control word
4H       Reserved; status word
8H       Reserved; tag word
CH       Reserved; instruction pointer 15..0
10H      0000; instruction pointer 31..16; 0; opcode 10..0
14H      Reserved; operand pointer 15..0
18H      0000; operand pointer 31..16; 0000 0000 0000

Figure 2-6. Real Mode 80387 Instruction and Data Pointer Image in Memory, 32-Bit Format

The FSTENV and FSAVE instructions store this data into memory, allowing exception handlers to determine the precise nature of any numeric exceptions that may be encountered.

The instruction address saved in the 80386 (as in the 80287) points to any prefixes that preceded the instruction. This is different from the 8087, for which the instruction address points only to the ESC instruction opcode.

Note that the processor control instructions FINIT, FLDCW, FSTCW, FSTSW, FCLEX, FSTENV, FLDENV, FSAVE, FRSTOR, and FWAIT do not affect the data pointer. Note also that, except for the instructions just mentioned, the value of the data pointer is undefined if the prior ESC instruction did not have a memory operand.

16-BIT PROTECTED MODE FORMAT

Offset   Contents
0H       Control word
2H       Status word
4H       Tag word
6H       IP offset
8H       CS selector
AH       Operand offset
CH       Operand selector

Figure 2-7. Protected Mode 80387 Instruction and Data Pointer Image in Memory, 16-Bit Format

16-BIT REAL-ADDRESS MODE AND VIRTUAL-8086 MODE FORMAT

Offset   Contents (high-order field first)
0H       Control word
2H       Status word
4H       Tag word
6H       Instruction pointer 15..0
8H       Instruction pointer 19..16; 0; opcode 10..0
AH       Operand pointer 15..0
CH       Operand pointer 19..16; 0000 0000 0000

Figure 2-8. Real Mode 80387 Instruction and Data Pointer Image in Memory, 16-Bit Format

2.2 COMPUTATION FUNDAMENTALS

This section covers 80387 programming concepts that are common to all applications. It describes the 80387's internal number system and the various types of numbers that can be employed in NPX programs. The most commonly used options for rounding and precision (selected by fields in the control word) are described, with exhaustive coverage of less frequently used facilities deferred to later sections. Exception conditions that may arise during execution of NPX instructions are also described along with the options that are available for responding to these exceptions.

2.2.1 Number System

The system of real numbers that people use for pencil and paper calculations is conceptually infinite and continuous. There is no upper or lower limit to the magnitude of the numbers one can employ in a calculation, or to the precision (number of significant digits) that the numbers can represent. When considering any real number, there are always arbitrarily many numbers both larger and smaller. There are also arbitrarily many numbers between (i.e., with more significant digits than) any two real numbers. For example, between 2.5 and 2.6 are 2.51, 2.5897, 2.500001, etc.

While ideally it would be desirable for a computer to be able to operate on the entire real number system, in practice this is not possible. Computers, no matter how large, ultimately have fixed-size registers and memories that limit the system of numbers that can be accommodated. These limitations determine both the range and the precision of numbers. The result is a set of numbers that is finite and discrete, rather than infinite and continuous. This sequence is a subset of the real numbers that is designed to form a useful approximation of the real number system.
Figure 2-9 superimposes the basic 80387 real number system on a real number line (decimal numbers are shown for clarity, although the 80387 actually represents numbers in binary). The dots indicate the subset of real numbers the 80387 can represent as data and final results of calculations. The 80387's range of double-precision, normalized numbers is approximately ±2.23 × 10^-308 to ±1.80 × 10^308. Applications that are required to deal with data and final results outside this range are rare. For reference, the range of the IBM System 370 is about ±0.54 × 10^-78 to ±0.72 × 10^76.

[Number line showing the negative normalized range (-1.80 × 10^308 to -2.23 × 10^-308) and the positive normalized range (+2.23 × 10^-308 to +1.80 × 10^308); an inset near 2 marks 1.99999999999999999 as not representable.]

Figure 2-9. 80387 Double-Precision Number System

The finite spacing in Figure 2-9 illustrates that the NPX can represent a great many, but not all, of the real numbers in its range. There is always a gap between two adjacent 80387 numbers, and it is possible for the result of a calculation to fall in this space. When this occurs, the NPX rounds the true result to a number that it can represent. Thus, a real number that requires more digits than the 80387 can accommodate (e.g., a 20-digit number) is represented with some loss of accuracy. Notice also that the 80387's representable numbers are not distributed evenly along the real number line. In fact, an equal number of representable numbers exists between successive powers of 2 (i.e., as many representable numbers exist between 2 and 4 as between 65,536 and 131,072). Therefore, the gaps between representable numbers are larger as the numbers increase in magnitude. All integers in the range ±2^64 (approximately ±1.8 × 10^19), however, are exactly representable.
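The uneven spacing described above can be observed from a high-level language. The sketch below is an illustration only: it uses Python floats, which are assumed here to be IEEE double format (53-bit significand), so the exact-integer limit it demonstrates is 2^53 rather than the 2^64 of the extended format. The helper name doubles_in_binade is hypothetical, and math.ulp requires Python 3.9 or later.

```python
import math


def doubles_in_binade(e):
    """Count the doubles in [2**e, 2**(e+1)) from the spacing at 2**e."""
    lo = 2.0 ** e
    hi = 2.0 ** (e + 1)
    return int((hi - lo) / math.ulp(lo))


# The gap between neighbours grows with magnitude ...
print(math.ulp(2.0) < math.ulp(65536.0))

# ... but each interval between successive powers of 2 holds the same count.
print(doubles_in_binade(1) == doubles_in_binade(16))

# A result that falls in a gap is rounded to a representable neighbour:
# 2**53 + 1 is not representable in double format.
print(2.0 ** 53 + 1 == 2.0 ** 53)
```

All three lines print True on a platform whose floats are IEEE doubles with round to nearest.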
In its internal operations, the 80387 actually employs a number system that is a substantial superset of that shown in Figure 2-9. The internal format (called extended real) extends the 80387's range to about ±3.30 × 10^-4932 to ±1.21 × 10^4932, and its precision to about 19 (equivalent decimal) digits. This format is designed to provide extra range and precision for constants and intermediate results, and is not normally intended for data or final results.

From a practical standpoint, the 80387's set of real numbers is sufficiently large and dense so as not to limit the vast majority of microprocessor applications. Compared to most computers, including mainframes, the NPX provides a very good approximation of the real number system. It is important to remember, however, that it is not an exact representation, and that arithmetic on real numbers is inherently approximate.

Conversely, and equally important, the 80387 does perform exact arithmetic on integer operands. That is, if an operation on two integers is valid and produces a result that is in range, the result is exact. For example, 4 ÷ 2 yields an exact integer, 1 ÷ 3 does not, and 2^40 × 2^30 + 1 does not, because the result requires greater than 64 bits of precision.

2.2.2 Data Types and Formats

The 80387 recognizes seven numeric data types for memory-based values, divided into three classes: binary integers, packed decimal integers, and binary reals. A later section describes how these formats are stored in memory (the sign is always located in the highest-addressed byte). Figure 2-10 summarizes the format of each data type. In the figure, the most significant digits of all numbers (and fields within numbers) are the leftmost digits.

2.2.2.1 BINARY INTEGERS

The three binary integer formats are identical except for length, which governs the range that can be accommodated in each format. The leftmost bit is interpreted as the number's sign: 0 = positive and 1 = negative.
Negative numbers are represented in standard two's complement notation (the binary integers are the only 80387 format to use two's complement). The quantity zero is represented with a positive sign (all bits are 0). The 80387 word integer format is identical to the 16-bit signed integer data type of the 80386; the 80387 short integer format is identical to the 32-bit signed integer data type of the 80386.

Data Format          Range       Precision   Layout (most significant byte first; highest addressed byte leftmost)
Word integer         10^4        16 bits     Bits 15-0, two's complement
Short integer        10^9        32 bits     Bits 31-0, two's complement
Long integer         10^19       64 bits     Bits 63-0, two's complement
Packed BCD           10^18       18 digits   S (bit 79); X (bits 78-72); magnitude d17 d16 ... d1 d0 (bits 71-0)
Single precision     10^±38      24 bits     S (bit 31); biased exponent (bits 30-23); significand (bits 22-0)
Double precision     10^±308     53 bits     S (bit 63); biased exponent (bits 62-52); significand (bits 51-0)
Extended precision   10^±4932    64 bits     S (bit 79); biased exponent (bits 78-64); I (bit 63); significand (bits 62-0)

NOTES:
(1) S = sign bit (0 = positive, 1 = negative)
(2) dn = decimal digit (two per byte)
(3) X = bits have no significance; 80387 ignores when loading, zeros when storing
(4) The implicit binary point lies immediately after the integer bit of the significand
(5) I = integer bit of significand; stored in temporary real, implicit in single and double precision
(6) Exponent bias (normalized values): single: 127 (7FH); double: 1023 (3FFH); extended real: 16383 (3FFFH)
(7) Packed BCD: (-1)^S (d17 ... d0)
(8) Real: (-1)^S (2^(E-BIAS)) (F0.F1...)

Figure 2-10. 80387 Data Formats

The binary integer formats exist in memory only. When used by the 80387, they are automatically converted to the 80-bit extended real format. All binary integers are exactly representable in the extended real format.
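The memory images of the three binary integer formats can be sketched in Python as follows. This is an illustration under stated assumptions: the helper names encode_integer and decode_integer and the FORMAT_BYTES table are hypothetical, and the byte order is little-endian so that the sign bit lands in the highest-addressed byte, as described above.

```python
# Sizes of the three binary integer formats, in bytes.
FORMAT_BYTES = {"word": 2, "short": 4, "long": 8}


def encode_integer(value, fmt):
    """Return the two's complement memory image, lowest address first."""
    return value.to_bytes(FORMAT_BYTES[fmt], "little", signed=True)


def decode_integer(image):
    """Recover the integer from a memory image produced above."""
    return int.from_bytes(image, "little", signed=True)


image = encode_integer(-2, "word")
print(image.hex())              # bytes in address order
print(image[-1] >> 7)           # sign bit, taken from the highest-addressed byte
print(decode_integer(image))    # round trip
```

A value that does not fit the chosen format (for example 32768 as a word integer) raises OverflowError in this sketch, which mirrors the fact that each format has a fixed range.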
2.2.2.2 DECIMAL INTEGERS

Decimal integers are stored in packed decimal notation, with two decimal digits "packed" into each byte, except the leftmost byte, which carries the sign bit (0 = positive, 1 = negative). Negative numbers are not stored in two's complement form and are distinguished from positive numbers only by the sign bit. The most significant digit of the number is the leftmost digit. All digits must be in the range 0-9.

The decimal integer format exists in memory only. When used by the 80387, it is automatically converted to the 80-bit extended real format. All decimal integers are exactly representable in the extended real format.

2.2.2.3 REAL NUMBERS

The 80387 represents real numbers of the form:

$$(-1)^s \; 2^E \; (b_0 . b_1 b_2 b_3 \ldots b_{p-1})$$

where

s = 0 or 1
E = any integer between Emin and Emax, inclusive
b_i = 0 or 1
p = number of bits of precision

Table 2-3 summarizes the parameters for each of the three real-number formats.

Table 2-3. Summary of Format Parameters

Parameter                 Single    Double    Extended
Format width in bits      32        64        80
p (bits of precision)     24        53        64
Exponent width in bits    8         11        15
Emax                      +127      +1023     +16383
Emin                      -126      -1022     -16382
Exponent bias             +127      +1023     +16383

The 80387 stores real numbers in a three-field binary format that resembles scientific, or exponential, notation. The format consists of the following fields:

• The number's significant digits are held in the significand field, $b_0 . b_1 b_2 b_3 \ldots b_{p-1}$. (The term "significand" is analogous to the term "mantissa" used to describe floating point numbers on some computers.)
• The exponent field, e = E + bias, locates the binary point within the significant digits (and therefore determines the number's magnitude). (The term "exponent" is analogous to the term "characteristic" used to describe floating point numbers on some computers.)
• The 1-bit sign field indicates whether the number is positive or negative. Negative numbers differ from positive numbers only in the sign bits of their significands.
Table 2-4 shows how the real number 178.125 (decimal) is stored in the 80387 single real format. The table lists a progression of equivalent notations that express the same value to show how a number can be converted from one form to another. (The ASM386 and PL/M-386 language translators perform a similar process when they encounter programmer-defined real number constants.) Note that not every decimal fraction has an exact binary equivalent. The decimal number 1/10, for example, cannot be expressed exactly in binary (just as the number 1/3 cannot be expressed exactly in decimal). When a translator encounters such a value, it produces a rounded binary approximation of the decimal value.

Table 2-4. Real Number Notation

Notation                               Value
Ordinary decimal                       178.125
Scientific decimal                     1.78125E2
Scientific binary                      1.0110010001E111
Scientific binary (biased exponent)    1.0110010001E10000110
80387 single format (normalized)       Sign: 0
                                       Biased exponent: 10000110
                                       Significand: 01100100010000000000000 (integer bit 1 is implicit)

The NPX usually carries the digits of the significand in normalized form. This means that, except for the value zero, the significand contains an integer bit and fraction bits as follows:

1.fff...ff

where "." indicates an assumed binary point. The number of fraction bits varies according to the real format: 23 for single, 52 for double, and 63 for extended real. By normalizing real numbers so that their integer bit is always a 1, the 80387 eliminates leading zeros in small values (|x| < 1). This technique maximizes the number of significant digits that can be accommodated in a significand of a given width. Note that, in the single and double formats, the integer bit is implicit and is not actually stored; the integer bit is physically present in the extended format only.

If one were to examine only the significand with its assumed binary point, all normalized real numbers would have values greater than or equal to 1 and less than 2.
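The last row of Table 2-4 can be reproduced with a short Python sketch. It relies on the standard struct module to obtain the 32-bit single-format image; the function name single_fields is hypothetical and used only for this illustration.

```python
import struct


def single_fields(x):
    """Return (sign, biased exponent bits, stored fraction bits) of x
    when x is converted to the 32-bit single real format."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF          # the integer bit is implicit, not stored
    return sign, format(biased_exponent, "08b"), format(fraction, "023b")


sign, exponent, fraction = single_fields(178.125)
print(sign, exponent, fraction)
print(int(exponent, 2) - 127)           # true exponent after removing the bias
```

For 178.125 the fields are 0, 10000110, and 01100100010000000000000, matching the table, and the true exponent is 7 (binary 111).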
The exponent field locates the actual binary point in the significant digits. Just as in decimal scientific notation, a positive exponent has the effect of moving the binary point to the right, and a negative exponent effectively moves the binary point to the left, inserting leading zeros as necessary. An unbiased exponent of zero indicates that the position of the assumed binary point is also the position of the actual binary point. The exponent field, then, determines a real number's magnitude. In order to simplify comparing real numbers (e.g., for sorting), the 80387 stores exponents in a biased form. This means that a constant is added to the true exponent described above. As Table 2-3 shows, the value of this bias is different for each real format. It has been chosen so as to force the biased exponent to be a positive value. This allows two real numbers (of the same format and sign) to be compared as if they are unsigned binary integers. That is, when comparing them bitwise from left to right (beginning with the leftmost exponent bit), the first bit position that differs orders the numbers; there is no need to proceed further with the comparison. A number's true exponent can be determined simply by subtracting the bias value of its format. The single and double real formats exist in memory only. If a number in one of these formats is loaded into an 80387 register, it is automatically converted to extended format, the format used for all internal operations. Likewise, data in registers can be converted to single or double real for storage in memory. The extended real format may be used in memory also, typically to store intermediate results that cannot be held in registers. Most applications should use the double format to store real-number data and results; it provides sufficient range and precision to return correct results with a minimum of programmer attention. 
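The effect of the biased exponent on comparisons, described earlier in this section, can be checked with a small Python sketch. It assumes Python floats are IEEE double format (bias 1023), and it only compares positive values, since the property holds for numbers of the same format and sign. The helper names double_bits and true_exponent are hypothetical.

```python
import struct


def double_bits(x):
    """Return the 64-bit double-format image of x as an unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def true_exponent(x):
    """Subtract the double-format bias (1023) from the stored exponent field."""
    return ((double_bits(x) >> 52) & 0x7FF) - 1023


positives = [1.5, 1e-300, 3.0e10, 0.75, 178.125, 1e300]

# Ordering by numeric value and ordering by raw bit pattern agree.
print(sorted(positives) == sorted(positives, key=double_bits))

print(true_exponent(178.125))
print(true_exponent(0.75))
```

Because the exponent field sits above the significand and is never negative in biased form, an unsigned integer comparison of the images orders the values correctly.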
The single real format is appropriate for applications that are constrained by memory, but it should be recognized that this format provides a smaller margin of safety. It is also useful for the debugging of algorithms, because roundoff problems will manifest themselves more quickly in this format. The extended real format should normally be reserved for holding intermediate results, loop accumulations, and constants. Its extra length is designed to shield final results from the effects of rounding and overflow/underflow in intermediate calculations. However, the range and precision of the double format are adequate for most microcomputer applications.

2.2.3 Rounding Control

Internally, the 80387 employs three extra bits (guard, round, and sticky bits) that enable it to round numbers in accord with the infinitely precise true result of a computation; these bits are not accessible to programmers. Whenever the destination can represent the infinitely precise true result, the 80387 delivers it. Rounding occurs in arithmetic and store operations when the format of the destination cannot exactly represent the infinitely precise true result. For example, a real number may be rounded if it is stored in a shorter real format, or in an integer format. Or, the infinitely precise true result may be rounded when it is returned to a register.

The NPX has four rounding modes, selectable by the RC field in the control word (see Figure 2-3). Given a true result b that cannot be represented by the target data type, the 80387 determines the two representable numbers a and c that most closely bracket b in value (a < b < c). The processor then rounds (changes) b to a or to c according to the mode selected by the RC field as shown in Table 2-5. Rounding introduces an error in a result that is less than one unit in the last place to which the result is rounded.
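The choice between a and c can be modeled exactly with rational arithmetic. The following Python sketch is an illustration, not a description of the 80387's internal logic: round_rc rounds a true result to an integer destination under each RC encoding, and round_to_precision applies the same rule to a significand of p bits, as happens when a value is stored in a shorter real format. Both function names are hypothetical, and the sketch assumes an unbounded exponent range (no overflow or underflow).

```python
from fractions import Fraction
import math


def round_rc(b, rc):
    """Round the true result b to an integer under RC encoding rc."""
    b = Fraction(b)
    a = math.floor(b)                 # nearest representable value below b
    c = math.ceil(b)                  # nearest representable value above b
    if a == c:
        return a                      # b is representable; no rounding needed
    if rc == 0b00:                    # round to nearest, ties to even
        if b - a != c - b:
            return a if b - a < c - b else c
        return a if a % 2 == 0 else c
    if rc == 0b01:                    # round down (toward minus infinity)
        return a
    if rc == 0b10:                    # round up (toward plus infinity)
        return c
    if rc == 0b11:                    # chop (toward zero)
        return a if abs(a) < abs(c) else c
    raise ValueError("RC is a 2-bit field")


def round_to_precision(b, p, rc=0b00):
    """Round b to p significant bits (exponent range assumed unbounded)."""
    b = Fraction(b)
    if b == 0:
        return b
    e = 0                             # find e with 2**e <= |b| < 2**(e+1)
    while abs(b) >= Fraction(2) ** (e + 1):
        e += 1
    while abs(b) < Fraction(2) ** e:
        e -= 1
    quantum = Fraction(2) ** (e - p + 1)   # value of one unit in the last place
    return round_rc(b / quantum, rc) * quantum


for rc in range(4):
    print(format(rc, "02b"), round_rc(Fraction(5, 2), rc), round_rc(Fraction(-5, 2), rc))

print(round_to_precision(Fraction(1, 10), 24))
```

For the tie case 5/2, round to nearest selects the even neighbor 2, while -5/2 chops to -2 and rounds down to -3. The last line shows the single-format (24-bit) approximation of 1/10 as an exact fraction.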
"Round to nearest" is the default mode and is suitable for most applications; it provides the most accurate and statistically unbiased estimate of the true result. • The "chop" or "round toward zero" mode is provided for integer arithmetic applications. • "Round up" and "round down" are termed directed rounding and can be used to implement interval arithmetic. Interval arithmetic generates a certifiable result independent of the occurrence of rounding and other errors. The upper and lower bounds of an interval may be computed by executing an algorithm twice, rounding up in one pass and down in the other. Rounding control affects only the arithmetic instructions (refer to Chapter 3 for lists of arithmetic and non arithmetic instructions). 2.2.4 Precision Control The 80387 allows results to be calculated with either 64, 53, or 24 bits of precision in the significand as selected by the precision control (PC) field of the control word. The default setting, and the one that is best suited for most applications, is the full 64 bits of significance provided by the extended real format. The other settings are required by the IEEE standard and are provided to obtain compatibility with the specifications of certain existing programming languages. Specifying less precision nullifies the advantages of the extended format's extended fraction length. When reduced precision is specified, the rounding of the fractional value clears the unused bits on the right to zeros. 2-16 80387 ARCHITECTURE Table 2-5. Rounding Modes RC Field Rounding Mode Rounding Action 00 Round to nearest 01 Round down (toward -00) a 10 Round up (toward +00) c 11 Chop (toward 0) Smaller in magnitude of a or c. NOTE: a < b< Closer to b of a or c; if equally close, select even number (the one whose least significant bit is zero). c; a and c are successive representable numbers; b is not representable. 
CHAPTER 3
SPECIAL COMPUTATIONAL SITUATIONS

Besides being able to represent positive and negative numbers, the 80387 data formats may be used to describe other entities. These special values provide extra flexibility, but most users will not need to understand them in order to use the 80387 successfully. This section describes the special values that may occur in certain cases and the significance of each. The 80387 exceptions are also described, for writers of exception handlers and for those interested in probing the limits of computation using the 80387.

The material presented in this section is mainly of interest to programmers concerned with writing exception handlers. Many readers will only need to skim this section.

When discussing these special computational situations, it is useful to distinguish between arithmetic instructions and nonarithmetic instructions. Nonarithmetic instructions are those that have no operands or transfer their operands without substantial change; arithmetic instructions are those that make significant changes to their operands. Table 3-1 defines these two classes of instructions.

3.1 SPECIAL NUMERIC VALUES

The 80387 data formats encompass encodings for a variety of special values in addition to the typical real or integer data values that result from normal calculations. These special values have significance and can express relevant information about the computations or operations that produced them. The various types of special values are

• Denormal real numbers
• Zeros
• Positive and negative infinity
• NaN (Not-a-Number)
• Indefinite
• Unsupported formats

The following sections explain the origins and significance of each of these special values. Tables 3-6 through 3-9 at the end of this section show how each of these special values is encoded for each of the numeric data types.
3.1.1 Denormal Real Numbers

The 80387 generally stores nonzero real numbers in normalized floating-point form; that is, the integer (leading) bit of the significand is always a one. (Refer to Chapter 2 for a review of operand formats.) This bit is explicitly stored in the extended format, and is implicitly assumed to be a one (1.) in the single and double formats. Since leading zeros are eliminated, normalized storage allows the maximum number of significant digits to be held in a significand of a given width.

Table 3-1. Arithmetic and Nonarithmetic Instructions

Arithmetic Instructions:
F2XM1, FADD(P), FBLD, FBSTP, FCOM(P)(P), FCOS, FDIV(R)(P), FIADD, FICOM(P), FIDIV(R), FILD, FIMUL, FIST(P), FISUB(R), FLD (conversion), FMUL(P), FPATAN, FPREM, FPREM1, FPTAN, FRNDINT, FSCALE, FSIN, FSINCOS, FSQRT, FST(P) (conversion), FSUB(R)(P), FTST, FUCOM(P)(P), FXTRACT, FYL2X, FYL2XP1

Nonarithmetic Instructions:
FABS, FCHS, FCLEX, FDECSTP, FFREE, FINCSTP, FINIT, FLD (register-to-register), FLD (extended format from memory), FLD constant, FLDCW, FLDENV, FNOP, FRSTOR, FSAVE, FST(P) (register-to-register), FSTP (extended format to memory), FSTCW, FSTENV, FSTSW, FWAIT, FXAM, FXCH

When a numeric value becomes very close to zero, normalized floating-point storage cannot be used to express the value accurately. The term tiny is used here to precisely define what values require special handling by the 80387. A number R is said to be tiny when $-2^{E_{min}} < R < 0$ or $0 < R < +2^{E_{min}}$. (As defined in Chapter 2, Emin is -126 for single format, -1022 for double format, and -16382 for extended format.) In other words, a nonzero number is tiny if its exponent would be too negative to store in the destination format.

To accommodate these instances, the 80387 can store and operate on reals that are not normalized, i.e., whose significands contain one or more leading zeros. Denormals typically arise when the result of a calculation yields a value that is tiny.
Denormal values have the following properties:

• The biased floating-point exponent is stored at its smallest value (zero)
• The integer bit of the significand (whether explicit or implicit) is zero

The leading zeros of denormals permit smaller numbers to be represented, at the possible cost of some lost precision (the number of significant bits is reduced by the leading zeros). In typical algorithms, extremely small values are most likely to be generated as intermediate, rather than final, results. By using the NPX's extended real format for holding intermediate values, quantities as small as ±3.4 × 10^-4932 can be represented; this makes the occurrence of denormal numbers a rare phenomenon in 80387 applications. Nevertheless, the NPX can load, store, and operate on denormalized real numbers when they do occur.

Denormals receive special treatment by the 80387 in three respects:

• The 80387 avoids creating denormals whenever possible. In other words, it always normalizes real numbers except in the case of tiny numbers.
• The 80387 provides the unmasked underflow exception to permit programmers to detect cases when denormals would be created.
• The 80387 provides the denormal exception to permit programmers to detect cases when denormals enter into further calculations.

Denormalizing means incrementing the true result's exponent and inserting a corresponding leading zero in the significand, shifting the rest of the significand one place to the right. Denormal values may occur in any of the single, double, or extended formats. Table 3-2 illustrates how a result might be denormalized to fit a single format destination.

Denormalization produces either a denormal or a zero. Denormals are readily identified by their exponents, which are always the minimum for their formats; in biased form, this is always the bit string: 00..00. This same exponent value is also assigned to the zeros, but a denormal has a nonzero significand.
A denormal in a register is tagged special. Tables 3-8 and 3-9 later in this chapter show how denormal values are encoded in each of the real data formats.

The denormalization process causes loss of significance if low-order one-bits are shifted off the right of the significand. In a severe case, all the significand bits of the true result are shifted out and replaced by the leading zeros. In this case, the result of denormalization is a true zero, and, if the value is in a register, it is tagged as a zero.

Table 3-2. Denormalization Process

Operation         Sign  Exponent  Significand
True Result       0     -129      1Δ01011100..00
Denormalize       0     -128      0Δ101011100..00
Denormalize       0     -127      0Δ0101011100..00
Denormalize       0     -126      0Δ00101011100..00
Denormal Result   0     -126      0Δ00101011100..00

Denormals are rarely encountered in most applications. Typical debugged algorithms generate extremely small results during the evaluation of intermediate subexpressions; the final result is usually of an appropriate magnitude for its single or double format real destination. If intermediate results are held in temporary real (extended) format, as is recommended, the great range of this format makes underflow very unlikely. Denormals are likely to arise only when an application generates a great many intermediates, so many that they cannot be held on the register stack or in extended format memory variables. If storage limitations force the use of single or double format reals for intermediates, and small values are produced, underflow may occur, and, if masked, may generate denormals.

When a denormal number in single or double format is used as a source operand and the denormal exception is masked, the 80387 automatically normalizes the number when it is converted to extended format.
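The shift-and-increment steps of Table 3-2 can be traced with a short Python sketch. This is a simplified model under stated assumptions: the significand is held as a 24-bit integer whose top bit is the integer bit, bits shifted out are simply dropped (the 80387 rounds according to the rounding mode), and the name denormalize is hypothetical.

```python
def denormalize(exponent, significand, emin=-126):
    """Shift right one place per step until the exponent reaches emin.

    Returns (exponent, significand, lost), where lost reports whether any
    one-bit was shifted off the right end (loss of significance).
    """
    lost = False
    while exponent < emin:
        lost = lost or bool(significand & 1)   # a one-bit is about to fall off
        significand >>= 1                      # insert a leading zero
        exponent += 1                          # compensate in the exponent
    return exponent, significand, lost

true_result = 0b101011100 << 15                # 1Δ01011100..00 as 24 bits
exp, sig, lost = denormalize(-129, true_result)
print(exp, format(sig, "024b"), lost)
```

Three steps take the exponent from -129 to -126 and leave two leading zeros after the (now zero) integer bit, matching the last row of the table. Starting from an exponent of -150 instead, every bit is shifted out and the result is a true zero.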
3.1.1.1 DENORMALS AND GRADUAL UNDERFLOW

Floating-point arithmetic cannot carry out all operations exactly for all operands; approximation is unavoidable when the exact result is not representable as a floating-point variable. To keep the approximation mathematically tractable, the hardware is made to conform to accuracy standards that can be modeled by certain inequalities instead of equations.

Let the assignment X ← Y @ Z (where @ is some operation) represent a typical operation. In the default rounding mode (round to nearest), each operation is carried out with an absolute error no larger than half the separation between the two floating-point numbers closest to the exact result. Let x be the value stored for the variable whose name in the program is X, and similarly y for Y, and z for Z. Normally y and z will differ by accumulated errors from what is desired and from what would have been obtained in the absence of error. For the calculation of x we assume that y and z are the best approximations available, and we seek to compute x as well as we can. If y@z is representable exactly, then we expect x = y@z, and that is what we get for every algebraic operation on the 80387 (i.e., when y@z is one of y+z, y-z, y×z, y÷z, √z). But if y@z must be approximated, as is usually the case, then x must differ from y@z by no more than half the difference between the two representable numbers that straddle y@z. That difference depends on two factors:

1. The precision to which the calculation is carried out, as determined either by the precision control bits or by the format used in memory. On the 80387, the precisions are single (24 significant bits), double (53 significant bits), and extended (64 significant bits).

2. How close y@z is to zero. In this respect the presence of denormal numbers on the 80387 provides a distinct advantage over systems that do not admit denormal numbers.
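The half-separation bound can be observed directly in any environment that implements round-to-nearest double arithmetic. The Python sketch below assumes that Python floats are IEEE double format (53 significant bits) and that the result is a normal number; the helper names spacing and rounding_error are hypothetical.

```python
import math
from fractions import Fraction

def spacing(x):
    """Separation between adjacent doubles in the binade of normal x."""
    _, e = math.frexp(x)                 # x = m * 2**e with 0.5 <= |m| < 1
    return Fraction(2) ** (e - 53)       # 53 significant bits in double format

def rounding_error(y, z):
    """Return (|x - (y+z)|, spacing at x) for the stored sum x, computed exactly."""
    x = y + z                                    # the rounded machine result
    exact = Fraction(y) + Fraction(z)            # the exact value of y@z
    return abs(Fraction(x) - exact), spacing(x)

err, gap = rounding_error(0.1, 0.2)
print(err <= gap / 2)        # the bound stated in the text
print(rounding_error(1.0, 2.0)[0] == 0)   # exactly representable: no error
```

Here y and z are the stored operands, so the check concerns only the error of the single addition, exactly as in the discussion of x, y, and z above.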
In any floating-point number system, the density of representable numbers is greater near zero than near the largest representable magnitudes. However, machines that do not use denormal numbers suffer from an enormous gap between zero and its closest neighbors. Figures 3-1 and 3-2 show what happens near zero in two kinds of floating-point number systems.

[Figure 3-1. Floating-Point System with Denormals: a number line starting at 0, showing the denormals followed by the normal numbers]

[Figure 3-2. Floating-Point System without Denormals: a number line starting at 0, showing a gap followed by the normal numbers]

Figure 3-1 shows a floating-point number system that (like the 80387) admits denormal numbers. For simplicity, only the non-negative numbers appear and the figure illustrates a number system that carries just four significant bits instead of the 24, 53, or 64 significant bits that the 80387 offers. Each vertical mark stands for a number representable in four significant bits, and the bolder marks stand for the normal powers of 2. The denormal numbers lie between 0 and the nearest normal power of 2. They are no less dense than the remaining normal nonzero numbers.

Figure 3-2 shows a floating-point number system that (unlike the 80387) does not admit denormal numbers. There are two yawning gaps, one on the positive side of zero (as illustrated) and one on the negative side of zero (not illustrated). The gap between zero and the nearest neighbor of zero differs from the gap between that neighbor and the next bigger number by a factor of about 8.4 × 10^6 for single, 4.5 × 10^15 for double, and 9.2 × 10^18 for extended format. Those gaps would horribly complicate error analysis.
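The three factors quoted above follow from the precisions alone. Without denormals, the gap from zero to the smallest normal number is 2^Emin, while the next gap is 2^(Emin-(p-1)) for a format with p significant bits, so the ratio is 2^(p-1). The Python sketch below works this out; the names PRECISION and gap_ratio are hypothetical, and the final check assumes Python floats are IEEE doubles with gradual underflow enabled.

```python
import sys

PRECISION = {"single": 24, "double": 53, "extended": 64}   # significant bits

def gap_ratio(fmt):
    """Ratio of (0 to smallest normal) to (smallest normal to its successor)."""
    return 2 ** (PRECISION[fmt] - 1)

for fmt in PRECISION:
    print(fmt, gap_ratio(fmt), "%.1e" % gap_ratio(fmt))

# Cross-check in double format: the successor gap is itself a denormal.
smallest_normal = sys.float_info.min                       # 2**-1022
successor = smallest_normal * (1 + sys.float_info.epsilon)
print(smallest_normal / (successor - smallest_normal) == gap_ratio("double"))
```

The cross-check works only because the difference between the two smallest normal numbers is representable as a denormal; on a system that flushed it to zero, the division would fail.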
The advantage of denormal numbers is apparent when one considers what happens in either case when the underflow exception is masked and y@z falls into the space between zero and the smallest normal magnitude. The 80387 returns the nearest denormal number. This action might be called "gradual underflow." The effect is no different than the rounding that can occur when y@z falls in the normal range. On the other hand, the system that does not have denormal numbers returns zero as the result, an action that can be much more inaccurate than rounding. This action could be called "abrupt underflow."

3.1.2 Zeros

The value zero in the real and decimal integer formats may be signed either positive or negative, although the sign of a binary integer zero is always positive. For computational purposes, the value of zero always behaves identically, regardless of sign, and typically the fact that a zero may be signed is transparent to the programmer. If necessary, the FXAM instruction may be used to determine a zero's sign.

If a zero is loaded or generated in a register, the register is tagged zero. Table 3-3 lists the results of instructions executed with zero operands and also shows how a zero may be created from nonzero operands.

Table 3-3. Zero Operands and Results

Operation        Operands                       Result
FLD, FBLD        +0                             +0
                 -0                             -0
FILD             +0                             +0
FST, FSTP        +0                             +0
                 -0                             -0
                 +X                             +0 (1)
                 -X                             -0 (1)
FBSTP            +0                             +0
                 -0                             -0
FIST, FISTP      +0                             +0
                 -0                             -0
                 +X                             +0 (3)
                 -X                             -0 (3)
Addition         +0 plus +0                     +0
                 -0 plus -0                     -0
                 +0 plus -0, -0 plus +0         ±0 (2)
                 -X plus +X, +X plus -X         ±0 (2)
                 ±0 plus ±X, ±X plus ±0         #X
Subtraction      +0 minus -0                    +0
                 -0 minus +0                    -0
                 +0 minus +0, -0 minus -0       ±0 (2)
                 +X minus +X, -X minus -X       ±0 (2)
                 ±0 minus ±X                    -#X
                 ±X minus ±0                    #X
Multiplication   +0 × +0, -0 × -0               +0
                 +0 × -0, -0 × +0               -0
                 +0 × +X, +X × +0               +0
                 +0 × -X, -X × +0               -0
                 -0 × +X, +X × -0               -0
                 -0 × -X, -X × -0               +0
                 +X × +Y, -X × -Y               +0 (1)
                 +X × -Y, -X × +Y               -0 (1)
Division         ±0 ÷ ±0                        Invalid Operation
                 ±X ÷ ±0                        ⊕∞ (Zero Divide)
                 +0 ÷ +X, -0 ÷ -X               +0
                 +0 ÷ -X, -0 ÷ +X               -0
                 -X ÷ -Y, +X ÷ +Y               +0 (1)
                 -X ÷ +Y, +X ÷ -Y               -0 (1)
FPREM, FPREM1    ±0 rem ±0                      Invalid Operation
                 ±X rem ±0                      Invalid Operation
                 +0 rem ±X                      +0
                 -0 rem ±X                      -0
FPREM            +X rem ±Y                      +0  (Y exactly divides X)
                 -X rem ±Y                      -0  (Y exactly divides X)
FPREM1           +X rem ±Y                      +0  (Y exactly divides X)
                 -X rem ±Y                      -0  (Y exactly divides X)

X and Y denote nonzero positive operands.
(1) When extreme underflow denormalizes the result to zero.
(2) Sign determined by rounding mode: + for nearest, up, or chop; - for down.
(3) When 0 < X < 1 and rounding mode is not up.
*   Sign of original zero operand.
#   Sign of original X operand.
-#  Complement of sign of original X operand.
⊕   Exclusive OR of the signs of the operands.

Table 3-3. Zero Operands and Results (Cont'd.)
Operation        Operands                       Result
FSQRT            +0                             +0
                 -0                             -0
Compare          ±0 : +X                        ±0 < +X
                 ±0 : ±0                        ±0 = ±0
                 ±0 : -X                        ±0 > -X
FTST             ±0                             ±0 = 0
FXAM             +0                             C3 = 1; C2 = C1 = C0 = 0
                 -0                             C3 = C1 = 1; C2 = C0 = 0
FCHS             +0                             -0
                 -0                             +0
FABS             ±0                             +0
F2XM1            +0                             +0
                 -0                             -0
FRNDINT          +0                             +0
                 -0                             -0
FSCALE           ±0 scaled by -∞                *0
                 ±0 scaled by +∞                Invalid Operation
                 ±0 scaled by X                 *0
FXTRACT          +0                             ST = +0, ST(1) = -∞, Zero divide
                 -0                             ST = -0, ST(1) = -∞, Zero divide
FPTAN            ±0                             *0
FSIN (or SIN result of FSINCOS)   ±0            *0
FCOS (or COS result of FSINCOS)   ±0            +1
FPATAN           ±0 ÷ +X                        *0
                 ±0 ÷ -X                        *π
                 ±X ÷ ±0                        #π/2
                 ±0 ÷ +0                        *0
                 ±0 ÷ -0                        *π
                 +∞ ÷ ±0                        +π/2
                 -∞ ÷ ±0                        -π/2
                 ±0 ÷ +∞                        *0
                 ±0 ÷ -∞                        *π
FYL2X            ±Y × log(±0)                   Zero Divide
                 ±0 × log(±0)                   Invalid Operation
FYL2XP1          +Y × log(±0+1)                 *0
                 -Y × log(±0+1)                 -*0

X and Y denote nonzero positive operands.
*   Sign of original zero operand.
#   Sign of original X operand.
-#  Complement of sign of original X operand.

3.1.3 Infinity

The real formats support signed representations of infinities. These values are encoded with a biased exponent of all ones and a significand of 1Δ00..00; if the infinity is in a register, it is tagged special.

A programmer may code an infinity, or it may be created by the NPX as its masked response to an overflow or a zero divide exception. Note that depending on rounding mode, the masked response may create the largest valid value representable in the destination rather than infinity.

The signs of the infinities are observed, and comparisons are possible. Infinities are always interpreted in the affine sense; that is, -∞ < (any finite number) < +∞. Arithmetic on infinities is always exact and, therefore, signals no exceptions, except for the invalid operations specified in Table 3-4.

Table 3-4. Infinity Operands and Results

Operation        Operands                       Result
Addition         +∞ plus +∞                     +∞
                 -∞ plus -∞                     -∞
                 +∞ plus -∞                     Invalid Operation
                 -∞ plus +∞                     Invalid Operation
                 ±∞ plus ±X                     *∞
                 ±X plus ±∞                     *∞
Subtraction      +∞ minus -∞                    +∞
                 -∞ minus +∞                    -∞
                 +∞ minus +∞                    Invalid Operation
                 -∞ minus -∞                    Invalid Operation
                 ±∞ minus ±X                    *∞
                 ±X minus ±∞                    -*∞
Multiplication   ±∞ × ±∞                        ⊕∞
                 ±∞ × ±Y, ±Y × ±∞               ⊕∞
                 ±0 × ±∞, ±∞ × ±0               Invalid Operation
Division         ±∞ ÷ ±∞                        Invalid Operation
                 ±∞ ÷ ±X                        ⊕∞
                 ±X ÷ ±∞                        ⊕0
                 ±∞ ÷ ±0                        ⊕∞
FSQRT            -∞                             Invalid Operation
                 +∞                             +∞
FPREM, FPREM1    ±∞ rem ±∞                      Invalid Operation
                 ±∞ rem ±X                      Invalid Operation
                 ±X rem ±∞                      $X, Q = 0
FRNDINT          ±∞                             *∞

X   Zero or nonzero positive operand.
Y   Nonzero positive operand.
*   Sign of original infinity operand.
-*  Complement of sign of original infinity operand.
$   Sign of original operand.
⊕   Exclusive OR of signs of operands.

Table 3-4. Infinity Operands and Results (Cont'd.)

Operation        Operands                       Result
FSCALE           ±∞ scaled by -∞                Invalid Operation
                 ±∞ scaled by +∞                *∞
                 ±∞ scaled by ±X                *∞
                 ±0 scaled by -∞                ±0 (1)
                 ±0 scaled by +∞                Invalid Operation
                 ±Y scaled by +∞                #∞
                 ±Y scaled by -∞                #0
FXTRACT          ±∞                             ST = *∞, ST(1) = +∞
Compare          +∞ : +∞                        +∞ = +∞
                 -∞ : -∞                        -∞ = -∞
                 +∞ : -∞                        +∞ > -∞
                 -∞ : +∞                        -∞ < +∞
                 +∞ : ±X                        +∞ > X
                 -∞ : ±X                        -∞ < X
                 ±X : +∞                        X < +∞
                 ±X : -∞                        X > -∞
FTST             +∞                             +∞ > 0
                 -∞                             -∞ < 0
FPATAN           ±∞ ÷ ±X                        *π/2
                 ±Y ÷ +∞                        #0
                 ±Y ÷ -∞                        #π
                 ±∞ ÷ +∞                        *π/4
                 ±∞ ÷ -∞                        *3π/4
                 ±∞ ÷ ±0                        *π/2
                 +0 ÷ +∞                        +0
                 +0 ÷ -∞                        +π
                 -0 ÷ +∞                        -0
                 -0 ÷ -∞                        -π
F2XM1            +∞                             +∞
                 -∞                             -1
FYL2X, FYL2XP1   ±∞ × log(1)                    Invalid Operation
                 ±∞ × log(Y > 1)                *∞
                 ±∞ × log(0 < Y < 1)            -*∞
                 ±Y × log(+∞)                   #∞
                 ±0 × log(+∞)                   Invalid Operation
                 ±Y × log(-∞)                   Invalid Operation

X   Zero or nonzero positive operand.
Y   Nonzero positive operand.
*   Sign of original infinity operand.
-*  Complement of sign of original infinity operand.
#   Sign of the original Y operand.
(1) Sign of original zero operand.

3.1.4 NaN (Not-a-Number)

A NaN (Not a Number) is a member of a class of special values that exists in the real formats only.
A NaN has an exponent of 11..11B, may have either sign, and may have any significand except 1Δ00..00B, which is assigned to the infinities. A NaN in a register is tagged special.

There are two classes of NaNs: signaling (SNaN) and quiet (QNaN). Among the QNaNs, the value real indefinite is of special interest.

3.1.4.1 SIGNALING NaNs

A signaling NaN is a NaN that has a zero as the most significant bit of its significand. The rest of the significand may be set to any value. The 80387 never generates a signaling NaN as a result; however, it recognizes signaling NaNs when they appear as operands. Arithmetic operations (as defined at the beginning of this chapter) on a signaling NaN cause an invalid-operation exception (except for load operations, FXCH, FCHS, and FABS).

By unmasking the invalid-operation exception, the programmer can use signaling NaNs to trap to the exception handler. The generality of this approach and the large number of NaN values that are available provide the sophisticated programmer with a tool that can be applied to a variety of special situations.

For example, a compiler could use signaling NaNs as references to uninitialized (real) array elements. The compiler could preinitialize each array element with a signaling NaN whose significand contained the index (relative position) of the element. If an application program attempted to access an element that it had not initialized, it would use the NaN placed there by the compiler. If the invalid-operation exception were unmasked, an interrupt would occur, and the exception handler would be invoked. The exception handler could determine which element had been accessed, since the operand address field of the exception pointers would point to the NaN, and the NaN would contain the index number of the array element.

3.1.4.2 QUIET NaNs

A quiet NaN is a NaN that has a one as the most significant bit of its significand.
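The two NaN classes defined above differ only in one bit, which a short Python sketch can make concrete. The sketch works on raw single-format bit patterns (so that no floating-point hardware touches the values); in single format the integer bit is implicit, so the most significant significand bit that is actually stored is bit 22. The names QUIET_BIT, classify_nan_bits, and quiet are hypothetical.

```python
QUIET_BIT = 1 << 22          # most significant stored significand bit (single)

def classify_nan_bits(bits):
    """Classify a 32-bit single-format pattern as QNaN, SNaN, or not a NaN."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent != 0xFF or fraction == 0:
        return "not a NaN"   # finite number, or an infinity (fraction 00..00)
    return "QNaN" if fraction & QUIET_BIT else "SNaN"

def quiet(bits):
    """Set the quiet bit, leaving the remaining significand bits unchanged."""
    return bits | QUIET_BIT

snan_with_index = 0x7F800005          # SNaN carrying the value 5 in its low bits
print(classify_nan_bits(snan_with_index))
print(hex(quiet(snan_with_index)), classify_nan_bits(quiet(snan_with_index)))
```

Because only one bit changes, any diagnostic value stored in the low-order bits (such as the array index in the compiler example above) survives the conversion.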
The 80387 creates the quiet NaN real indefinite (defined below) as its default response to certain exceptional conditions. The 80387 may derive other QNaNs by converting an SNaN. The 80387 converts an SNaN by setting the most significant bit of its significand to one, thereby generating a QNaN. The remaining bits of the significand are not changed; therefore, diagnostic information that may be stored in these bits of the SNaN is propagated into the QNaN.

The 80387 will generate the special QNaN, real indefinite, as its masked response to an invalid-operation exception. This NaN is signed negative; its significand is encoded 1Δ100..00. All other NaNs represent values created by programmers or derived from values created by programmers.

Both quiet and signaling NaNs are supported in all operations. A QNaN is generated as the masked response for invalid-operation exceptions and as the result of an operation in which at least one of the operands is a QNaN. The 80387 applies the rules shown in Table 3-5 when generating a QNaN.

Table 3-5. Rules for Generating QNaNs

Operation                                        Action
Real operation on an SNaN and a QNaN             Deliver the QNaN operand.
Real operation on two SNaNs                      Deliver the QNaN that results from converting the SNaN that has the larger significand.
Real operation on two QNaNs                      Deliver the QNaN that has the larger significand.
Real operation on an SNaN and another number     Deliver the QNaN that results from converting the SNaN.
Real operation on a QNaN and another number      Deliver the QNaN.
Invalid operation that does not involve NaNs     Deliver the default QNaN real indefinite.

Note that handling of a QNaN operand has greater priority than all exceptions except certain invalid-operation exceptions (refer to the section "Exception Priority" in this chapter).

Quiet NaNs could be used, for example, to speed up debugging. In its early testing phase, a program often contains multiple errors.
An exception handler could be written to save diagnostic information in memory whenever it was invoked. After storing the diagnostic data, it could supply a quiet NaN as the result of the erroneous instruction, and that NaN could point to its associated diagnostic area in memory. The program would then continue, creating a different NaN for each error. When the program ended, the NaN results could be used to access the diagnostic data saved at the time the errors occurred. Many errors could thus be diagnosed and corrected in one test run.

3.1.5 Indefinite

For every 80387 numeric data type, one unique encoding is reserved for representing the special value indefinite. The 80387 produces this encoding as its response to a masked invalid-operation exception.

In the case of reals, the indefinite value is a QNaN as discussed in the prior section.

Packed decimal indefinite may be stored by the NPX in an FBSTP instruction; attempting to use this encoding in an FBLD instruction, however, will have an undefined result; thus indefinite cannot be loaded from a packed decimal integer.

In the binary integers, the same encoding may represent either indefinite or the largest negative number supported by the format (-2^15, -2^31, or -2^63). The 80387 will store this encoding as its masked response to an invalid operation, or when the value in a source register represents or rounds to the largest negative integer representable by the destination. In situations where its origin may be ambiguous, the invalid-operation exception flag can be examined to see if the value was produced by an exception response. When this encoding is loaded or used by an integer arithmetic or compare operation, it is always interpreted as a negative number; thus indefinite cannot be loaded from a binary integer.

3.1.6 Encoding of Data Types

Tables 3-6 through 3-9 show how each of the special values just described is encoded for each of the numeric data types.
In these tables, the least-significant bits are shown to the right and are stored in the lowest memory addresses. The sign bit is always the left-most bit of the highest-addressed byte.

3.1.7 Unsupported Formats

The extended format permits many bit patterns that do not fall into any of the previously mentioned categories. Some of these encodings were supported by the 80287 NPX; however, most of them are not supported by the 80387 NPX. These changes are required due to changes made in the final version of the IEEE 754 standard that eliminated these data types.

The categories of encodings formerly known as pseudozeros, pseudo-NaNs, pseudoinfinities, and unnormal numbers are not supported by the 80387. The 80387 raises the invalid-operation exception when they are encountered as operands.

The encodings formerly known as pseudodenormal numbers are not generated by the 80387; however, they are correctly utilized when encountered in operands to 80387 instructions. The exponent is treated as if it were 00..01 and the mantissa is unchanged. The denormal exception is raised.

Table 3-6. Binary Integer Encodings

Class                              Sign  Magnitude
Positive  (Largest)                0     11..11
             :                     :       :
          (Smallest)               0     00..01
Zero                               0     00..00
Negative  (Smallest)               1     11..11
             :                     :       :
          (Largest/Indefinite*)    1     00..00

Magnitude width: Word: 15 bits; Short: 31 bits; Long: 63 bits.

* If this encoding is used as a source operand (as in an integer load or integer arithmetic instruction), the 80387 interprets it as the largest negative number representable in the format: -2^15, -2^31, or -2^63. The 80387 delivers this encoding to an integer destination in two cases:
  1. If the result is the largest negative number.
  2. As the response to a masked invalid-operation exception, in which case it represents the special value integer indefinite.

Table 3-7. Packed Decimal Encodings

Class                   Sign  (7 bits)  digit  digit  digit    digit  ...  digit
Positive  (Largest)     0     0000000   1001   1001   1001     1001   ...  1001
             :
          (Smallest)    0     0000000   0000   0000   0000     0000   ...  0001
Zero                    0     0000000   0000   0000   0000     0000   ...  0000
Zero                    1     0000000   0000   0000   0000     0000   ...  0000
Negative  (Smallest)    1     0000000   0000   0000   0000     0000   ...  0001
             :
          (Largest)     1     0000000   1001   1001   1001     1001   ...  1001
Indefinite*             1     1111111   1111   1111   UUUU**   UUUU   ...  UUUU

Field widths: sign and unused bits, 1 byte; digits, 9 bytes.

*  The packed decimal indefinite is stored by FBSTP in response to a masked invalid-operation exception. Attempting to load this value via FBLD produces an undefined result.
** UUUU means bit values are undefined and may contain any value.

Table 3-8. Single and Double Real Encodings

Class                             Sign  Biased Exponent  Significand (ff..ff)*
Positive  NaNs    Quiet           0     11..11           11..11
                                  through
                                  0     11..11           10..00
                  Signaling       0     11..11           01..11
                                  through
                                  0     11..11           00..01
          Infinity                0     11..11           00..00
          Reals   Normals         0     11..10           11..11
                                  through
                                  0     00..01           00..00
                  Denormals       0     00..00           11..11
                                  through
                                  0     00..00           00..01
                  Zero            0     00..00           00..00
Negative  Reals   Zero            1     00..00           00..00
                  Denormals       1     00..00           00..01
                                  through
                                  1     00..00           11..11
                  Normals         1     00..01           00..00
                                  through
                                  1     11..10           11..11
          Infinity                1     11..11           00..00
          NaNs    Signaling       1     11..11           00..01
                                  through
                                  1     11..11           01..11
                  Quiet           1     11..11           10..00   (Indefinite)
                                  through
                                  1     11..11           11..11

Field widths: Single: 8-bit exponent, 23-bit significand; Double: 11-bit exponent, 52-bit significand.

* Integer bit is implied and not stored.

Table 3-9. Extended Real Encodings

Class                                      Sign  Biased Exponent  Significand (i.ff..ff)
Positive  NaNs    Quiet                    0     11..11           1 11..11
                                           through
                                           0     11..11           1 10..00
                  Signaling                0     11..11           1 01..11
                                           through
                                           0     11..11           1 00..01
          Infinity                         0     11..11           1 00..00
          Reals   Normals                  0     11..10           1 11..11
                                           through
                                           0     00..01           1 00..00
                  Unsupported 8087         0     11..10           0 11..11
                  Unnormals                through
                                           0     00..01           0 00..00
                  Pseudodenormals          0     00..00           1 11..11
                                           through
                                           0     00..00           1 00..00
                  Denormals                0     00..00           0 11..11
                                           through
                                           0     00..00           0 00..01
                  Zero                     0     00..00           0 00..00
Negative  Reals   Zero                     1     00..00           0 00..00
                  Denormals                1     00..00           0 00..01
                                           through
                                           1     00..00           0 11..11
                  Pseudodenormals          1     00..00           1 00..00
                                           through
                                           1     00..00           1 11..11
                  Unsupported 8087         1     00..01           0 00..00
                  Unnormals                through
                                           1     11..10           0 11..11
                  Normals                  1     00..01           1 00..00
                                           through
                                           1     11..10           1 11..11
          Infinity                         1     11..11           1 00..00
          NaNs    Signaling                1     11..11           1 00..01
                                           through
                                           1     11..11           1 01..11
                  Quiet                    1     11..11           1 10..00   (Indefinite)
                                           through
                                           1     11..11           1 11..11

Field widths: 15-bit exponent, 64-bit significand.

3.2 NUMERIC EXCEPTIONS

The 80387 can recognize six classes of numeric exception conditions while executing numeric instructions:

1. I - Invalid operation
   • Stack fault
   • IEEE standard invalid operation
2. Z - Divide-by-zero
3. D - Denormalized operand
4. O - Numeric overflow
5. U - Numeric underflow
6. P - Inexact result (precision)

3.2.1 Handling Numeric Exceptions

When numeric exceptions occur, the NPX takes one of two possible courses of action:

• The NPX can itself handle the exception, producing the most reasonable result and allowing numeric program execution to continue undisturbed.
• A software exception handler can be invoked by the CPU to handle the exception.

Each of the six exception conditions described above has a corresponding flag bit in the 80387 status word and a mask bit in the 80387 control word. If an exception is masked (the corresponding mask bit in the control word = 1), the 80387 takes an appropriate default action and continues with the computation. If the exception is unmasked (mask = 0), the 80387 asserts the ERROR# output to the 80386 to signal the exception and invoke a software exception handler.
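The flag-bit and mask-bit pairing just described can be sketched in Python. The bit positions are an assumption taken from the standard x87 register layout rather than from this section: bits 0 through 5 of both the status word and the control word correspond to I, D, Z, O, U, and P, in that order. The names EXCEPTIONS and pending_exceptions are hypothetical.

```python
# Assumed layout: bit i of the status word is the flag, and bit i of the
# control word is the mask, for the i-th exception in this list.
EXCEPTIONS = ["I", "D", "Z", "O", "U", "P"]

def pending_exceptions(status_word, control_word):
    """Return (flagged, unmasked) lists of exception letters.

    flagged:  exceptions whose flag bit is set in the status word.
    unmasked: the flagged exceptions whose mask bit is 0, i.e. those that
              would invoke a software exception handler.
    """
    flagged = [name for i, name in enumerate(EXCEPTIONS)
               if (status_word >> i) & 1]
    unmasked = [name for i, name in enumerate(EXCEPTIONS)
                if (status_word >> i) & 1 and not (control_word >> i) & 1]
    return flagged, unmasked

# 0x037F: all six exceptions masked (low six bits set).
print(pending_exceptions(0x0004, 0x037F))   # zero divide flagged, handled by the NPX
# 0x037E: same, but with the invalid-operation mask cleared.
print(pending_exceptions(0x0005, 0x037E))   # I and Z flagged; I would trap
```

A flag that is set while its mask bit is 1 merely records that the default response was taken; only the unmasked list corresponds to an ERROR# signal.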
Note that when exceptions are masked, the NPX may detect multiple exceptions in a single instruction, because it continues executing the instruction after performing its masked response. For example, the 80387 could detect a denormalized operand, perform its masked response to this exception, and then detect an underflow.

3.2.1.1 AUTOMATIC EXCEPTION HANDLING

The 80387 NPX has a default fix-up activity for every possible exception condition it may encounter. These masked-exception responses are designed to be safe and are generally acceptable for most numeric applications.

As an example of how even severe exceptions can be handled safely and automatically using the NPX's default exception responses, consider a calculation of the parallel resistance of several values using only the standard formula (Figure 3-3). If R1 becomes zero, the circuit resistance becomes zero. With the divide-by-zero and precision exceptions masked, the 80387 NPX will produce the correct result.

[Figure 3-3. Arithmetic Example Using Infinity: three resistors R1, R2, R3 in parallel; equivalent resistance = 1 / (1/R1 + 1/R2 + 1/R3)]

By masking or unmasking specific numeric exceptions in the NPX control word, NPX programmers can delegate responsibility for most exceptions to the NPX, reserving the most severe exceptions for programmed exception handlers. Exception-handling software is often difficult to write, and the NPX's masked responses have been tailored to deliver the most reasonable result for each condition. For the majority of applications, masking all exceptions other than invalid-operation yields satisfactory results with the least programming effort. An invalid-operation exception normally indicates a program error that must be corrected; this exception should not normally be masked.

The exception flags in the NPX status word provide a cumulative record of exceptions that have occurred since these flags were last cleared.
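The parallel-resistance example of Figure 3-3, together with the cumulative exception record, can be imitated in a toy Python model. This is only a sketch under stated assumptions: Python itself raises an error on division by zero, so the class below (hypothetical name MaskedNPX) emulates just the masked zero-divide response, returning an infinity signed with the exclusive OR of the operand signs and recording a sticky "Z" flag. Cases such as 0 ÷ 0 or NaN operands are outside the sketch.

```python
import math

class MaskedNPX:
    """Toy model: zero divide is masked and its flag is cumulative."""

    def __init__(self):
        self.flags = set()                    # sticky exception record

    def divide(self, a, b):
        if b == 0.0 and a != 0.0 and not math.isnan(a):
            if math.isfinite(a):
                self.flags.add("Z")           # finite nonzero / zero: zero divide
            sign = math.copysign(1.0, a) * math.copysign(1.0, b)
            return math.copysign(math.inf, sign)
        return a / b                          # ordinary division

def parallel(npx, *resistances):
    """Equivalent resistance 1 / (1/R1 + 1/R2 + ...)."""
    total = sum(npx.divide(1.0, r) for r in resistances)
    return npx.divide(1.0, total)

npx = MaskedNPX()
print(parallel(npx, 0.0, 10.0, 20.0))   # R1 = 0 gives zero circuit resistance
print(npx.flags)                        # the zero divide is still on record
```

The reciprocal of the zero resistance becomes +infinity, the sum stays infinite, and the final reciprocal is zero, which is the physically correct answer; the recorded flag lets the program discover afterwards that a masked response occurred.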
Once set, these flags can be cleared only by executing the FCLEX (clear exceptions) instruction, by reinitializing the NPX, or by overwriting the flags with an FRSTOR or FLDENV instruction. This allows a programmer to mask all exceptions (except invalid operation), run a calculation, and then inspect the status word to see if any exceptions were detected at any point in the calculation.

3.2.1.2 SOFTWARE EXCEPTION HANDLING

If the NPX encounters an unmasked exception condition, it signals the exception to the 80386 CPU using the ERROR# status line between the two processors.

The next time the 80386 CPU encounters a WAIT or ESC instruction in its instruction stream, the 80386 will detect the active condition of the ERROR# status line and automatically trap to an exception response routine using interrupt #16, the "processor extension error" exception.

This exception response routine is normally a part of the systems software. Typical exception responses may include:

• Incrementing an exception counter for later display or printing
• Printing or displaying diagnostic information (e.g., the 80387 environment and registers)
• Aborting further execution
• Using the exception pointers to build an instruction that will run without exception and executing it

For 80386 systems having systems software support for the 80387 NPX, applications programmers should consult the operating system's reference manuals for the appropriate system response to NPX exceptions. For systems programmers, specific details on writing software exception handlers are included in Chapter 6.

3.2.2 Invalid Operation

This exception may occur in response to two general classes of operations:

1. Stack operations
2. Arithmetic operations

The stack flag (SF) of the status word indicates which class of operation caused the exception.
When SF is 1, a stack operation has resulted in stack overflow or underflow; when SF is 0, an arithmetic instruction has encountered an invalid operand.

3.2.2.1 STACK EXCEPTION

When SF is 1, indicating a stack operation, the O/U# bit of the condition code (bit C1) distinguishes between stack overflow and underflow as follows:

O/U# = 1   Stack overflow: an instruction attempted to push down a nonempty stack location.
O/U# = 0   Stack underflow: an instruction attempted to read an operand from an empty stack location.

When the invalid-operation exception is masked, the 80387 returns the QNaN indefinite. This value overwrites the destination register, destroying its original contents.

When the invalid-operation exception is not masked, the 80386 exception "processor extension error" is triggered. TOP is not changed, and the source operands remain unaffected.

3.2.2.2 INVALID ARITHMETIC OPERATION

This class includes the invalid operations defined in IEEE Std 754. The 80387 reports an invalid operation in any of the cases shown in Table 3-10. Also shown in this table are the 80387's responses when the invalid exception is masked. When unmasked, the 80386 exception "processor extension error" is triggered, and the operands remain unaltered. An invalid operation generally indicates a program error.

Table 3-10. Masked Responses to Invalid Operations

Condition: Any arithmetic operation on an unsupported format.
  Masked response: Return the QNaN indefinite.
Condition: Any arithmetic operation on a signaling NaN.
  Masked response: Return a QNaN (refer to the section "Rules for Generating QNaNs").
Condition: Compare and test operations: one or both operands is a NaN.
  Masked response: Set condition codes "not comparable."
Condition: Addition of opposite-signed infinities or subtraction of like-signed infinities.
  Masked response: Return the QNaN indefinite.
Condition: Multiplication: ∞ × 0; or 0 × ∞.
  Masked response: Return the QNaN indefinite.
Condition: Division: ∞ ÷ ∞; or 0 ÷ 0.
  Masked response: Return the QNaN indefinite.
Condition: Remainder instructions FPREM, FPREM1 when modulus (divisor) is zero or dividend is ∞.
  Masked response: Return the QNaN indefinite; set C2.
Condition: Trigonometric instructions FCOS, FPTAN, FSIN, FSINCOS when argument is ∞.
  Masked response: Return the QNaN indefinite; set C2.
Condition: FSQRT of negative operand (except FSQRT(-0) = -0), FYL2X of negative operand (except FYL2X(-0) = -∞), FYL2XP1 of operand more negative than -1.
  Masked response: Return the QNaN indefinite.
Condition: FIST(P) instructions when source register is empty, a NaN, ∞, or exceeds representable range of destination.
  Masked response: Store integer indefinite.
Condition: FBSTP instruction when source register is empty, a NaN, ∞, or exceeds 18 decimal digits.
  Masked response: Store packed decimal indefinite.
Condition: FXCH instruction when one or both registers are tagged empty.
  Masked response: Change empty registers to the QNaN indefinite and then perform exchange.

3.2.3 Division by Zero

If an instruction attempts to divide a finite nonzero operand by zero, the 80387 will report a zero-divide exception. This is possible for F(I)DIV(R)(P) as well as the other instructions that perform division internally: FYL2X and FXTRACT. The masked response for FDIV and FYL2X is to return an infinity signed with the exclusive OR of the signs of the operands. For FXTRACT, ST(1) is set to -∞; ST is set to zero with the same sign as the original operand. If the divide-by-zero exception is unmasked, the 80386 exception "processor extension error" is triggered; the operands remain unaltered.

3.2.4 Denormal Operand

If an arithmetic instruction attempts to operate on a denormal operand, the NPX reports the denormal-operand exception. Denormal operands may have reduced significance due to lost low-order bits; therefore, it may be advisable in certain applications to preclude operations on these operands. This can be accomplished by an exception handler that responds to unmasked denormal exceptions.
Most users will mask this exception so that computation may proceed; any loss of accuracy will be analyzed by the user when the final result is delivered.

When this exception is masked, the 80387 sets the D-bit in the status word, then proceeds with the instruction. Gradual underflow and denormal numbers as handled on the 80387 will produce results at least as good as, and often better than, what could be obtained from a machine that flushes underflows to zero. In fact, a denormal operand in single- or double-precision format will be normalized to the extended-real format when loaded into the 80387. Subsequent operations will benefit from the additional precision of the extended-real format used internally.

When this exception is not masked, the D-bit is set and the exception handler is invoked. The operands are not changed by the instruction and are available for inspection by the exception handler.

If an 8087/80287 program uses the denormal exception to automatically normalize denormal operands, then that program can run on an 80387 by masking the denormal exception. The 8087/80287 denormal exception handler would not be used by the 80387 in this case. A numerics program runs faster when the 80387 performs normalization of denormal operands. A program can detect at run-time whether it is running on an 80387 or 8087/80287 and disable the denormal exception when an 80387 is used. The following code sequence is recommended to distinguish between an 80387 and an 8087/80287.

        FINIT                   ; Use default infinity mode:
                                ; projective for 8087/80287, affine for 80387
        FLD1                    ; Generate infinity
        FLDZ
        FDIV
        FLD     ST              ; Form negative infinity
        FCHS
        FCOMPP                  ; Compare +infinity with -infinity
        FSTSW   temp            ; 8087/80287 will say they are equal
        MOV     AX, temp
        SAHF
        JNZ     Using_80387

The denormal-operand exception of the 80387 permits emulation of arithmetic on unnormal operands as provided by the 8087/80287.
The standard does not require the denormal exception nor does it recognize the unnormal data type.

3.2.5 Numeric Overflow and Underflow

If the exponent of a numeric result is too large for the destination real format, the 80387 signals a numeric overflow. Conversely, if the exponent of a result is too small to be represented in the destination format, a numeric underflow is signaled. If either of these exceptions occurs, the result of the operation is outside the range of the destination real format.

Typical algorithms are most likely to produce extremely large and small numbers in the calculation of intermediate, rather than final, results. Because of the great range of the extended-precision format (recommended as the destination format for intermediates), overflow and underflow are relatively rare events in most 80387 applications.

3.2.5.1 OVERFLOW

The overflow exception can occur whenever the rounded true result would exceed in magnitude the largest finite number in the destination format. The exception can occur in the execution of most of the arithmetic instructions and in some of the conversion instructions; namely, FST(P), F(I)ADD(P), F(I)SUB(R)(P), F(I)MUL(P), FDIV(R)(P), FSCALE, FYL2X, and FYL2XP1.

The response to an overflow condition depends on whether the overflow exception is masked:

• Overflow exception masked. The value returned depends on the rounding mode as Table 3-11 illustrates.

Table 3-11. Masked Overflow Results

  Rounding Mode   Sign of True Result   Result
  To nearest      +                     +∞
                  -                     -∞
  Toward -∞       +                     Largest finite positive number
                  -                     -∞
  Toward +∞       +                     +∞
                  -                     Largest finite negative number
  Toward zero     +                     Largest finite positive number
                  -                     Largest finite negative number

• Overflow exception not masked. The unmasked response depends on whether the instruction is supposed to store the result on the stack or in memory:

  Destination is the stack. The true result is divided by 2^24,576 and rounded.
(The bias 24,576 is equal to 3 × 2^13.) The significand is rounded to the appropriate precision (according to the precision control (PC) bit of the control word, for those instructions controlled by PC, otherwise to extended precision). The roundup bit (C1) of the status word is set if the significand was rounded upward. The biasing of the exponent by 24,576 normally translates the number as nearly as possible to the middle of the exponent range so that, if desired, it can be used in subsequent scaled operations with less risk of causing further exceptions. With the instruction FSCALE, however, it can happen that the result is too large and overflows even after biasing. In this case, the unmasked response is exactly the same as the masked round-to-nearest response, namely ± infinity. The intention of this feature is to ensure the trap handler will discover that a translation of the exponent by -24,576 would not work correctly without obliging the programmer of Decimal-to-Binary or Exponential functions to determine which trap handler, if any, should be invoked.

  Destination is memory (this can occur only with the store instructions). No result is stored in memory. Instead, the operand is left intact in the stack. Because the data in the stack is in extended-precision format, the exception handler has the option either of reexecuting the store instruction after proper adjustment of the operand or of rounding the significand on the stack to the destination's precision as the standard requires. The exception handler should ultimately store a value into the destination location in memory if the program is to continue.

3.2.5.2 UNDERFLOW

Underflow can occur in the execution of the instructions FST(P), FADD(P), FSUB(RP), FMUL(P), F(I)DIV(RP), FSCALE, FPREM(1), FPTAN, FSIN, FCOS, FSINCOS, FPATAN, F2XM1, FYL2X, and FYL2XP1.

Two related events contribute to underflow:
1.
Creation of a tiny result which, because it is so small, may cause some other exception later (such as overflow upon division).
2. Creation of an inexact result; i.e., the delivered result differs from what would have been computed were both the exponent range and precision unbounded.

Which of these events triggers the underflow exception depends on whether the underflow exception is masked:
1. Underflow exception masked. The underflow exception is signaled when the result is both tiny and inexact.
2. Underflow exception not masked. The underflow exception is signaled when the result is tiny, regardless of inexactness.

The response to an underflow exception also depends on whether the exception is masked:
1. Masked response. The result is denormal or zero. The precision exception is also triggered.
2. Unmasked response. The unmasked response depends on whether the instruction is supposed to store the result on the stack or in memory:
• Destination is the stack. The true result is multiplied by 2^24,576 and rounded. (The bias 24,576 is equal to 3 × 2^13.) The significand is rounded to the appropriate precision (according to the precision control (PC) bit of the control word, for those instructions controlled by PC, otherwise to extended precision). The roundup bit (C1) of the status word is set if the significand was rounded upward. The biasing of the exponent by 24,576 normally translates the number as nearly as possible to the middle of the exponent range so that, if desired, it can be used in subsequent scaled operations with less risk of causing further exceptions. With the instruction FSCALE, however, it can happen that the result is too tiny and underflows even after biasing.
In this case, the unmasked response is exactly the same as the masked round-to-nearest response, namely ±0. The intention of this feature is to ensure the trap handler will discover that a translation by +24,576 would not work correctly without obliging the programmer of Decimal-to-Binary or Exponential functions to determine which trap handler, if any, should be invoked.
• Destination is memory (this can occur only with the store instructions). No result is stored in memory. Instead, the operand is left intact in the stack. Because the data in the stack is in extended-precision format, the exception handler has the option either of reexecuting the store instruction after proper adjustment of the operand or of rounding the significand on the stack to the destination's precision as the standard requires. The exception handler should ultimately store a value into the destination location in memory if the program is to continue.

3.2.6 Inexact (Precision)

This exception condition occurs if the result of an operation is not exactly representable in the destination format. For example, the fraction 1/3 cannot be precisely represented in binary form. This exception occurs frequently and indicates that some (generally acceptable) accuracy has been lost.

All the transcendental instructions are inexact by definition; they always cause the inexact exception.

The C1 (roundup) bit of the status word indicates whether the inexact result was rounded up (C1 = 1) or chopped (C1 = 0).

The inexact exception accompanies the underflow exception when there is also a loss of accuracy. When underflow is masked, the underflow exception is signaled only when there is a loss of accuracy; therefore the precision flag is always set as well. When underflow is unmasked, there may or may not have been a loss of accuracy; the precision bit indicates which is the case.

This exception is provided for applications that need to perform exact arithmetic only.
Most applications will mask this exception. The 80387 delivers the rounded or over/underflowed result to the destination, regardless of whether a trap occurs.

3.2.7 Exception Priority

The 80387 deals with exceptions according to a predetermined precedence. Precedence in exception handling means that higher-priority exceptions are flagged and results are delivered according to the requirements of that exception. Lower-priority exceptions may not be flagged even if they occur. For example, dividing an SNaN by zero causes an invalid-operand exception (due to the SNaN) and not a zero-divide exception; the masked result is the QNaN real indefinite, not ∞. A denormal or inexact (precision) exception, however, can accompany a numeric underflow or overflow exception.

The exception precedence is as follows:
1. Invalid operation exception, subdivided as follows:
   a. Stack underflow.
   b. Stack overflow.
   c. Operand of unsupported format.
   d. SNaN operand.
2. QNaN operand. Though this is not an exception, if one operand is a QNaN, dealing with it has precedence over lower-priority exceptions. For example, a QNaN divided by zero results in a QNaN, not a zero-divide exception.
3. Any other invalid-operation exception not mentioned above or zero divide.
4. Denormal operand. If masked, then instruction execution continues, and a lower-priority exception can occur as well.
5. Numeric overflow and underflow. Inexact result (precision) can be flagged as well.
6. Inexact result (precision).

3.2.8 Standard Underflow/Overflow Exception Handler

As long as the underflow and overflow exceptions are masked, no additional software is required to cause the output of the 80387 to conform to the requirements of IEEE Std 754. When unmasked, these exceptions give the exception handler an additional option in the case of store instructions. No result is stored in memory; instead, the operand is left intact on the stack.
The handler may round the significand of the operand on the stack to the destination's precision as the standard requires, or it may adjust the operand and reexecute the faulting instruction.

CHAPTER 4
THE 80387 INSTRUCTION SET

This chapter describes the operation of all 80387 instructions. Within this section, the instructions are divided into six functional classes:
• Data Transfer instructions
• Nontranscendental instructions
• Comparison instructions
• Transcendental instructions
• Constant instructions
• Processor Control instructions

Throughout this chapter, the instruction set is described as it appears to the ASM386 programmer who is coding a program. Not included in this chapter are details of instruction format, encoding, and execution times. This detailed information may be found in Appendix A and Appendix E. Refer also to Appendix B for a summary of the exceptions caused by each instruction.

4.1 COMPATIBILITY WITH THE 80287 AND 8087

The instruction set for the 80387 NPX is largely the same as that for the 80287 NPX (used with 80286 systems) and that for the 8087 NPX (used with 8086 and 8088 systems). Most object programs generated for the 80287 or 8087 will execute without change on the 80387. Several instructions are new to the 80387, and several 80287 and 8087 instructions perform no useful function on the 80387. Appendix C and Appendix D give details of these instruction set differences.

4.2 NUMERIC OPERANDS

The typical NPX instruction accepts one or two operands as inputs, operates on these, and produces a result as an output. An operand is most often the contents of a register or of a memory location. The operands of some instructions are predefined; for example, FSQRT always takes the square root of the number in the top NPX stack element. Others allow, or require, the programmer to explicitly code the operand(s) along with the instruction mnemonic.
Still others accept one explicit operand and one implicit operand, which is usually the top NPX stack element. All 80387 instructions that have a data operand use ST as one operand or as the only operand.

Whether supplied by the programmer or utilized automatically, the two basic types of operands are sources and destinations. A source operand simply supplies one of the inputs to an instruction; it is not altered by the instruction. Even when an instruction converts the source operand from one format to another (e.g., real to integer), the conversion is actually performed in an internal work area to avoid altering the source operand. A destination operand may also provide an input to an instruction. It is distinguished from a source operand, however, because its content may be altered when it receives the result produced by the operation; that is, the destination is replaced by the result.

Many instructions allow their operands to be coded in more than one way. For example, FADD (add real) may be written without operands, with only a source, or with a destination and a source. The instruction descriptions in this section employ the simple convention of separating alternative operand forms with slashes; the slashes, however, are not coded. Consecutive slashes indicate an option of no explicit operands. The operands for FADD are thus described as

  //source/destination, source

This means that FADD may be written in any of three ways:

  Written Form               Action
  FADD                       Add ST to ST(1), put result in ST(1), then pop ST
  FADD source                Add source to ST(0)
  FADD destination, source   Add source to destination

The assembler can allow the same instruction to be specified in different ways; for example:

  FADD        =  FADDP ST(1), ST
  FADD ST(1)  =  FADD ST, ST(1)

When reading this section, it is important to bear in mind that memory operands may be coded with any of the CPU's memory addressing methods provided by the ModR/M byte.
To review these methods (BASE + (INDEX × SCALE) + DISPLACEMENT), refer to the 80386 Programmer's Reference Manual. Chapter 5 also provides several addressing mode examples.

4.3 DATA TRANSFER INSTRUCTIONS

These instructions (summarized in Table 4-1) move operands among elements of the register stack, and between the stack top and memory. Any of the seven data types can be converted to extended real and loaded (pushed) onto the stack in a single operation; they can be stored to memory in the same manner. The data transfer instructions automatically update the 80387 tag word to reflect whether the register is empty or full following the instruction.

Table 4-1. Data Transfer Instructions

  Real Transfers
    FLD      Load real
    FST      Store real
    FSTP     Store real and pop
    FXCH     Exchange registers
  Integer Transfers
    FILD     Integer load
    FIST     Integer store
    FISTP    Integer store and pop
  Packed Decimal Transfers
    FBLD     Packed decimal (BCD) load
    FBSTP    Packed decimal (BCD) store and pop

4.3.1 FLD source

FLD (load real) loads (pushes) the source operand onto the top of the register stack. This is done by decrementing the stack pointer by one and then copying the content of the source to the new stack top. ST(7) must be empty to avoid causing an invalid-operation exception. The new stack top is tagged nonempty. The source may be a register on the stack (ST(i)) or any of the real data types in memory. If the source is a register, the register number used is that before TOP is decremented by the instruction. Coding FLD ST(0) duplicates the stack top. Single and double real source operands are converted to extended real automatically. Loading an extended real operand does not require conversion; therefore, the I and D exceptions do not occur in this case.

4.3.2 FST destination

FST (store real) copies the NPX stack top to the destination, which may be another register on the stack or a single or double (but not extended-precision) memory operand.
If the destination is single or double real, the copy of the significand is rounded to the width of the destination according to the RC field of the control word, and the copy of the exponent is converted to the width and bias of the destination format. The over/underflow condition is checked for as well.

If, however, the stack top contains zero, ±∞, or a NaN, then the stack top's significand is not rounded but is chopped (on the right) to fit the destination. Neither is the exponent converted; rather, it also is chopped on the right and transferred "as is". This preserves the value's identification as ∞ or a NaN (exponent all ones) so that it can be properly loaded and used later in the program if desired.

Note that the 80387 does not signal the invalid-operation exception when the destination is a nonempty stack element.

4.3.3 FSTP destination

FSTP (store real and pop) operates identically to FST except that the NPX stack is popped following the transfer. This is done by tagging the top stack element empty and then incrementing TOP. FSTP also permits storing to an extended-precision real memory variable, whereas FST does not. If the source operand is a register, the register number used is that before TOP is incremented by the instruction. Coding FSTP ST(0) is equivalent to popping the stack with no data transfer.

4.3.4 FXCH //destination

FXCH (exchange registers) swaps the contents of the destination and the stack top registers. If the destination is not coded explicitly, ST(1) is used. Many 80387 instructions operate only on the stack top; FXCH provides a simple means of effectively using these instructions on lower stack elements.
For example, the following sequence takes the square root of the third register from the top (assuming that ST is nonempty):

    FXCH ST(3)
    FSQRT
    FXCH ST(3)

4.3.5 FILD source

FILD (integer load) converts the source memory operand from its binary integer format (word, short, or long) to extended real and pushes the result onto the NPX stack. ST(7) must be empty to avoid causing an exception. The (new) stack top is tagged nonempty. FILD is an exact operation; the source is loaded with no rounding error.

4.3.6 FIST destination

FIST (integer store) stores the content of the stack top to an integer according to the RC field (rounding control) of the control word and transfers the result to the destination, leaving the stack top unchanged. The destination may define a word or short integer variable. Negative zero is stored in the same encoding as positive zero: 0000...00.

4.3.7 FISTP destination

FISTP (integer store and pop) operates like FIST except that it also pops the NPX stack following the transfer. The destination may be any of the binary integer data types.

4.3.8 FBLD source

FBLD (packed decimal (BCD) load) converts the content of the source operand from packed decimal to extended real and pushes the result onto the NPX stack. ST(7) must be empty to avoid causing an exception. The sign of the source is preserved, including the case where the value is negative zero. FBLD is an exact operation; the source is loaded with no rounding error.

The packed decimal digits of the source are assumed to be in the range 0-9. The instruction does not check for invalid digits (A-FH), and the result of attempting to load an invalid encoding is undefined.

4.3.9 FBSTP destination

FBSTP (packed decimal (BCD) store and pop) converts the content of the stack top to a packed decimal integer, stores the result at the destination in memory, and pops the stack. FBSTP rounds a nonintegral value according to the RC (rounding control) field of the control word.
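The effect of the RC field on the integer stores described above can be illustrated with a small model. The following Python sketch is not 80387 code: the function name `store_integer` and the mode labels are hypothetical, Python floats stand in for the extended-real stack top, and range checking of the destination is omitted. It assumes that "nearest" resolves ties to the even integer, as IEEE Std 754 round-to-nearest does.

```python
import math

def store_integer(value, rc="nearest"):
    """Model the integer produced from a real value under one RC setting.

    rc is a hypothetical label for the rounding-control field:
    "nearest", "down" (toward -infinity), "up" (toward +infinity),
    or "chop" (toward zero).
    """
    if rc == "nearest":
        return round(value)        # Python's round() resolves ties to even
    if rc == "down":
        return math.floor(value)
    if rc == "up":
        return math.ceil(value)
    if rc == "chop":
        return math.trunc(value)
    raise ValueError("unknown rounding mode: " + rc)

# 155.625 stored under each of the four modes
results = {rc: store_integer(155.625, rc)
           for rc in ("nearest", "down", "up", "chop")}
```

For a negative value the "down" and "chop" modes part ways: `store_integer(-155.625, "down")` gives -156, while `store_integer(-155.625, "chop")` gives -155.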
4.4 NONTRANSCENDENTAL INSTRUCTIONS

The 80387's nontranscendental instruction set (Table 4-2) provides a wealth of variations on the basic add, subtract, multiply, and divide operations, and a number of other useful functions. These range from a simple absolute value to a square root instruction that executes faster than ordinary division; 80387 programmers no longer need to spend valuable time eliminating square roots from algorithms because they run too slowly. Other nontranscendental instructions perform exact modulo division, round real numbers to integers, and scale values by powers of two.

The 80387's basic nontranscendental instructions (addition, subtraction, multiplication, and division) are designed to encourage the development of very efficient algorithms. In particular, they allow the programmer to reference memory as easily as the NPX register stack.

Table 4-3 summarizes the available operation/operand forms that are provided for basic arithmetic. In addition to the four normal operations, two "reversed" instructions make subtraction and division "symmetrical" like addition and multiplication. The variety of instruction and operand forms gives the programmer unusual flexibility:
• Operands may be located in registers or memory.
• Results may be deposited in a choice of registers.
• Operands may be a variety of NPX data types: extended real, double real, single real, short integer or word integer, with automatic conversion to extended real performed by the 80387.

Five basic instruction forms may be used across all six operations, as shown in Table 4-3. The classical stack form may be used to make the 80387 operate like a classical stack machine. No operands are coded in this form, only the instruction mnemonic. The NPX picks the source operand from the stack top and the destination from the next stack element. It then pops the stack, performs the operation, and returns the result to the new stack top, effectively replacing the operands by the result.
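The classical stack form just described can be pictured with a short model. This is a sketch in Python rather than ASM386, under stated assumptions: the list `stack` is a hypothetical stand-in for the register stack with `stack[0]` playing the role of ST and `stack[1]` the role of ST(1), and the names `OPS` and `classical_stack` are invented for illustration. The six operations follow the destination/source definitions given in the notes to Table 4-3.

```python
# destination <- f(destination, source) for each of the six basic operations
OPS = {
    "ADD":  lambda d, s: d + s,
    "SUB":  lambda d, s: d - s,
    "SUBR": lambda d, s: s - d,   # reversed subtraction
    "MUL":  lambda d, s: d * s,
    "DIV":  lambda d, s: d / s,
    "DIVR": lambda d, s: s / d,   # reversed division
}

def classical_stack(stack, op):
    """Apply one classical stack form operation and return the new stack.

    The source is the stack top, the destination is the next element;
    both operands are replaced by the single result at the new top.
    """
    source, destination = stack[0], stack[1]
    result = OPS[op](destination, source)
    return [result] + stack[2:]

before = [2.0, 10.0, 7.0]                    # ST = 2, ST(1) = 10, ST(2) = 7
after_sub = classical_stack(before, "SUB")   # 10 - 2 at the new top
after_subr = classical_stack(before, "SUBR") # 2 - 10 at the new top
```

The reversed forms matter precisely because the stack fixes which operand is the destination; without FSUBR and FDIVR an extra FXCH would be needed to swap the operands first.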
Table 4-2. Nontranscendental Instructions

  Addition
    FADD      Add real
    FADDP     Add real and pop
    FIADD     Integer add
  Subtraction
    FSUB      Subtract real
    FSUBP     Subtract real and pop
    FISUB     Integer subtract
    FSUBR     Subtract real reversed
    FSUBRP    Subtract real reversed and pop
    FISUBR    Integer subtract reversed
  Multiplication
    FMUL      Multiply real
    FMULP     Multiply real and pop
    FIMUL     Integer multiply
  Division
    FDIV      Divide real
    FDIVP     Divide real and pop
    FIDIV     Integer divide
    FDIVR     Divide real reversed
    FDIVRP    Divide real reversed and pop
    FIDIVR    Integer divide reversed
  Other Operations
    FSQRT     Square root
    FSCALE    Scale
    FPREM     Partial remainder
    FPREM1    IEEE standard partial remainder
    FRNDINT   Round to integer
    FXTRACT   Extract exponent and significand
    FABS      Absolute value
    FCHS      Change sign

The register form is a generalization of the classical stack form; the programmer specifies the stack top as one operand and any register on the stack as the other operand. Coding the stack top as the destination provides a convenient way to access a constant, held elsewhere in the stack, from the stack top. The destination need not always be ST, however. All two-operand instructions allow use of another register as the destination. This coding (ST is the source operand) allows, for example, adding the stack top into a register used as an accumulator.

Often the operand in the stack top is needed for one operation but then is of no further use in the computation. The register pop form can be used to pick up the stack top as the source

Table 4-3.
Basic Nontranscendental Instructions and Operands

  Instruction Form             Mnemonic Form   Operand Forms (destination, source)   ASM386 Example
  Classical stack              Fop             {ST(1), ST}                           FADD
  Classical stack, extra pop   FopP            {ST(1), ST}                           FADDP
  Register                     Fop             ST(i), ST or ST, ST(i)                FSUB ST, ST(3)
  Register pop                 FopP            ST(i), ST                             FMULP ST(2), ST
  Real memory                  Fop             {ST,} single/double                   FDIV AZIMUTH
  Integer memory               FIop            {ST,} word-integer/short-integer      FIDIV PULSES

  NOTES:
  Braces ({ }) surround implicit operands; these are not coded, and are shown here for information only.
  op = ADD    destination ← destination + source
       SUB    destination ← destination - source
       SUBR   destination ← source - destination
       MUL    destination ← destination · source
       DIV    destination ← destination ÷ source
       DIVR   destination ← source ÷ destination

operand, and then discard it by popping the stack. Coding operands of ST(1), ST with a register pop mnemonic is equivalent to a classical stack operation: the top is popped and the result is left at the new top.

The two memory forms increase the flexibility of the 80387's nontranscendental instructions. They permit a real number or a binary integer in memory to be used directly as a source operand. This is useful in situations where operands are not used frequently enough to justify holding them in registers. Note that any memory addressing method may be used to define these operands, so they may be elements in arrays, structures, or other data organizations, as well as simple scalars.

The six basic operations are discussed further in the next paragraphs, and descriptions of the remaining seven operations follow.

4.4.1 Addition

  FADD    //source/destination, source
  FADDP   //destination, source
  FIADD   source

The addition instructions (add real, add real and pop, integer add) add the source and destination operands and return the sum to the destination. The operand at the stack top may be doubled by coding:

    FADD ST, ST

FPREM can be employed to reduce ST.
With π/4 as a modulus, FPREM can reduce an argument so that it is within range of FPTAN and so that no further reduction is required by FPTAN.

Because FPREM produces an exact result, the argument reduction does not introduce roundoff error into the calculation, even if several iterations are required to bring the argument into range. However, π is never accurate. The rounding of π, when it is used by FPREM to reduce an argument for a periodic trigonometric function, does not create the effect of a rounded argument, but of a rounded period.

When reduction is complete, FPREM provides the least-significant three bits of the quotient generated by FPREM (in C3, C1, C0). This is also important for transcendental argument reduction, because it locates the original angle in the correct one of eight π/4 segments of the unit circle (see Table 4-4).

Table 4-4. Condition Code Interpretation after FPREM and FPREM1 Instructions

  C2(PF)   C3   C1   C0    Interpretation after FPREM and FPREM1
  1        X    X    X     Incomplete Reduction: further iteration required for complete reduction
           Q1   Q0   Q2    Q MOD 8
  0        0    0    0     0
  0        0    1    0     1
  0        1    0    0     2
  0        1    1    0     3
  0        0    0    1     4
  0        0    1    1     5
  0        1    0    1     6
  0        1    1    1     7
  Rows with C2 = 0 indicate Complete Reduction: C0, C3, C1 contain three least significant bits of quotient.

4.4.10 FPREM1-Partial Remainder (IEEE Std. 754-Compatible)

FPREM1 computes the remainder of division of ST by ST(1) and leaves the result in ST. FPREM1 finds a remainder REM1 and a quotient Q1 such that

  REM1 = ST - ST(1)*Q1

The quotient Q1 is chosen to be the integer nearest to the exact value of ST/ST(1). When ST/ST(1) is exactly N + 1/2 (for some integer N), there are two integers equally close to ST/ST(1). In this case the value chosen for Q1 is the even integer.

The result produced by FPREM1 is always exact; no rounding is necessary, and therefore the precision exception does not occur and the rounding control has no effect.
The FPREM1 instruction is designed to be executed iteratively in a software-controlled loop. FPREM1 operates by performing successive scaled subtractions; therefore, obtaining the exact remainder when the operands differ greatly in magnitude can consume large amounts of execution time. Because the 80387 can only be preempted between instructions, the remainder function could seriously increase interrupt latency in these cases. For this reason, the maximum number of iterations is limited. The instruction may terminate before it has completed the calculation. The C2 bit of the status word indicates whether the calculation is complete or whether the instruction must be executed again.

FPREM1 can reduce the exponent of ST by up to (but not including) 64 in one execution. If FPREM1 produces a remainder that is less than the modulus (i.e., the divisor), the function is complete and bit C2 of the status word condition code is cleared. If the function is incomplete, C2 is set to 1; the result in ST is then called the partial remainder. Software can inspect C2 by storing the status word following execution of FPREM1, reexecuting the instruction (using the partial remainder in ST as the dividend) until C2 is cleared. When C2 is cleared, FPREM1 also provides the least-significant three bits of the quotient generated by FPREM1 (in C3, C1, C0).

The uses for FPREM1 are the same as those for FPREM. FPREM1 differs from FPREM in these respects:
• FPREM and FPREM1 choose the value of the quotient differently; the low-order three bits of the quotient as reported in bits C3, C1, C0 of the status word may differ by one in some cases.
• FPREM and FPREM1 may produce different remainders. FPREM produces a remainder R such that 0 ≤ R < |ST(1)| or -|ST(1)| < R ≤ 0, depending on the sign of the dividend. FPREM1 produces a remainder R1 such that -|ST(1)|/2 ≤ R1 ≤ +|ST(1)|/2; the bounds are reached only when ST/ST(1) lies exactly halfway between two integers.
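The two remainder conventions can be compared numerically. The sketch below is a Python model, not 80387 code, and rests on two assumptions: `math.fmod` is used as a stand-in for the completed FPREM result (quotient chopped toward zero) and `math.remainder` as a stand-in for the completed FPREM1 result (quotient rounded to nearest, ties to even). The function names are hypothetical, the iterative partial-remainder behavior and C2 are not modeled, and the quotient is recovered by plain floating-point division, which is adequate only for small operands such as these.

```python
import math

def fprem_model(st, st1):
    """Remainder with the quotient chopped toward zero (FPREM-style)."""
    rem = math.fmod(st, st1)
    quotient = round((st - rem) / st1)
    return rem, quotient

def fprem1_model(st, st1):
    """Remainder with the quotient rounded to nearest-even (FPREM1-style)."""
    rem = math.remainder(st, st1)
    quotient = round((st - rem) / st1)
    return rem, quotient

# 18/5 = 3.6: chopping gives quotient 3, rounding to nearest gives 4
r, q = fprem_model(18.0, 5.0)
r1, q1 = fprem1_model(18.0, 5.0)

# 5/2 = 2.5 and 7/2 = 3.5 are exact ties; the even quotient is chosen
tie_low = fprem1_model(5.0, 2.0)
tie_high = fprem1_model(7.0, 2.0)
```

In the first case the low-order quotient bits differ by one (3 versus 4) and the remainders differ in sign (3 versus -2). In the tie cases the FPREM1-style remainder lands exactly on plus or minus half the modulus.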
4.4.11 FRNDINT

FRNDINT (round to integer) rounds the top stack element to an integer according to the RC bits of the control word. For example, assume that ST contains the 80387 real number encoding of the decimal value 155.625. FRNDINT will change the value to 155 if the RC field of the control word is set to down or chop, or to 156 if it is set to up or nearest.

4.4.12 FXTRACT

FXTRACT (extract exponent and significand) performs a superset of the IEEE-recommended logb(x) function by "decomposing" the number in the stack top into two numbers that represent the actual value of the operand's exponent and significand fields. The "exponent" replaces the original operand on the stack and the "significand" is pushed onto the stack. (ST(7) must be empty to avoid causing the invalid-operation exception.) Following execution of FXTRACT, ST (the new stack top) contains the value of the original significand expressed as a real number: its sign is the same as the operand's, its exponent is 0 true (16,383 or 3FFFH biased), and its significand is identical to the original operand's. ST(1) contains the value of the original operand's true (unbiased) exponent expressed as a real number.

If the original operand is zero, FXTRACT leaves -∞ in ST(1) (the exponent) while ST is assigned the value zero with a sign equal to that of the original operand. The zero-divide exception is raised in this case, as well.

To illustrate the operation of FXTRACT, assume that ST contains a number whose true exponent is +4 (i.e., its exponent field contains 4003H). After executing FXTRACT, ST(1) will contain the real number +4.0; its sign will be positive, its exponent field will contain 4001H (+2 true) and its significand field will contain 1Δ00...00B. In other words, the value in ST(1) will be 1.0 × 2^2 = 4.
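The decomposition described above can be mimicked in a few lines. This is a Python sketch under stated assumptions, not a model of the instruction's exception behavior: the function name `fxtract_model` is hypothetical, Python doubles replace extended reals, only finite inputs are handled, and the library call `math.frexp` (which returns a fraction in the range 0.5 to 1) is rescaled so that the significand lies between 1 and 2 as FXTRACT's does.

```python
import math

def fxtract_model(x):
    """Return (significand, exponent) for a finite x.

    The significand keeps the sign of x and has magnitude in [1, 2);
    the exponent is the true (unbiased) exponent as a real number.
    A zero operand yields a zero of the same sign and -infinity.
    """
    if x == 0.0:
        return x, -math.inf
    fraction, exponent = math.frexp(x)   # x == fraction * 2**exponent
    return fraction * 2.0, float(exponent - 1)

# 24.0 = 1.5 * 2**4, so its true exponent is +4
sig, exp = fxtract_model(24.0)

# Applying the same decomposition to the exponent value 4.0 itself
# gives 1.0 * 2**2, matching the encoding described for ST(1).
exp_sig, exp_exp = fxtract_model(exp)
```

Here `sig` is 1.5 and `exp` is 4.0; decomposing 4.0 in turn gives a significand of 1.0 and an exponent of 2.0.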
If ST contains an operand whose true exponent is -7 (i.e., its exponent field contains 3FF8H), then FXTRACT will return an "exponent" of -7.0; after the instruction executes, ST(1)'s sign and exponent fields will contain C001H (negative sign, true exponent of 2), and its significand will be 1Δ1100...00B. In other words, the value in ST(1) will be -1.75 × 2^2 = -7.0. In both cases, following FXTRACT, ST's sign and significand fields will be the same as the original operand's, and its exponent field will contain 3FFFH (0 true).

FXTRACT is useful for power and range scaling operations. Both FXTRACT and the base 2 exponential instruction F2XM1 are needed to perform a general power operation. Converting numbers in 80387 extended real format to decimal representations (e.g., for printing or displaying) requires not only FBSTP but also FXTRACT to allow scaling that does not overflow the range of the extended format. FXTRACT can also be useful for debugging, because it allows the exponent and significand parts of a real number to be examined separately.

4.4.13 FABS

FABS (absolute value) changes the top stack element to its absolute value by making its sign positive. Note that the invalid-operation exception is not signaled even if the operand is a signaling NaN or has a format that is not supported.

4.4.14 FCHS

FCHS (change sign) complements (reverses) the sign of the top stack element. Note that the invalid-operation exception is not signaled even if the operand is a signaling NaN or has a format that is not supported.

4.5 COMPARISON INSTRUCTIONS

The instructions of this class allow comparison of numbers of all supported real and integer data types. Each of these instructions (Table 4-5) analyzes the top stack element, often in relationship to another operand, and reports the result as a condition code in the status word. The basic operations are compare, test (compare with zero), and examine (report type, sign, and normalization).
Special forms of the compare operation are provided to optimize algorithms by allowing direct comparisons with binary integers and real numbers in memory, as well as popping the stack after a comparison.

The FSTSW (store status word) instruction may be used following a comparison to transfer the condition code to memory or to the 80386 AX register for inspection. The 80386 SAHF instruction is recommended for copying the 80387 flags from AX to the 80386 flags for easy conditional branching.

Table 4-5. Comparison Instructions

FCOM       Compare real
FCOMP      Compare real and pop
FCOMPP     Compare real and pop twice
FICOM      Integer compare
FICOMP     Integer compare and pop
FTST       Test
FUCOM      Unordered compare real
FUCOMP     Unordered compare real and pop
FUCOMPP    Unordered compare real and pop twice
FXAM       Examine

Note that instructions other than those in the comparison group may update the condition code. To ensure that the status word is not altered inadvertently, store it immediately following a comparison operation.

4.5.1 FCOM //source
FCOM (compare real) compares the stack top to the source operand. The source operand may be a register on the stack, or a single or double real memory operand. If an operand is not coded, ST is compared to ST(1). The sign of zero is ignored, so that +0 = -0. Following the instruction, the condition codes reflect the order of the operands as shown in Table 4-6. If either operand is a NaN (either quiet or signaling) or an undefined format, or if a stack fault occurs, the invalid-operation exception is raised and the condition bits are set to "unordered."

4.5.2 FCOMP //source
FCOMP (compare real and pop) operates like FCOM, and in addition pops the stack.

4.5.3 FCOMPP
FCOMPP (compare real and pop twice) operates like FCOM and additionally pops the stack twice, discarding both operands. FCOMPP always compares ST to ST(1); no operands may be explicitly specified.
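As a rough illustration of inspecting the condition code after FSTSW, the following C sketch decodes a stored status word image according to the encodings of Table 4-6. It assumes the standard status word layout (C0 in bit 8, C2 in bit 10, C3 in bit 14); the type and function names are hypothetical and chosen only for this example.

```c
#include <stdint.h>

/* Condition-code bit positions in the 80387 status word. */
#define SW_C0 0x0100u  /* bit 8  */
#define SW_C2 0x0400u  /* bit 10 */
#define SW_C3 0x4000u  /* bit 14 */

/* Hypothetical result type for this sketch. */
typedef enum {
    CMP_GREATER,    /* ST > operand             */
    CMP_LESS,       /* ST < operand             */
    CMP_EQUAL,      /* ST = operand             */
    CMP_UNORDERED,  /* NaN or undefined format  */
    CMP_OTHER       /* pattern not in Table 4-6 */
} cmp_result;

/* Decode the C3/C2/C0 bits of a status word stored after a compare.
   All other status word bits (TOP, exception flags, C1) are ignored. */
cmp_result decode_compare(uint16_t status_word)
{
    switch (status_word & (SW_C3 | SW_C2 | SW_C0)) {
    case 0:                     return CMP_GREATER;
    case SW_C0:                 return CMP_LESS;
    case SW_C3:                 return CMP_EQUAL;
    case SW_C3 | SW_C2 | SW_C0: return CMP_UNORDERED;
    default:                    return CMP_OTHER;
    }
}
```

Masking before the switch matters because the stored word also carries the stack-top pointer and exception flags; in assembly language, SAHF followed by JA, JB, JE, or JP achieves the same decision without a lookup.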
4.5.4 FICOM source
FICOM (integer compare) converts the source operand, which may reference a word or short binary integer variable, to extended real and compares the stack top to it. The condition code bits in the status word are set as for FCOM.

Table 4-6. Condition Code Resulting from Comparisons

Order           C3 (ZF)   C2 (PF)   C0 (CF)   80386 Conditional Branch
ST > Operand       0         0         0      JA
ST < Operand       0         0         1      JB
ST = Operand       1         0         0      JE
Unordered          1         1         1      JP

4.5.5 FICOMP source
FICOMP (integer compare and pop) operates identically to FICOM and additionally discards the value in ST by popping the NPX stack.

4.5.6 FTST
FTST (test) tests the top stack element by comparing it to zero. The result is posted to the condition codes as shown in Table 4-7.

4.5.7 FUCOM //source
FUCOM (unordered compare real) operates like FCOM, with two differences:
1. It does not cause an invalid-operation exception when one of the operands is a NaN. If either operand is a NaN, the condition bits of the status word are set to unordered as shown in Table 4-6.
2. Only operands on the NPX stack can be compared.

4.5.8 FUCOMP //source
FUCOMP (unordered compare real and pop) operates like FUCOM and in addition pops the NPX stack.

4.5.9 FUCOMPP
FUCOMPP (unordered compare real and pop twice) operates like FUCOM and in addition pops the NPX stack twice, discarding both operands. FUCOMPP always compares ST to ST(1); no operands can be explicitly specified.

Table 4-7. Condition Code Resulting from FTST

Order           C3 (ZF)   C2 (PF)   C0 (CF)   80386 Conditional Branch
ST > 0.0           0         0         0      JA
ST < 0.0           0         0         1      JB
ST = 0.0           1         0         0      JE
Unordered          1         1         1      JP

4.5.10 FXAM
FXAM (examine) reports the content of the top stack element as positive/negative and NaN, denormal, normal, zero, infinity, unsupported, or empty. Table 4-8 lists and interprets all the condition code values that FXAM generates.
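The FXAM encodings of Table 4-8 follow a regular pattern: C1 carries the sign, and C3, C2, and C0 together select the operand class. A minimal C sketch of that decoding is shown below, assuming the status word has already been stored (for example with FSTSW) and that the condition bits sit in their usual positions (C0 bit 8, C1 bit 9, C2 bit 10, C3 bit 14). The enum and function names are hypothetical.

```c
#include <stdint.h>

/* Operand classes reported by FXAM; names are illustrative only. */
typedef enum {
    FX_UNSUPPORTED, FX_NAN, FX_NORMAL, FX_INFINITY,
    FX_ZERO, FX_EMPTY, FX_DENORMAL, FX_RESERVED
} fxam_class;

/* Decode a status word image stored after FXAM.
   *negative (if non-NULL) receives the C1 bit, which holds the sign. */
fxam_class fxam_decode(uint16_t status_word, int *negative)
{
    /* Indexed by the 3-bit value C3:C2:C0. */
    static const fxam_class by_code[8] = {
        FX_UNSUPPORTED,  /* 0 0 0 */
        FX_NAN,          /* 0 0 1 */
        FX_NORMAL,       /* 0 1 0 */
        FX_INFINITY,     /* 0 1 1 */
        FX_ZERO,         /* 1 0 0 */
        FX_EMPTY,        /* 1 0 1 */
        FX_DENORMAL,     /* 1 1 0 */
        FX_RESERVED      /* 1 1 1: not listed in Table 4-8 */
    };
    unsigned c0 = (status_word >> 8) & 1u;
    unsigned c1 = (status_word >> 9) & 1u;
    unsigned c2 = (status_word >> 10) & 1u;
    unsigned c3 = (status_word >> 14) & 1u;

    if (negative) {
        *negative = (int)c1;
    }
    return by_code[(c3 << 2) | (c2 << 1) | c0];
}
```

For example, a status word with only C2 and C1 set decodes as a negative normal number, and one with C3 and C0 set decodes as an empty register.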
Table 4-8. Condition Code Defining Operand Class

C3   C2   C1   C0   Value at TOP
 0    0    0    0   + Unsupported
 0    0    0    1   + NaN
 0    0    1    0   - Unsupported
 0    0    1    1   - NaN
 0    1    0    0   + Normal
 0    1    0    1   + Infinity
 0    1    1    0   - Normal
 0    1    1    1   - Infinity
 1    0    0    0   + 0
 1    0    0    1   + Empty
 1    0    1    0   - 0
 1    0    1    1   - Empty
 1    1    0    0   + Denormal
 1    1    1    0   - Denormal

4.6 TRANSCENDENTAL INSTRUCTIONS
The instructions in this group (Table 4-9) perform the time-consuming core calculations for all common trigonometric, inverse trigonometric, hyperbolic, inverse hyperbolic, logarithmic, and exponential functions. The transcendentals operate on the top one or two stack elements, and they return their results to the stack. The trigonometric operations assume their arguments are expressed in radians. The logarithmic and exponential operations work in base 2.

The results of transcendental instructions are highly accurate. The absolute value of the relative error of the transcendental instructions is guaranteed to be less than 2^-62. (Relative error is the ratio between the absolute error and the exact value.)

Table 4-9. Transcendental Instructions

FSIN       Sine
FCOS       Cosine
FSINCOS    Sine and cosine
FPTAN      Tangent of ST
FPATAN     Arctangent of ST(1)/ST
F2XM1      2^X - 1
FYL2X      Y · log2(X); Y is ST(1), X is ST
FYL2XP1    Y · log2(X + 1); Y is ST(1), X is ST

The trigonometric functions accept a practically unrestricted range of operands, whereas the other transcendental instructions require that arguments be more restricted in range. FPREM or FPREM1 may be used to bring the otherwise valid operand of a periodic function into range. Prologue and epilogue software may be used to reduce arguments for other instructions to the expected range and to adjust the result to correspond to the original arguments if necessary. The instruction descriptions in this section document the allowed operand range for each instruction.

4.6.1 FCOS
When complete, this function replaces the contents of ST with COS(ST).
ST, expressed in radians, must lie in the range |θ| < 2^63 (for most practical purposes unrestricted). If ST is in range, C2 of the status word is cleared and the result of the operation is produced. If the operand is outside of the range, C2 is set to one (function incomplete) and ST remains intact (i.e., no reduction of the operand is performed). It is the programmer's responsibility to reduce the operand to an absolute value smaller than 2^63. The instructions FPREM1 and FPREM are available for this purpose.

4.6.2 FSIN
When complete, this function replaces the contents of ST with SIN(ST). FSIN is equivalent to FCOS in the way it reduces the operand. ST is expressed in radians.

4.6.3 FSINCOS
When complete, this instruction replaces the contents of ST with SIN(ST), then pushes COS(ST) onto the stack. (ST(7) must be empty to avoid an invalid exception.) FSINCOS is equivalent to FCOS in the way it reduces the operand. ST is expressed in radians.

4.6.4 FPTAN
When complete, FPTAN (partial tangent) computes the function Y = TAN(ST). ST is expressed in radians. Y replaces ST, then the value 1 is pushed, becoming the new stack top. (ST(7) must be empty to avoid an invalid exception.) When the function is complete, ST(1) = TAN(arg) and ST = 1. FPTAN is equivalent to FCOS in the way it reduces the operand.

The fact that FPTAN places two results on the stack maintains compatibility with the 8087/80287 and aids the calculation of other trigonometric functions that can be derived from tan via standard trigonometric identities. For example, the cot function is given by this identity:

    cot x = 1 / tan x.

Therefore, simply executing the reverse divide instruction FDIVR after FPTAN yields the cot function.

4.6.5 FPATAN
FPATAN (arctangent) computes the function θ = ARCTAN(Y/X). X is taken from ST(0) and Y from ST(1). The instruction pops the NPX stack and returns θ to the (new) stack top, overwriting the Y operand. The result is expressed in radians.
The range of operands is not restricted; however, the range of the result depends on the relationship between the operands according to Table 4-10.

The fact that the argument of FPATAN is a ratio aids calculation of other trigonometric functions, including Arcsin and Arccos. These can be derived from Arctan via standard trigonometric identities. For example, the Arcsin function can be easily calculated using this identity:

    Arcsin x = Arctan (x / √(1 - x^2)).

Thus, to find Arcsin (Y), push Y onto the NPX stack, then calculate X = √(1 - Y^2), pushing the result X onto the stack. Executing FPATAN then leaves Arcsin (Y) at the top of the stack.

4.6.6 F2XM1
F2XM1 (2 to the X minus 1) calculates the function Y = 2^X - 1. X is taken from the stack top and must be in the range -1 ≤ X ≤ 1. The result Y replaces the argument X at the stack top. If the argument is out of range, the results are undefined.

This instruction is designed to produce a very accurate result even when X is close to 0. For values of the argument very close in magnitude to 1, a larger error will be incurred. To obtain Y = 2^X, add 1 to the result delivered by F2XM1.

Table 4-10. Results of FPATAN
[Table body not recoverable in this copy. Surviving column headings: Sign(Y), Sign(X), and a yes/no column comparing |Y| with |X|.]

PROGRAMMING NUMERIC APPLICATIONS

[...] causes mathematical functions such as sin and sqrt to return values of type double. Figure 5-1 illustrates the ease with which C programs interface with the 80387.

XENIX286 C386 COMPILER, V0.2
COMPILATION OF MODULE SAMPLE
OBJECT MODULE PLACED IN sample.obj
COMPILER INVOKED BY: c386 sample.c

    /******************************************************
                       SAMPLE C PROGRAM
    ******************************************************/

    /** Include stdio.h if necessary **/
    #include <stdio.h>
    /** Include math declarations for transcendentals and others **/
    #include <math.h>

    #define PI 3.141592654

    main()
    {
        double sin_result, cos_result;
        double angle_deg = 0.0, angle_rad;
        int i, no_of_trial = 4;

        for (i = 1; i <= no_of_trial; i++) {
            angle_rad = angle_deg * PI / 180.0;
            sin_result = sin(angle_rad);
            cos_result = cos(angle_rad);
            printf("sine of %f degrees equals %f\n", angle_deg, sin_result);
            printf("cosine of %f degrees equals %f\n\n", angle_deg, cos_result);
            angle_deg = angle_deg + 30.0;
        }
        /** etc. **/
    }

c386 COMPILATION COMPLETE. 0 WARNINGS, 0 ERRORS

Figure 5-1. Sample C-386 Program

5.1.3 PL/M-386
Programmers in PL/M-386 can access a very useful subset of the 80387's numeric capabilities. The PL/M-386 REAL data type corresponds to the NPX's single real (32-bit) format. This data type provides a range of about 8.43 × 10^-37 ≤ |X| ≤ 3.38 × 10^38, with about seven significant decimal digits. This representation is adequate for the data manipulated by many microcomputer applications.

The utility of the REAL data type is extended by the PL/M-386 compiler's practice of holding intermediate results in the 80387's extended real format. This means that the full range and precision of the processor are utilized for intermediate results. Underflow, overflow, and rounding exceptions are most likely to occur during intermediate computations rather than during calculation of an expression's final result. Holding intermediate results in extended-precision real format greatly reduces the likelihood of overflow and underflow and eliminates roundoff as a serious source of error until the final assignment of the result is performed.

The compiler generates 80387 code to evaluate expressions that contain REAL data types, whether variables or constants or both. This means that addition, subtraction, multiplication, division, comparison, and assignment of REALs will be performed by the NPX. INTEGER expressions, on the other hand, are evaluated on the CPU.
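The benefit of holding intermediate results in a wider format can be illustrated with a small C sketch. It is not PL/M-386 code and does not depend on any particular compiler's code generation; it simply evaluates (a × b) / b twice, once with the product forced into single precision and once with the product kept in long double. The function names are hypothetical, and the sketch assumes only that long double is at least as wide as double (on 80387-based systems it is commonly the 80-bit extended format).

```c
#include <float.h>

/* Evaluate (a * b) / b with the intermediate product rounded to
   single precision. The volatile store forces the narrowing even on
   implementations that would otherwise keep the product wider. */
float scaled_narrow(float a, float b)
{
    volatile float product = a * b;
    return product / b;
}

/* Evaluate the same expression with the intermediate product held in
   long double, narrowing only at the final assignment. */
float scaled_wide(float a, float b)
{
    long double product = (long double)a * (long double)b;
    return (float)(product / (long double)b);
}
```

With a = b = 1.0e30f, the single-precision product exceeds the single real range, so scaled_narrow returns infinity, whereas scaled_wide returns 1.0e30f because the overflow never occurs in the wider intermediate.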
Five built-in procedures (Table 5-1) give the PL/M-386 programmer access to 80387 functions manipulated by the processor control instructions. Prior to any arithmetic operations, a typical PL/M-386 program will set up the NPX using the INIT$REAL$MATH$UNIT procedure and then issue SET$REAL$MODE to configure the NPX. SET$REAL$MODE loads the 80387 control word, and its 16-bit parameter has the format shown for the control word in Chapter 1. The recommended value of this parameter is 033EH (round to nearest, 64-bit precision, all exceptions masked except invalid operation). Other settings may be used at the programmer's discretion.

If any exceptions are unmasked, an exception handler must be provided in the form of an interrupt procedure that is designated to be invoked via CPU interrupt vector number 16. The exception handler can use the GET$REAL$ERROR procedure to obtain the low-order byte of the 80387 status word and to then clear the exception flags. The byte returned by GET$REAL$ERROR contains the exception flags; these can be examined to determine the source of the exception.

Table 5-1. PL/M-386 Built-In Procedures

Procedure                  80387 Instruction   Description
INIT$REAL$MATH$UNIT (1)    FINIT               Initialize processor.
SET$REAL$MODE              FLDCW               Set exception masks, rounding precision, and infinity controls.
GET$REAL$ERROR (2)         FNSTSW & FNCLEX     Store, then clear, exception flags.
SAVE$REAL$STATUS           FNSAVE              Save processor state.
RESTORE$REAL$STATUS        FRSTOR              Restore processor state.

The SAVE$REAL$STATUS and RESTORE$REAL$STATUS procedures are provided for multitasking environments where a running task that uses the 80387 may be preempted by another task that also uses the 80387. It is the responsibility of the operating system to issue SAVE$REAL$STATUS before it executes any statements that affect the 80387; these include the INIT$REAL$MATH$UNIT and SET$REAL$MODE procedures as well as arithmetic expressions.
SAVE$REAL$STATUS saves the 80387 state (registers, status, and control words, etc.) on the CPU's stack. RESTORE$REAL$STATUS reloads the state information; the preempting task must invoke this procedure before terminating in order to restore the 80387 to its state at the time the running task was preempted. This enables the preempted task to resume execution from the point of its preemption.

5.1.4 ASM386
The ASM386 assembly language provides programmers with complete access to all of the facilities of the 80386 and 80387 processors. The programmer's view of the 80386/80387 hardware is a single machine with these resources:
• 160 instructions
• 12 data types
• 8 general registers
• 6 segment registers
• 8 floating-point registers, organized as a stack

5.1.4.1 DEFINING DATA
The ASM386 directives shown in Table 5-2 allocate storage for 80387 variables and constants. As with other storage allocation directives, the assembler associates a type with any variable defined with these directives. The type value is equal to the length of the storage unit in bytes (10 for DT, 8 for DQ, etc.). The assembler checks the type of any variable coded in an instruction to be certain that it is compatible with the instruction. For example, the coding FIADD ALPHA will be flagged as an error if ALPHA's type is not 2 or 4, because integer addition is only available for word and short integer (doubleword) data types. The operand's type also tells the assembler which machine instruction to produce; although to the programmer there is only an FIADD instruction, a different machine instruction is required for each operand type.

Table 5-2. ASM386 Storage Allocation Directives

Directive   Interpretation      Data Types
DW          Define Word         Word integer
DD          Define Doubleword   Short integer, short real
DQ          Define Quadword     Long integer, long real
DT          Define Tenbyte      Packed decimal, temporary real
On occasion it is desirable to use an instruction with an operand that has no declared type. For example, if register BX points to a short integer variable, a programmer may want to code FIADD [BX]. This can be done by informing the assembler of the operand's type in the instruction, coding FIADD DWORD PTR [BX]. The corresponding overrides for the other storage allocations are WORD PTR, QWORD PTR, and TBYTE PTR.

The assembler does not, however, check the types of operands used in processor control instructions. Coding FRSTOR [BP] implies that the programmer has set up register BP to point to the location (probably in the stack) where the processor's 94-byte state record has been previously saved.

The initial values for 80387 constants may be coded in several different ways. Binary integer constants may be specified as bit strings, decimal integers, octal integers, or hexadecimal strings. Packed decimal values are normally written as decimal integers, although the assembler will accept and convert other representations of integers. Real values may be written as ordinary decimal real numbers (decimal point required), as decimal numbers in scientific notation, or as hexadecimal strings. Using hexadecimal strings is primarily intended for defining special values such as infinities, NaNs, and denormalized numbers. Most programmers will find that ordinary decimal and scientific decimal provide the simplest way to initialize 80387 constants. Figure 5-2 compares several ways of setting the various 80387 data types to the same initial value.

    ; THE FOLLOWING ALL ALLOCATE THE CONSTANT: -126
    ; NOTE TWO'S COMPLEMENT STORAGE OF NEGATIVE BINARY INTEGERS.
    ;
                    EVEN                        ; FORCE WORD ALIGNMENT
    WORD_INTEGER    DW  1111111110000010B       ; BIT STRING
    SHORT_INTEGER   DD  0FFFFFF82H              ; HEX STRING MUST START WITH DIGIT
    LONG_INTEGER    DQ  -126                    ; ORDINARY DECIMAL
    SINGLE_REAL     DD  -126.0                  ; NOTE PRESENCE OF '.'
    DOUBLE_REAL     DQ  -1.26E2                 ; "SCIENTIFIC"
    PACKED_DECIMAL  DT  -126                    ; ORDINARY DECIMAL INTEGER
    ;
    ; IN THE FOLLOWING, SIGN AND EXPONENT IS 'C005',
    ; SIGNIFICAND IS '7E00...00'. 'R' INFORMS ASSEMBLER THAT
    ; THE STRING REPRESENTS A REAL DATA TYPE.
    ;
    EXTENDED_REAL   DT  0C0057E00000000000000R  ; HEX STRING

Figure 5-2. Sample 80387 Constants

Note that preceding 80387 variables and constants with the ASM386 EVEN directive ensures that the operands will be word-aligned in memory. The best performance is obtained when data transfers are double-word aligned. All 80387 data types occupy integral numbers of words so that no storage is "wasted" if blocks of variables are defined together and preceded by a single EVEN declarative.

5.1.4.2 RECORDS AND STRUCTURES
The ASM386 RECORD and STRUC (structure) declaratives can be very useful in NPX programming. The record facility can be used to define the bit fields of the control, status, and tag words. Figure 5-3 shows one definition of the status word and how it might be used in a routine that polls the 80387 until it has completed an instruction.

Because structures allow different but related data types to be grouped together, they often provide a natural way to represent "real world" data organizations. The fact that the structure template may be "moved" about in memory adds to its flexibility. Figure 5-4 shows a simple structure that might be used to represent data consisting of a series of test score samples. A structure could also be used to define the organization of the information stored and loaded by the FSTENV and FLDENV instructions.
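For readers more comfortable with C than with the ASM386 RECORD facility, the status word fields can also be expressed as bit masks. The sketch below assumes the usual 80387 status word layout (busy in bit 15, C3 in bit 14, stack top in bits 13-11, C2 in bit 10, C1 in bit 9, C0 in bit 8, exception flags in the low byte). The macro and function names are hypothetical, and the sketch only inspects a stored copy of the status word.

```c
#include <stdint.h>

/* Field masks for a stored 80387 status word (names are illustrative). */
#define SW_BUSY        0x8000u  /* bit 15      */
#define SW_COND_CODE3  0x4000u  /* bit 14      */
#define SW_STACK_TOP   0x3800u  /* bits 13-11  */
#define SW_COND_CODE2  0x0400u  /* bit 10      */
#define SW_COND_CODE1  0x0200u  /* bit 9       */
#define SW_COND_CODE0  0x0100u  /* bit 8       */
#define SW_EXC_FLAGS   0x003Fu  /* bits 5-0    */

/* Register number (0-7) currently at the top of the NPX stack. */
unsigned sw_stack_top(uint16_t sw)
{
    return (sw & SW_STACK_TOP) >> 11;
}

/* Nonzero when C2 is set, i.e. a partial remainder or trigonometric
   instruction reported "incomplete" and the reduction must be repeated. */
int sw_reduction_incomplete(uint16_t sw)
{
    return (sw & SW_COND_CODE2) != 0;
}
```

A reduction loop written in C around inline assembly would repeat the remainder step while sw_reduction_incomplete returns nonzero, which is the same test the TEST/JNZ pair performs in assembly language.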
    ; RESERVE SPACE FOR STATUS WORD
    STATUS_WORD   DW  ?

    ; LAY OUT STATUS WORD FIELDS
    STATUS        RECORD
                  BUSY:        1,
                  COND_CODE3:  1,
                  STACK_TOP:   3,
                  COND_CODE2:  1,
                  COND_CODE1:  1,
                  COND_CODE0:  1,
                  INT_REQ:     1,
                  S_FLAG:      1,
                  P_FLAG:      1,
                  U_FLAG:      1,
                  O_FLAG:      1,
                  Z_FLAG:      1,
                  D_FLAG:      1,
                  I_FLAG:      1

    ; REDUCE UNTIL COMPLETE
    REDUCE:
            FPREM1
            FNSTSW  STATUS_WORD
            TEST    STATUS_WORD, MASK COND_CODE2
            JNZ     REDUCE

Figure 5-3. Status Word Record Definition

    SAMPLE        STRUC
    N_OBS         DD  ?              ; SHORT INTEGER
    MEAN          DQ  ?              ; DOUBLE REAL
    MODE          DW  ?              ; WORD INTEGER
    STD_DEV       DQ  ?              ; DOUBLE REAL
    ; ARRAY OF OBSERVATIONS -- WORD INTEGER
    TEST_SCORES   DW  1000 DUP (?)
    SAMPLE        ENDS

Figure 5-4. Structure Definition

5.1.4.3 Addressing Methods
80387 memory data can be accessed with any of the memory addressing methods provided by the ModR/M byte and (optionally) the SIB byte. This means that 80387 data types can be incorporated in data aggregates ranging from simple to complex according to the needs of the application. The addressing methods and the ASM386 notation used to specify them in instructions make the accessing of structures, arrays, arrays of structures, and other organizations direct and straightforward. Table 5-3 gives several examples of 80387 instructions coded with operands that illustrate different addressing methods.

Table 5-3. Addressing Method Examples

Coding                      Interpretation
FIADD ALPHA                 ALPHA is a simple scalar (mode is direct).
FDIVR ALPHA.BETA            BETA is a field in a structure that is "overlaid" on ALPHA (mode is direct).
FMUL QWORD PTR [BX]         BX contains the address of a long real variable (mode is register indirect).
FSUB ALPHA [SI]             ALPHA is an array and SI contains the offset of an array element from the start of the array (mode is indexed).
FILD [BP].BETA              BP contains the address of a structure on the CPU stack and BETA is a field in the structure (mode is based).
FBLD TBYTE PTR [BX] [DI]    BX contains the address of a packed decimal array and DI contains the offset of an array element (mode is based indexed).
5.1.5 Comparative Programming Example
Figures 5-5 and 5-6 show the PL/M-386 and ASM386 code for a simple 80387 program, called ARRSUM. The program references an array (X$ARRAY), which contains 0-100 single real values; the integer variable N$OF$X indicates the number of array elements the program is to consider. ARRSUM steps through X$ARRAY accumulating three sums:
• SUM$X, the sum of the array values
• SUM$INDEXES, the sum of each array value times its index, where the index of the first element is 1, the second is 2, etc.
• SUM$SQUARES, the sum of each array element squared

(A true program, of course, would go beyond these steps to store and use the results of these calculations.) The control word is set with the recommended values: round to nearest, 64-bit precision, interrupts enabled, and all exceptions masked except invalid operation. It is assumed that an exception handler has been written to field the invalid operation if it occurs, and that it is invoked by interrupt pointer 16. Either version of the program will run on an actual or an emulated 80387 without altering the code shown.

The PL/M-386 version of ARRSUM (Figure 5-5) is very straightforward and illustrates how easily the 80387 can be used in this language. After declaring variables, the program calls built-in procedures to initialize the processor (or its emulator) and to load the control word. The program clears the sum variables and then steps through X$ARRAY with a DO-loop. The loop control takes into account PL/M-386's practice of considering the index of the first element of an array to be 0. In the computation of SUM$INDEXES, the built-in procedure FLOAT converts I+1 from integer to real because the language does not support "mixed mode" arithmetic. One of the strengths of the NPX, of course, is that it does support arithmetic on mixed data types (because all values are converted internally to the 80-bit extended-precision real format).
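For comparison with the PL/M-386 and ASM386 versions, the same three accumulations can be sketched in C. This is an illustrative rendering, not one of the manual's figures: the struct and function names are hypothetical, and the running sums are kept in long double on the assumption that this mirrors the extended-precision registers used by the NPX.

```c
#include <stddef.h>

/* The three running sums computed by ARRSUM (names are illustrative). */
typedef struct {
    long double sum_x;        /* sum of the array values              */
    long double sum_indexes;  /* sum of each value times (index + 1)  */
    long double sum_squares;  /* sum of each value squared            */
} arr_sums;

/* Accumulate the sums over the first n elements of x. As in the
   PL/M-386 version, the array index starts at 0 while the weighting
   index starts at 1. */
arr_sums arrsum(const float *x, size_t n)
{
    arr_sums s = {0.0L, 0.0L, 0.0L};
    for (size_t i = 0; i < n; i++) {
        long double v = x[i];
        s.sum_x += v;
        s.sum_indexes += v * (long double)(i + 1);
        s.sum_squares += v * v;
    }
    return s;
}
```

For the array {1.0, 2.0, 3.0} the function returns sums of 6, 14, and 14. The explicit conversion of i + 1 plays the role of FLOAT in the PL/M-386 version.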
The ASM386 version (Figure 5-6) defines the external procedure INIT387, which makes the different initialization requirements of the processor and its emulator transparent to the source code. After defining the data and setting up the segment registers and stack pointer, the program calls INIT387 and loads the control word. The computation begins with the next three instructions, which clear three registers by loading (pushing) zeros onto the stack. As shown in Figure 5-7, these registers remain at the bottom of the stack throughout the computation while temporary values are pushed on and popped off the stack above them.

The program uses the CPU LOOP instruction to control its iteration through X_ARRAY; register ECX, which LOOP automatically decrements, is loaded with N_OF_X, the number of array elements to be summed. Register ESI is used to select (index) the array elements. The program steps through X_ARRAY from back to front, so ESI is initialized to point at the element just beyond the first element to be processed. The ASM386 TYPE operator is used to determine the number of bytes in each array element. This permits changing X_ARRAY to a double-precision real array by simply changing its definition (DD to DQ) and reassembling.
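The way the running sums stay at the bottom of the register stack while temporaries come and go above them can be traced with a small software model. The C sketch below is a simplification written for this explanation, not part of the manual: it models the stack as a plain array, omits stack-fault checking and tag handling, and uses hypothetical names throughout. The instruction sequence in the comments follows the loop body shown in Figure 5-7.

```c
#define NPX_DEPTH 8

/* Simplified model of the NPX register stack (no tags, no faults). */
typedef struct {
    long double reg[NPX_DEPTH];
    int depth;                       /* number of occupied registers */
} npx_stack;

/* Pointer to ST(i), counted from the current stack top. */
long double *npx_st(npx_stack *s, int i)
{
    return &s->reg[s->depth - 1 - i];
}

static void npx_push(npx_stack *s, long double v) { s->reg[s->depth++] = v; }
static void npx_pop(npx_stack *s)                 { s->depth--; }

/* FLDZ, FLDZ, FLDZ: ST(0) = sum of squares, ST(1) = sum of indexes,
   ST(2) = sum of x. */
void arrsum_init(npx_stack *s)
{
    s->depth = 0;
    npx_push(s, 0.0L);
    npx_push(s, 0.0L);
    npx_push(s, 0.0L);
}

/* One loop iteration for element value x with weight index_plus_1. */
void arrsum_step(npx_stack *s, float x, int index_plus_1)
{
    npx_push(s, x);                       /* FLD   x_array[esi] */
    *npx_st(s, 3) += *npx_st(s, 0);       /* FADD  ST(3), ST    */
    npx_push(s, *npx_st(s, 0));           /* FLD   ST           */
    *npx_st(s, 0) *= *npx_st(s, 0);       /* FMUL  ST, ST       */
    *npx_st(s, 2) += *npx_st(s, 0);       /* FADDP ST(2), ST    */
    npx_pop(s);
    *npx_st(s, 0) *= index_plus_1;        /* FIMUL n_of_x       */
    *npx_st(s, 2) += *npx_st(s, 0);       /* FADDP ST(2), ST    */
    npx_pop(s);
}
```

Calling arrsum_init and then arrsum_step with x = 2.5 and a weight of 20 leaves 6.25, 50.0, and 2.5 in ST(0), ST(1), and ST(2), the same values Figure 5-7 shows at the end of the first iteration.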
XENIX286 PL/M-386 DEBUG X291a COMPILATION OF MODULE ARRAYSUM
OBJECT MODULE PLACED IN arraysum.obj
COMPILER INVOKED BY: plm386 arraysum.plm

    /***********************************************************
     *                     ARRAYSUM MODULE                     *
     ***********************************************************/

    array$sum: do;
        declare (sum$x, sum$indexes, sum$squares) real;
        declare x$array(100) real;
        declare (n$of$x, i) integer;
        declare control$387 literally '033eh';

        /* Assume x$array and n$of$x are initialized */

        call init$real$math$unit;
        call set$real$mode(control$387);

        /* Clear sums */
        sum$x, sum$indexes, sum$squares = 0.0;

        /* Loop through array, accumulating sums */
        do i = 0 to n$of$x - 1;
            sum$x = sum$x + x$array(i);
            sum$indexes = sum$indexes + (x$array(i)*float(i+1));
            sum$squares = sum$squares + (x$array(i)*x$array(i));
        end;

        /* etc. */
    end array$sum;

MODULE INFORMATION:
    CODE AREA SIZE      = 000000A0H    160D
    CONSTANT AREA SIZE  = 00000004H      4D
    VARIABLE AREA SIZE  = 000001A4H    420D
    MAXIMUM STACK SIZE  = 00000004H      4D
    32 LINES READ
    0 PROGRAM WARNINGS
    0 PROGRAM ERRORS
DICTIONARY SUMMARY:
    8KB MEMORY USED
    0KB DISK SPACE USED
END OF PL/M-386 COMPILATION

Figure 5-5. Sample PL/M-386 Program

XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE ARRAYSUM
OBJECT MODULE PLACED IN arraysum.obj
ASSEMBLER INVOKED BY: asm386 arraysum.asm

            name    arraysum

    ; Define initialization routine
            extrn   init387:far

    ; Allocate space for data
    data    segment rw public
    control_387     dw      033eh
    n_of_x          dd      ?
    x_array         dd      100 dup (?)
    sum_squares     dd      ?
    sum_indexes     dd      ?
    sum_x           dd      ?
    data    ends

    ; Allocate CPU stack space
    stack   stackseg 400

    ; Begin code
    code    segment er public
            assume  ds:data, ss:stack

    start:
            mov     ax, data
            mov     ds, ax
            mov     ax, stack
            mov     eax, 0h
            mov     ss, ax
            mov     esp, stackstart stack

    ; Assume x_array and n_of_x have been initialized

    ; Prepare the 80387 or its emulator
            call    init387
            fldcw   control_387

    ; Clear three registers to hold running sums
            fldz
            fldz
            fldz

    ; Setup ECX as loop counter and ESI as index into x_array
            mov     ecx, n_of_x
            imul    ecx
            mov     esi, eax

    ; ESI now contains index of last element + 1
    ; Loop through x_array and accumulate sum
    sum_next:
    ; backup one element and push on the stack
            sub     esi, type x_array
            fld     x_array[esi]

    ; add to the sum and duplicate x on the stack
            fadd    st(3), st
            fld     st

    ; square it and add into the sum of squares
            fmul    st, st
            faddp   st(2), st

    ; multiply x by (index+1), add into the sum of indexes, and discard
    ; (these two instructions follow Figure 5-7)
            fimul   n_of_x
            faddp   st(2), st

    ; reduce index for next iteration
            dec     n_of_x
            loop    sum_next

    ; Pop sums into memory
    pop_results:
            fstp    sum_squares
            fstp    sum_indexes
            fstp    sum_x
            fwait

    ; Etc.

    code    ends
            end     start, ds:data, ss:stack

ASSEMBLY COMPLETE, NO WARNINGS, NO ERRORS.

Figure 5-6. Sample ASM386 Program

    After FLDZ, FLDZ, FLDZ
        ST(0)   0.0    SUM_SQUARES
        ST(1)   0.0    SUM_INDEXES
        ST(2)   0.0    SUM_X

    After FLD X_ARRAY[SI]
        ST(0)   2.5    X_ARRAY(19)
        ST(1)   0.0    SUM_SQUARES
        ST(2)   0.0    SUM_INDEXES
        ST(3)   0.0    SUM_X

    After FADD ST(3), ST
        ST(0)   2.5    X_ARRAY(19)
        ST(1)   0.0    SUM_SQUARES
        ST(2)   0.0    SUM_INDEXES
        ST(3)   2.5    SUM_X

    After FLD ST
        ST(0)   2.5    X_ARRAY(19)
        ST(1)   2.5    X_ARRAY(19)
        ST(2)   0.0    SUM_SQUARES
        ST(3)   0.0    SUM_INDEXES
        ST(4)   2.5    SUM_X

    After FMUL ST, ST
        ST(0)   6.25   X_ARRAY(19)^2
        ST(1)   2.5    X_ARRAY(19)
        ST(2)   0.0    SUM_SQUARES
        ST(3)   0.0    SUM_INDEXES
        ST(4)   2.5    SUM_X

    After FADDP ST(2), ST
        ST(0)   2.5    X_ARRAY(19)
        ST(1)   6.25   SUM_SQUARES
        ST(2)   0.0    SUM_INDEXES
        ST(3)   2.5    SUM_X

    After FIMUL N_OF_X
        ST(0)   50.0   X_ARRAY(19)*20
        ST(1)   6.25   SUM_SQUARES
        ST(2)   0.0    SUM_INDEXES
        ST(3)   2.5    SUM_X

    After FADDP ST(2), ST
        ST(0)   6.25   SUM_SQUARES
        ST(1)   50.0   SUM_INDEXES
        ST(2)   2.5    SUM_X

Figure 5-7. Instructions and Register Stack

Figure 5-7 shows the effect of the instructions in the program loop on the NPX register stack. The figure assumes that the program is in its first iteration, that N_OF_X is 20, and that X_ARRAY(19) (the 20th element) contains the value 2.5. When the loop terminates, the three sums are left as the top stack elements so that the program ends by simply popping them into memory variables.

5.1.6 80387 Emulation
Programming applications to execute on 80386 systems both with and without an 80387 is made much easier by the existence of an 80387 emulator for 80386 systems. The Intel EMUL387 emulator offers a complete software counterpart to the 80387 hardware; NPX instructions can be simply emulated in software rather than being executed in hardware. With software emulation, the distinction between 80386 systems with or without an 80387 is reduced to a simple performance differential. Identical numeric programs will simply execute more slowly (using software emulation of NPX instructions) on 80386 systems without an 80387 than on an 80386/80387 system executing NPX instructions directly.
When incorporated into the systems software, the emulation of NPX instructions on 80386 systems is completely transparent to the applications programmer. Applications software needs no special libraries, linking, or other activity to allow it to run on an 80386 with 80387 emulation.

To the applications programmer, the development of programs for 80386 systems is the same whether the 80387 NPX hardware is available or not. The full 80387 instruction set is available for use, with NPX instructions being either emulated or executed directly. Applications programmers need not be concerned with the hardware configuration of the computer systems on which their applications will eventually run.

For systems programmers, details relating to 80387 emulators are described in Chapter 6.

The EMUL387 software emulator for 80386 systems is available from Intel as a separate program product.

5.2 CONCURRENT PROCESSING WITH THE 80387
Because the 80386 CPU and the 80387 NPX have separate execution units, it is possible for the NPX to execute numeric instructions in parallel with instructions executed by the CPU. This simultaneous execution of different instructions is called concurrency.

No special programming techniques are required to gain the advantages of concurrent execution; numeric instructions for the NPX are simply placed in line with the instructions for the CPU. CPU and numeric instructions are initiated in the same order as they are encountered by the CPU in its instruction stream. However, because numeric operations performed by the NPX generally require more time than operations performed by the CPU, the CPU can often execute several of its instructions before the NPX completes a numeric instruction previously initiated.
This concurrency offers obvious advantages in terms of execution performance, but concurrency also imposes several rules that must be observed in order to assure proper synchronization of the 80386 CPU and 80387 NPX.

All Intel high-level languages automatically provide for and manage concurrency in the NPX. Assembly-language programmers, however, must understand and manage some areas of concurrency in exchange for the flexibility and performance of programming in assembly language. This section is for the assembly-language programmer or well-informed high-level-language programmer.

5.2.1 Managing Concurrency
Concurrent execution of the host and 80387 is easy to establish and maintain. The activities of numeric programs can be split into two major areas: program control and arithmetic. The program control part performs activities such as deciding what functions to perform, calculating addresses of numeric operands, and loop control. The arithmetic part simply adds, subtracts, multiplies, and performs other operations on the numeric operands. The NPX and host are designed to handle these two parts separately and efficiently.

Concurrency management is required to check for an exception before letting the 80386 change a value just used by the 80387. Almost any numeric instruction can, under the wrong circumstances, produce a numeric exception. For programmers in higher-level languages, all required synchronization is automatically provided by the appropriate compiler. In assembly-language programs, exception synchronization remains the responsibility of the programmer.

A complication is that a programmer may not expect a numeric program to cause numeric exceptions, yet in some systems they may happen regularly. To better understand these points, consider what can happen when the NPX detects an exception.
Depending on options determined by the software system designer, the NPX can perform one of two things when a numeric exception occurs:

• The NPX can provide a default fix-up for selected numeric exceptions. Programs can mask individual exception types to indicate that the NPX should generate a safe, reasonable result whenever that exception occurs. The default exception fix-up activity is treated by the NPX as part of the instruction causing the exception; no external indication of the exception is given. When exceptions are detected, a flag is set in the numeric status register, but no information regarding where or when is available. If the NPX performs its default action for all exceptions, then the need for exception synchronization is not manifest. However, as will be shown later, this is not sufficient reason to ignore exception synchronization when designing programs that use the 80387.

• As an alternative to the NPX default fix-up of numeric exceptions, the 80386 CPU can be notified whenever an exception occurs. When a numeric exception is unmasked and the exception occurs, the NPX stops further execution of the numeric instruction and signals this event to the CPU. On the next occurrence of an ESC or WAIT instruction, the CPU traps to a software exception handler. The exception handler can then implement any sort of recovery procedures desired for any numeric exception detectable by the NPX. Some ESC instructions do not check for exceptions. These are the nonwaiting forms FNINIT, FNSTENV, FNSAVE, FNSTSW, FNSTCW, and FNCLEX.

When the NPX signals an unmasked exception condition, it is requesting help. The fact that the exception was unmasked indicates that further numeric program execution under the arithmetic and programming rules of the NPX is unreasonable.

If concurrent execution is allowed, the state of the CPU when it recognizes the exception is undefined.
The CPU may have changed many of its internal registers and be executing a totally different program by the time the exception occurs. To handle this situation, the NPX has special registers updated at the start of each numeric instruction to describe the state of the numeric program when the failed instruction was attempted.

Exception synchronization ensures that the NPX is in a well-defined state after an unmasked numeric exception occurs. Without a well-defined state, it would be impossible for exception recovery routines to determine why the numeric exception occurred, or to recover successfully from the exception.

The following two sections illustrate the need to always consider exception synchronization when writing 80387 code, even when the code is initially intended for execution with exceptions masked. If the code is later moved to an environment where exceptions are unmasked, the same code may not work correctly. An example of how some instructions written without exception synchronization will work initially, but fail when moved into a new environment, is shown in Figure 5-8.

    INCORRECT ERROR SYNCHRONIZATION

        FILD    COUNT       ; NPX instruction
        INC     COUNT       ; CPU instruction alters operand
        FSQRT               ; subsequent NPX instruction -- error from
                            ; previous NPX instruction detected here

    PROPER ERROR SYNCHRONIZATION

        FILD    COUNT       ; NPX instruction
        FSQRT               ; subsequent NPX instruction -- error from
                            ; previous NPX instruction detected here
        INC     COUNT       ; CPU instruction alters operand

Figure 5-8. Exception Synchronization Examples

5.2.1.1 INCORRECT EXCEPTION SYNCHRONIZATION

In Figure 5-8, three instructions are shown to load an integer, calculate its square root, then increment the integer. The 80386-to-80387 interface and synchronous execution of the NPX emulator will allow this program to execute correctly when no exceptions occur on the FILD instruction. This situation changes if the 80387 numeric register stack is extended to memory.
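The two orderings of Figure 5-8 can be compared with a toy model. The sketch below is written in Python rather than assembly and is not taken from the manual. The class ToyNPX, the function run, the starting value 41, and the rule "a pending exception is reported at the next NPX instruction" are simplifying assumptions, chosen only to show which value of COUNT the exception handler observes in each ordering.

```python
# Toy model (hypothetical, not Intel code): an unmasked exception raised by
# one NPX instruction is reported to the CPU only when the next NPX
# instruction (ESC or WAIT) is encountered.

class ToyNPX:
    def __init__(self):
        self.pending = False          # stands in for an asserted ERROR# signal

    def esc(self, memory, faults, handler):
        """Start an NPX instruction, reporting any earlier exception first."""
        if self.pending:
            self.pending = False
            handler(memory)           # the CPU traps here, at the later instruction
        if faults:
            self.pending = True       # recorded now, reported later


def run(order):
    """Run FILD/INC/FSQRT in the given order; return COUNT as the handler saw it."""
    memory = {"COUNT": 41}
    seen = []
    npx = ToyNPX()

    def handler(mem):
        seen.append(mem["COUNT"])

    for op in order:
        if op == "FILD":
            npx.esc(memory, faults=True, handler=handler)    # assume FILD faults
        elif op == "FSQRT":
            npx.esc(memory, faults=False, handler=handler)
        elif op == "INC":
            memory["COUNT"] += 1                             # CPU instruction
    return seen[0]


incorrect = run(["FILD", "INC", "FSQRT"])   # handler sees the altered operand
proper = run(["FILD", "FSQRT", "INC"])      # handler sees the original operand
print(incorrect, proper)
```

In this model the handler reads 42 in the first ordering and 41 in the second, which is the difference a recovery routine would depend on when it reloads the operand.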
To extend the NPX stack to memory, the invalid exception is unmasked. A push to a full register or pop from an empty register sets SF and causes an invalid exception. The recovery routine for the exception must recognize this situation, fix up the stack, then perform the original operation. The recovery routine will not work correctly in the first example shown in the figure. The problem is that the value of COUNT is incremented before the NPX can signal the exception to the CPU. Because COUNT is incremented before the exception handler is invoked, the recovery routine will load an incorrect value of COUNT, causing the program to fail or behave unreliably.

5.2.1.2 PROPER EXCEPTION SYNCHRONIZATION

Exception synchronization relies on the WAIT instruction and the BUSY# and ERROR# signals of the 80387. When an unmasked exception occurs in the 80387, it asserts the ERROR# signal, signaling to the CPU that a numeric exception has occurred. The next time the CPU encounters a WAIT instruction or an exception-checking ESC instruction, the CPU acknowledges the ERROR# signal by trapping automatically to Interrupt #16, the processor-extension exception vector. If the following ESC or WAIT instruction is properly placed, the CPU will not yet have disturbed any information vital to recovery from the exception.

CHAPTER 6
SYSTEM-LEVEL NUMERIC PROGRAMMING

System programming for 80387 systems requires a more detailed understanding of the 80387 NPX than does application programming. Such things as emulation, initialization, exception handling, and data and error synchronization are all the responsibility of the systems programmer. These topics are covered in detail in the sections that follow.

6.1 80386/80387 ARCHITECTURE

On a software level, the 80387 NPX appears as an extension of the 80386 CPU. On the hardware level, however, the mechanisms by which the 80386 and 80387 interact are more complex.
This section describes how the 80387 NPX and 80386 CPU interact and points out features of this interaction that are of interest to systems programmers.

6.1.1 Instruction and Operand Transfer

All transfers of instructions and operands between the 80387 and system memory are performed by the 80386 using I/O bus cycles. The 80387 appears to the CPU as a special peripheral device. It is special in two respects: the CPU initiates I/O automatically when it encounters ESC instructions, and the CPU uses reserved I/O addresses to communicate with the 80387. These I/O operations are completely transparent to software.

Because the 80386 actually performs all transfers between the 80387 and memory, no additional bus drivers, controllers, or other components are necessary to interface the 80387 NPX to the local bus. The 80387 can utilize instructions and operands located in any memory accessible to the 80386 CPU.

6.1.2 Independent of CPU Addressing Modes

Unlike the 80287, the 80387 is not sensitive to the addressing and memory management of the CPU. The 80387 operates the same regardless of whether the 80386 CPU is operating in real-address mode, in protected mode, or in virtual 8086 mode.

The instruction FSETPM that was necessary in 80286/80287 systems to set the 80287 into protected mode is not needed for the 80387. The 80387 treats this instruction as a no-op.

Because the 80386 actually performs all transfers between the 80387 and memory, 80387 instructions can utilize any memory location accessible by the task currently executing on the 80386. When operating in protected mode, all references to memory operands are automatically verified by the 80386's memory management and protection mechanisms as for any other memory references by the currently-executing task. Protection violations associated with NPX instructions automatically cause the 80386 to trap to an appropriate exception handler.
To the numerics programmer, the operating modes of the 80386 affect only the manner in which the NPX instruction and data pointers are represented in memory following an FSAVE or FSTENV instruction. Each of these instructions produces one of four formats depending on both the operating mode and the operand-size attribute in effect for the instruction. The differences are detailed in the discussion of the FSAVE and FSTENV instructions in Chapter 4.

6.1.3 Dedicated I/O Locations

The 80387 NPX does not require that any memory addresses be set aside for special purposes. The 80387 does make use of I/O port addresses, but these are 32-bit addresses with the high-order bit set (i.e., 80000000H and above); therefore, these I/O operations are completely transparent to the 80386 software. Because these addresses are beyond the 64 Kbyte I/O addressing limit of I/O instructions, 80386 programs cannot reference these reserved I/O addresses directly.

6.2 PROCESSOR INITIALIZATION AND CONTROL

One of the principal responsibilities of systems software is the initialization, monitoring, and control of the hardware and software resources of the system, including the 80387 NPX. In this section, issues related to system initialization and control are described, including recognition of the NPX, emulation of the 80387 NPX in software if the hardware is not available, and the handling of exceptions that may occur during the execution of the 80387.

6.2.1 System Initialization

During initialization of an 80386 system, systems software must

• Recognize the presence or absence of the NPX.
• Set flags in the 80386 MSW to reflect the state of the numeric environment.
• Initialize the NPX, if an 80387 NPX is present in the system.

All of these activities can be quickly and easily performed as part of the overall system initialization.
6.2.2 Hardware Recognition of the NPX

The 80386 identifies the type of its coprocessor (80287 or 80387) by sampling its ERROR# input some time after the falling edge of RESET and before executing the first ESC instruction. The 80287 keeps its ERROR# output in inactive state after hardware reset; the 80387 keeps its ERROR# output in active state after hardware reset. The 80386 records this difference in the ET bit of control register zero (CR0). The 80386 subsequently uses ET to control its interface with the coprocessor. If ET is set, it employs the 32-bit protocol of the 80387; if ET is not set, it employs the 16-bit protocol of the 80287.

Systems software can (if necessary) change the value of ET. There are three reasons that ET may not be set:

1. An 80287 is actually present.
2. No coprocessor is present.
3. An 80387 is present but it is connected in a nonstandard manner that does not trigger the setting of ET.

An example of case three is the PC/AT-compatible design described in Appendix F. In such cases, initialization software may need to change the value of ET.

6.2.3 Software Recognition of the NPX

Figure 6-1 shows an example of a recognition routine that determines whether an NPX is present, and distinguishes between the 80387 and the 8087/80287. This routine can be executed on any 80386, 80286, or 8086 hardware configuration that has an NPX socket.

The example guards against the possibility of accidentally reading an expected value from a floating data bus when no NPX is present. Data read from a floating bus is undefined. By expecting to read a specific bit pattern from the NPX, the routine protects itself from the indeterminate state of the bus. The example also avoids depending on any values in reserved bits, thereby maintaining compatibility with future numerics coprocessors.
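The decisions made by the routine in Figure 6-1 can be modeled separately from the hardware accesses. The following Python sketch is an illustration only: the function name classify_npx and its parameters are hypothetical, and the values passed in stand for the words that FNSTSW and FNSTCW would store and for the outcome of the infinity comparison. The masks 103FH and 3FH and the initial pattern 5A5AH are those used in the figure.

```python
def classify_npx(status_word, control_word, infinities_equal):
    """Mirror the tests of the Figure 6-1 routine on values read back.

    status_word      -- word stored by FNSTSW after FNINIT
    control_word     -- word stored by FNSTCW after FNINIT
    infinities_equal -- True if comparing +infinity with -infinity
                        reported equality
    """
    # After FNINIT the low byte of a real status word is all zeroes.
    # A floating bus leaves the 5A5AH pattern (or garbage) in memory.
    if status_word & 0x00FF != 0:
        return "none"
    # Under the mask 103FH the control word must read back as 003FH,
    # which checks that both ones and zeroes were written by an NPX.
    if control_word & 0x103F != 0x003F:
        return "none"
    # With the default control word the 8087/80287 report +inf = -inf;
    # the 80387 reports them as different.
    return "8087/80287" if infinities_equal else "80387"


print(classify_npx(0x5A5A, 0x5A5A, False))   # nothing answered the store
print(classify_npx(0x0000, 0x037F, False))   # 80387 control word after FNINIT
```

Keeping the classification in a pure function like this makes the bit tests easy to check against sample values before they are committed to assembly code.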
6.2.4 Configuring the Numerics Environment

Once the 80386 CPU has determined the presence or absence of the 80387 or 80287 NPX, the 80386 must set either the MP or the EM bit in its own control register zero (CR0) accordingly. The initialization routine can either

• Set the MP bit in CR0 to allow numeric instructions to be executed directly by the NPX.
• Set the EM bit in CR0 to permit software emulation of the numeric instructions.

The MP (monitor coprocessor) flag of CR0 indicates to the 80386 whether an NPX is physically available in the system. The MP flag controls the function of the WAIT instruction. When executing a WAIT instruction, the 80386 tests the task switched (TS) bit only if MP is set; if it finds TS set under these conditions, the CPU traps to exception #7.

The Emulation Mode (EM) bit of CR0 indicates to the 80386 whether NPX functions are to be emulated. If the CPU finds EM set when it executes an ESC instruction, program control is automatically trapped to exception #7, giving the exception handler the opportunity to emulate the functions of an 80387.

For correct 80386 operation, the EM bit must never be set concurrently with MP. The EM and MP bits of the 80386 are described in more detail in the 80386 Programmer's Reference Manual. More information on software emulation for the 80387 NPX is described in the "80387 Emulation" section later in this chapter. In any case, if ESC instructions are to be executed, either the MP or EM bit must be set, but not both.

    $title('Test for presence of a Numerics Chip, Revision 1.0')

    stack   segment stack 'stack'
            dw      100 dup (?)
    sst     dw      ?
    stack   ends

    data    segment public 'data'
    temp    dw      0h
    data    ends

    dgroup  group   data, stack
    cgroup  group   code

    code    segment public 'code'
            assume  cs:cgroup, ds:dgroup

    start:
    ;
    ;       Look for an 8087, 80287, or 80387 NPX.
    ;       Note that we cannot execute WAIT on 8086/88 if no 8087 is present.
    ;
    test_npx:
            fninit                          ; Must use non-wait form
            mov     si,offset dgroup:temp
            mov     word ptr [si],5A5AH     ; Initialize temp to non-zero value
            fnstsw  [si]                    ; Must use non-wait form of fstsw
                                            ; It is not necessary to use a WAIT instruction
                                            ; after fnstsw or fnstcw.  Do not use one here.
            cmp     byte ptr [si],0         ; See if correct status with zeroes was read
            jne     no_npx                  ; Jump if not a valid status word, meaning no NPX
    ;
    ;       Now see if ones can be correctly written from the control word.
    ;
            fnstcw  [si]                    ; Look at the control word; do not use WAIT form
                                            ; Do not use a WAIT instruction here!
            mov     ax,[si]                 ; See if ones can be written by NPX
            and     ax,103fh                ; See if selected parts of control word look OK
            cmp     ax,3fh                  ; Check that ones and zeroes were correctly read
            jne     no_npx                  ; Jump if no NPX is installed
    ;
    ;       Some numerics chip is installed.  NPX instructions and WAIT are now safe.
    ;       See if the NPX is an 8087, 80287, or 80387.
    ;       This code is necessary if a denormal exception handler is used or the
    ;       new 80387 instructions will be used.
    ;
            fld1                            ; Must use default control word from FNINIT
            fldz                            ; Form infinity
            fdiv                            ; 8087/287 says +inf = -inf
            fld     st                      ; Form negative infinity
            fchs                            ; 80387 says +inf <> -inf
            fcompp                          ; See if they are the same and remove them
            fstsw   [si]                    ; Look at status from FCOMPP
            mov     ax,[si]
            sahf                            ; See if the infinities matched
            je      found_87_287            ; Jump if 8087/287 is present
    ;
    ;       An 80387 is present.  If denormal exceptions are used for an 8087/287,
    ;       they must be masked.  The 80387 will automatically normalize denormal
    ;       operands faster than an exception handler can.
    ;
            jmp     found_387
    no_npx:
    ;       set up for no NPX
            jmp     exit
    found_87_287:
    ;       set up for 87/287
            jmp     exit
    found_387:
    ;       set up for 387
    exit:
    code    ends
            end     start,ds:dgroup,ss:dgroup:sst

Figure 6-1. Software Routine to Recognize the 80287

6.2.5 Initializing the 80387

Initializing the 80387 NPX simply means placing the NPX in a known state unaffected by any activity performed earlier. A single FNINIT instruction performs this initialization. All the error masks are set, all registers are tagged empty, TOP is set to zero, and default rounding and precision controls are set. Table 6-1 shows the state of the 80387 NPX following FINIT or FNINIT. This state is compatible with that of the 80287 after FINIT or after hardware RESET.

The FNINIT instruction does not leave the 80387 in the same state as that which results from the hardware RESET signal. Following a hardware RESET signal, such as after initial power-up, the state of the 80387 differs in the following respects:

1. The mask bit for the invalid-operation exception is reset.
2. The invalid-operation exception flag is set.
3. The exception-summary bit is set (along with its mirror image, the B-bit).

These settings cause assertion of the ERROR# signal as described previously. The FNINIT instruction must be used to change the 80387 state to one compatible with the 80287.

Table 6-1. NPX Processor State Following Initialization

    Field                       Value     Interpretation
    --------------------------  --------  -----------------------
    Control Word
      (Infinity Control)*       0         Affine
      Rounding Control          00        Round to nearest
      Precision Control         11        64 bits
      Exception Masks           111111    All exceptions masked
    Status Word
      (Busy)                    0         -
      Condition Code            0000      -
      Stack Top                 000       Register 0 is stack top
      Exception Summary         0         No exceptions
      Stack Flag                0         -
      Exception Flags           000000    No exceptions
    Tag Word
      Tags                      11        Empty
    Registers                   N.C.      Not changed
    Exception Pointers
      Instruction Code          N.C.      Not changed
      Instruction Address       N.C.      Not changed
      Operand Address           N.C.      Not changed

*The 80387 does not have infinity control. This value is listed to emphasize that programs written for the 80287 may not behave the same on the 80387 if they depend on this bit.

6.2.6 80387 Emulation

If it is determined that no 80387 NPX is available in the system, systems software may decide to emulate ESC instructions in software. This emulation is easily supported by the 80386 hardware, because the 80386 can be configured to trap to a software emulation routine whenever it encounters an ESC instruction in its instruction stream.

Whenever the 80386 CPU encounters an ESC instruction, and its MP and EM status bits are set appropriately (MP=0, EM=1), the 80386 automatically traps to interrupt #7, the "processor extension not available" exception. The return link stored on the stack points to the first byte of the ESC instruction, including the prefix byte(s), if any. The exception handler can use this return link to examine the ESC instruction and proceed to emulate the numeric instruction in software.
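The effect of the CR0 bits on ESC and WAIT instructions can be summarized in a small model. The Python sketch below is illustrative only; the function names are hypothetical. The bit positions (MP = bit 1, EM = bit 2, TS = bit 3, ET = bit 4) follow the 80386 definition of CR0. The WAIT rule (trap when both MP and TS are set) and the ESC rule for EM are as described in this chapter; the fact that an ESC instruction also traps when TS is set is standard 80386 behavior and is included here as an assumption beyond this section.

```python
CR0_MP = 1 << 1   # monitor coprocessor
CR0_EM = 1 << 2   # emulation mode
CR0_TS = 1 << 3   # task switched
CR0_ET = 1 << 4   # extension type (set for the 80387 protocol)


def traps_to_exception_7(cr0, instruction):
    """Return True if the instruction would trap to exception #7."""
    mp = bool(cr0 & CR0_MP)
    em = bool(cr0 & CR0_EM)
    ts = bool(cr0 & CR0_TS)
    if instruction == "ESC":
        # EM set: every ESC is handed to the emulator.
        # TS set: the NPX context may belong to another task.
        return em or ts
    if instruction == "WAIT":
        # WAIT looks at TS only when MP is set.
        return mp and ts
    raise ValueError("expected 'ESC' or 'WAIT'")


def coprocessor_protocol(cr0):
    """Return the interface protocol selected by the ET bit."""
    return "80387 (32-bit)" if cr0 & CR0_ET else "80287 (16-bit)"


emulated = CR0_EM                 # MP=0, EM=1: software emulation
hardware = CR0_MP | CR0_ET        # MP=1, EM=0: 80387 present
print(traps_to_exception_7(emulated, "ESC"),
      traps_to_exception_7(hardware, "ESC"))
```

With the emulation setting every ESC instruction reaches the exception #7 handler, while with the hardware setting ESC instructions go to the NPX unless a task switch has set TS.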
The emulator must step the return pointer so that, upon return from the exception handler, execution can resume at the first instruction following the ESC instruction.

To an application program, execution on an 80386 system with 80387 emulation is almost indistinguishable from execution on a system with an 80387, except for the difference in execution speeds.

There are several important considerations when using emulation on an 80386 system:

• When operating in protected mode, numeric applications using the emulator must be executed in execute-readable code segments. Numeric software cannot be emulated if it is executed in execute-only code segments. This is because the emulator must be able to examine the particular numeric instruction that caused the emulation trap.

• Only privileged tasks can place the 80386 in emulation mode. The instructions necessary to place the 80386 in emulation mode are privileged instructions, and are not typically accessible to an application.

An emulator package (EMUL387) that runs on 80386 systems is available from Intel. This emulation package operates in both real and protected mode as well as in virtual 8086 mode, providing a complete functional equivalent for the 80387 emulated in software.

When using the EMUL387 emulator, writers of numeric exception handlers should be aware of one slight difference between the emulated 80387 and the 80387 hardware:

• On the 80387 hardware, exception handlers are invoked by the 80386 at the first WAIT or ESC instruction following the instruction causing the exception. The return link, stored on the 80386 stack, points to this second WAIT or ESC instruction where execution will resume following a return from the exception handler.

• Using the EMUL387 emulator, numeric exception handlers are invoked from within the emulator itself.
The return link stored on the stack when the exception handler is invoked will therefore point back to the EMUL387 emulator, rather than to the program code actually being executed (emulated). An IRET return from the exception handler returns to the emulator, which then returns immediately to the emulated program. This added layer of indirection should not cause confusion, however, because the instruction causing the exception can always be identified from the 80387's instruction and data pointers.

6.2.7 Handling Numerics Exceptions

Once the 80387 has been initialized and normal execution of applications has commenced, the 80387 NPX may occasionally require attention in order to recover from numeric processing exceptions. This section provides details for writing software exception handlers for numeric exceptions. Numeric processing exceptions have already been introduced in Chapter 3.

The 80387 NPX can take one of two actions when it recognizes a numeric exception:

• If the exception is masked, the NPX will automatically perform its own masked exception response, correcting the exception condition according to fixed rules, and then continuing with its instruction execution.

• If the exception is unmasked, the NPX signals the exception to the 80386 CPU using the ERROR# status line between the two processors. Each time the 80386 encounters an ESC or WAIT instruction in its instruction stream, the CPU checks the condition of this ERROR# status line. If ERROR# is active, the CPU automatically traps to Interrupt vector #16, the Processor Extension Error trap.

Interrupt vector #16 typically points to a software exception handler, which may or may not be a part of systems software. This exception handler takes the form of an 80386 interrupt procedure.

When handling numeric errors, the CPU has two responsibilities:

• The CPU must not disturb the numeric context when an error is detected.
• The CPU must clear the error and attempt recovery from the error.

Although the manner in which programmers may treat these responsibilities varies from one implementation to the next, most exception handlers will include these basic steps:

• Store the NPX environment (control, status, and tag words, operand and instruction pointers) as it existed at the time of the exception.
• Clear the exception bits in the status word.
• Enable interrupts on the CPU.
• Identify the exception by examining the status and control words in the saved environment.
• Take some system-dependent action to rectify the exception.
• Return to the interrupted program and resume normal execution.

6.2.8 Simultaneous Exception Response

In cases where multiple exceptions arise simultaneously, the 80387 signals one exception according to the precedence shown at the end of Chapter 3. This means, for example, that an SNaN divided by zero results in an invalid operation, not in a zero divide exception.

6.2.9 Exception Recovery Examples

Recovery routines for NPX exceptions can take a variety of forms. They can change the arithmetic and programming rules of the NPX. These changes may redefine the default fix-up for an error, change the appearance of the NPX to the programmer, or change how arithmetic is defined on the NPX.

A change to an exception response might be to automatically normalize all denormals loaded from memory. A change in appearance might be extending the register stack into memory to provide an "infinite" number of numeric registers. The arithmetic of the NPX can be changed to automatically extend the precision and range of variables when exceeded. All these functions can be implemented on the NPX via numeric exceptions and associated recovery routines in a manner transparent to the application programmer.
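The step of identifying the exception from the saved status and control words (section 6.2.7) is a matter of bit masking. The sketch below, in Python for illustration, is not taken from the manual, and its function names are hypothetical. It assumes the standard layout in which the six exception flags occupy bits 0 through 5 of the status word and the corresponding masks occupy bits 0 through 5 of the control word. It does not model the precedence rules of Chapter 3.

```python
# Exception flag bits, low six bits of the status word (same positions
# as the mask bits in the control word).
EXCEPTION_BITS = [
    (0x01, "invalid operation"),
    (0x02, "denormalized operand"),
    (0x04, "zero divide"),
    (0x08, "overflow"),
    (0x10, "underflow"),
    (0x20, "precision"),
]


def unmasked_exceptions(status_word, control_word):
    """List the exceptions that are flagged in the status word and
    not masked in the control word."""
    pending = status_word & ~control_word & 0x3F
    return [name for bit, name in EXCEPTION_BITS if pending & bit]


def clear_exception_flags(status_word):
    """Clear the low byte of a saved status word image, as the handlers
    in Chapter 7 do before reloading the image."""
    return status_word & 0xFF00


saved_status = 0x8085      # example image: B, ES, zero divide, invalid
saved_control = 0x037B     # example image: only zero divide unmasked
print(unmasked_exceptions(saved_status, saved_control))
print(hex(clear_exception_flags(saved_status)))
```

A handler would typically dispatch on the list returned by unmasked_exceptions and write the cleared status image back before restoring the environment.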
Some other possible application-dependent actions might include:

• Incrementing an exception counter for later display or printing
• Printing or displaying diagnostic information (e.g., the 80387 environment and registers)
• Aborting further execution
• Storing a diagnostic value (a NaN) in the result and continuing with the computation

Notice that an exception may or may not constitute an error, depending on the application. Once the exception handler corrects the condition causing the exception, the floating-point instruction that caused the exception can be restarted, if appropriate. This cannot be accomplished using the IRET instruction, however, because the trap occurs at the ESC or WAIT instruction following the offending ESC instruction. The exception handler must obtain (using FSAVE or FSTENV) the address of the offending instruction in the task that initiated it, make a copy of it, execute the copy in the context of the offending task, and then return via IRET to the current CPU instruction stream.

In order to correct the condition causing the numeric exception, exception handlers must recognize the precise state of the NPX at the time the exception handler was invoked, and be able to reconstruct the state of the NPX when the exception initially occurred. To reconstruct the state of the NPX, programmers must understand when, during the execution of an NPX instruction, exceptions are actually recognized.

Invalid operation, zero divide, and denormalized exceptions are detected before an operation begins, whereas overflow, underflow, and precision exceptions are not raised until a true result has been computed. When a before exception is detected, the NPX register stack and memory have not yet been updated, and appear as if the offending instruction has not been executed. When an after exception is detected, the register stack and memory appear as if the instruction has run to completion; i.e., they may be updated.
(However, in a store or store-and-pop operation, unmasked overflow/underflow is handled like a before exception; memory is not updated and the stack is not popped.) The programming examples contained in Chapter 7 include an outline of several exception handlers to process numeric exceptions for the 80387.

CHAPTER 7
NUMERIC PROGRAMMING EXAMPLES

The following sections contain examples of numeric programs for the 80387 NPX written in ASM386. These examples are intended to illustrate some of the techniques for programming the 80386/80387 computing system for numeric applications.

7.1 CONDITIONAL BRANCHING EXAMPLE

As discussed in Chapter 2, several numeric instructions post their results to the condition code bits of the 80387 status word. Although there are many ways to implement conditional branching following a comparison, the basic approach is as follows:

• Execute the comparison.
• Store the status word. (80387 allows storing status directly into AX register.)
• Inspect the condition code bits.
• Jump on the result.

Figure 7-1 is a code fragment that illustrates how two memory-resident double-format real numbers might be compared (similar code could be used with the FTST instruction). The numbers are called A and B, and the comparison is A to B.

The comparison itself requires loading A onto the top of the 80387 register stack and then comparing it to B, while popping the stack with the same instruction. The status word is then written into the 80386 AX register.

A and B have four possible orderings, and bits C3, C2, and C0 of the condition code indicate which ordering holds. These bits are positioned in the upper byte of the NPX status word so as to correspond to the CPU's zero, parity, and carry flags (ZF, PF, and CF), when the byte is written into the flags.
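The correspondence between the condition code bits and the CPU flags can be checked with a short model. The Python sketch below is illustrative only and its function names are hypothetical. It assumes the standard status word layout (C0 = bit 8, C2 = bit 10, C3 = bit 14) and the SAHF mapping of AH bit 0 to CF, bit 2 to PF, and bit 6 to ZF. The tests are applied in the same order as the conditional jumps of Figure 7-1.

```python
C0 = 1 << 8     # lands in CF after SAHF
C2 = 1 << 10    # lands in PF after SAHF
C3 = 1 << 14    # lands in ZF after SAHF


def flags_after_sahf(status_word):
    """Return the CF, PF and ZF values that SAHF would load from AH
    when AX holds the NPX status word."""
    ah = (status_word >> 8) & 0xFF
    return {"CF": ah & 0x01 != 0, "PF": ah & 0x04 != 0, "ZF": ah & 0x40 != 0}


def compare_result(status_word):
    """Classify the ordering of A and B after a compare of A to B."""
    flags = flags_after_sahf(status_word)
    if flags["PF"]:          # JP: C2 set, operands are unordered
        return "unordered"
    if flags["CF"]:          # JB: C0 set
        return "A < B"
    if flags["ZF"]:          # JE: C3 set
        return "A = B"
    return "A > B"           # fall through: C3 = C0 = 0


for word in (0x0000, 0x0100, 0x4000, 0x4500):
    print(hex(word), compare_result(word))
```

Testing C2 first matters because an unordered result also sets C3 and C0, which would otherwise be mistaken for "equal" or "less".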
The code fragment sets ZF, PF, and CF of the CPU status word to the values of C3, C2, and C0 of the NPX status word, and then uses the CPU conditional jump instructions to test the flags. The resulting code is extremely compact, requiring only seven instructions.

    A       DQ      ?
    B       DQ      ?

            FLD     A               ; LOAD A ONTO TOP OF 387 STACK
            FCOMP   B               ; COMPARE A:B, POP A
            FSTSW   AX              ; STORE RESULT TO CPU AX REGISTER
    ;
    ; CPU AX REGISTER CONTAINS CONDITION CODES (RESULTS OF COMPARE)
    ; LOAD CONDITION CODES INTO CPU FLAGS
    ;
            SAHF
    ;
    ; USE CONDITIONAL JUMPS TO DETERMINE ORDERING OF A TO B
    ;
            JP      A_B_UNORDERED   ; TEST C2 (PF)
            JB      A_LESS          ; TEST C0 (CF)
            JE      A_EQUAL         ; TEST C3 (ZF)
    A_GREATER:                      ; C0 (CF) = 0, C3 (ZF) = 0

    A_EQUAL:                        ; C0 (CF) = 0, C3 (ZF) = 1

    A_LESS:                         ; C0 (CF) = 1, C3 (ZF) = 0

    A_B_UNORDERED:                  ; C2 (PF) = 1

Figure 7-1. Conditional Branching for Compares

The FXAM instruction updates all four condition code bits. Figure 7-2 shows how a jump table can be used to determine the characteristics of the value examined. The jump table (FXAM_TBL) is initialized to contain the 32-bit displacement of 16 labels, one for each possible condition code setting. Note that four of the table entries contain the same value, "EMPTY." The first two condition code settings correspond to "EMPTY." The two other table entries that contain "EMPTY" will never be used on the 80387, but may be used if the code is executed with an 80287.

The program fragment performs the FXAM and stores the status word. It then manipulates the condition code bits to finally produce a number in register EAX that equals the condition code times 4. This involves zeroing the unused bits in the word that contains the code, shifting C2-C0 to the right into place, positioning C3 so that it is adjacent to C2, and clearing the leftover copy of C3. The resulting value is used as an index that selects one of the displacements from FXAM_TBL (the multiplication of the condition code is required because of the 4-byte length of each DD value in FXAM_TBL).
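The index arithmetic of Figure 7-2 can be traced step by step in a model. The Python sketch below is illustrative only and its function names are hypothetical. It assumes the instruction sequence of the figure (AND with 0100011100000000B, SHR EAX,6, SAL AH,5, OR AL,AH, XOR AH,AH), table entries of 4 bytes each (DD), and the table order given in the figure. Under those assumptions the byte offset equals the 4-bit condition code times 4.

```python
FXAM_TBL = [
    "POS_UNNORM", "POS_NAN", "NEG_UNNORM", "NEG_NAN",
    "POS_NORM", "POS_INFINITY", "NEG_NORM", "NEG_INFINITY",
    "POS_ZERO", "EMPTY", "NEG_ZERO", "EMPTY",
    "POS_DENORM", "EMPTY", "NEG_DENORM", "EMPTY",
]


def fxam_table_offset(status_word):
    """Reproduce the register arithmetic that turns the status word
    into a byte offset into a table of 4-byte entries."""
    eax = status_word & 0b0100011100000000   # AND: keep C3 and C2-C0
    eax >>= 6                                # SHR EAX,6: C2-C0 in bits 4-2
    ah = (eax >> 8) & 0xFF                   # C3 is now bit 0 of AH
    al = eax & 0xFF
    ah = (ah << 5) & 0xFF                    # SAL AH,5: C3 in bit 5
    al |= ah                                 # OR AL,AH: C3 next to C2
    return al                                # XOR AH,AH: only AL remains


def fxam_class(status_word):
    """Look up the label selected by the condition code."""
    return FXAM_TBL[fxam_table_offset(status_word) // 4]


# C2 alone (code 0100) selects POS_NORM; C3 alone (code 1000) selects POS_ZERO.
print(fxam_table_offset(0x0400), fxam_class(0x0400))
print(fxam_table_offset(0x4000), fxam_class(0x4000))
```

Tracing the shifts this way makes it easy to confirm that the scale factor of the index matches the size of the table entries.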
The unconditional JMP instruction effectively vectors through the jump table to the labeled routine that contains code (not shown in the example) to process each possible result of the FXAM instruction.

    ;
    ; JUMP TABLE FOR EXAMINE ROUTINE
    ;
    FXAM_TBL    DD  POS_UNNORM, POS_NAN, NEG_UNNORM, NEG_NAN
                DD  POS_NORM, POS_INFINITY, NEG_NORM, NEG_INFINITY
                DD  POS_ZERO, EMPTY, NEG_ZERO, EMPTY
                DD  POS_DENORM, EMPTY, NEG_DENORM, EMPTY
    ;
    ; EXAMINE ST AND STORE RESULT (CONDITION CODES)
    ;
            FXAM
            XOR     EAX,EAX                 ; CLEAR EAX
            FSTSW   AX
    ;
    ; CALCULATE OFFSET INTO JUMP TABLE
    ;
            AND     AX,0100011100000000B    ; CLEAR ALL BITS EXCEPT C3, C2-C0
            SHR     EAX,6                   ; SHIFT C2-C0 INTO PLACE (000XXX00)
            SAL     AH,5                    ; POSITION C3 (00X00000)
            OR      AL,AH                   ; DROP C3 IN ADJACENT TO C2 (00XXXX00)
            XOR     AH,AH                   ; CLEAR OUT THE OLD COPY OF C3
    ;
    ; JUMP TO THE ROUTINE 'ADDRESSED' BY CONDITION CODE
    ;
            JMP     FXAM_TBL[EAX]
    ;
    ; HERE ARE THE JUMP TARGETS, ONE TO HANDLE EACH POSSIBLE RESULT OF FXAM
    ;
    POS_UNNORM:
    POS_NAN:
    NEG_UNNORM:
    NEG_NAN:
    POS_NORM:
    POS_INFINITY:
    NEG_NORM:
    NEG_INFINITY:
    POS_ZERO:
    EMPTY:
    NEG_ZERO:
    POS_DENORM:
    NEG_DENORM:

Figure 7-2. Conditional Branching for FXAM

7.2 EXCEPTION HANDLING EXAMPLES

There are many approaches to writing exception handlers. One useful technique is to consider the exception handler procedure as consisting of "prologue," "body," and "epilogue" sections of code. This procedure is invoked via interrupt number 16.

At the beginning of the prologue, CPU interrupts have been disabled. The prologue performs all functions that must be protected from possible interruption by higher-priority sources. Typically, this involves saving CPU registers and transferring diagnostic information from the 80387 to memory. When the critical processing has been completed, the prologue may enable CPU interrupts to allow higher-priority interrupt handlers to preempt the exception handler.
The body of the exception handler examines the diagnostic information and makes a response that is necessarily application-dependent. This response may range from halting execution, to displaying a message, to attempting to repair the problem and proceed with normal execution.

The epilogue essentially reverses the actions of the prologue, restoring the CPU and the NPX so that normal execution can be resumed. The epilogue must not load an unmasked exception flag into the 80387 or another exception will be requested immediately.

Figures 7-3 through 7-5 show the ASM386 coding of three skeleton exception handlers. They show how prologues and epilogues can be written for various situations, but provide comments indicating only where the application-dependent exception handling body should be placed.

SAVE_ALL        PROC
;
; SAVE CPU REGISTERS, ALLOCATE STACK SPACE FOR 80387 STATE IMAGE
;
        PUSH    EBP
        MOV     EBP,ESP
        SUB     ESP,108
;
; SAVE FULL 80387 STATE, ENABLE CPU INTERRUPTS
;
        FNSAVE  [EBP-108]
        STI
;
; APPLICATION-DEPENDENT EXCEPTION HANDLING CODE GOES HERE
;
; CLEAR EXCEPTION FLAGS IN STATUS WORD (WHICH IS IN MEMORY)
; RESTORE MODIFIED STATE IMAGE
;
        MOV     BYTE PTR [EBP-104], 0H
        FRSTOR  [EBP-108]
;
; DEALLOCATE STACK SPACE, RESTORE CPU REGISTERS
;
        MOV     ESP,EBP
        POP     EBP
;
; RETURN TO INTERRUPTED CALCULATION
;
        IRET
SAVE_ALL        ENDP

Figure 7-3. Full-State Exception Handler

SAVE_ENVIRONMENT PROC
;
; SAVE CPU REGISTERS, ALLOCATE STACK SPACE FOR 80387 ENVIRONMENT
;
        PUSH    EBP
        MOV     EBP,ESP
        SUB     ESP,28
;
; SAVE ENVIRONMENT, ENABLE CPU INTERRUPTS
;
        FNSTENV [EBP-28]
        STI
;
; APPLICATION EXCEPTION-HANDLING CODE GOES HERE
;
; CLEAR EXCEPTION FLAGS IN STATUS WORD (WHICH IS IN MEMORY)
; RESTORE MODIFIED ENVIRONMENT IMAGE
;
        MOV     BYTE PTR [EBP-24], 0H
        FLDENV  [EBP-28]
;
; DE-ALLOCATE STACK SPACE, RESTORE CPU REGISTERS
;
        MOV     ESP,EBP
        POP     EBP
;
; RETURN TO INTERRUPTED CALCULATION
;
        IRET
SAVE_ENVIRONMENT ENDP

Figure 7-4.
Reduced-Latency Exception Handler

Figures 7-3 and 7-4 are very similar; their only substantial difference is their choice of instructions to save and restore the 80387. The tradeoff here is between the increased diagnostic information provided by FNSAVE and the faster execution of FNSTENV. For applications that are sensitive to interrupt latency or that do not need to examine register contents, FNSTENV reduces the duration of the "critical region," during which the CPU does not recognize another interrupt request.

After the exception handler body, the epilogues prepare the CPU and the NPX to resume execution from the point of interruption (i.e., the instruction following the one that generated the unmasked exception). Notice that the exception flags in the memory image that is loaded into the 80387 are cleared to zero prior to reloading (in fact, in these examples, the entire low byte of the status word image is cleared).

The examples in Figures 7-3 and 7-4 assume that the exception handler itself will not cause an unmasked exception. Where this is a possibility, the general approach shown in Figure 7-5 can be employed. The basic technique is to save the full 80387 state and then to load a new control word in the prologue. Note that considerable care should be taken when designing an exception handler of this type to prevent the handler from being reentered endlessly.

LOCAL_CONTROL   DW      ?       ; ASSUME INITIALIZED

REENTRANT       PROC
;
; SAVE CPU REGISTERS, ALLOCATE STACK SPACE FOR 80387 STATE IMAGE
;
        PUSH    EBP
        MOV     EBP,ESP
        SUB     ESP,108
;
; SAVE STATE, LOAD NEW CONTROL WORD, ENABLE CPU INTERRUPTS
;
        FNSAVE  [EBP-108]
        FLDCW   LOCAL_CONTROL
        STI
;
; APPLICATION EXCEPTION HANDLING CODE GOES HERE.
; AN UNMASKED EXCEPTION GENERATED HERE WILL CAUSE THE EXCEPTION HANDLER
; TO BE REENTERED.
; IF LOCAL STORAGE IS NEEDED, IT MUST BE ALLOCATED ON THE CPU STACK.
;
; CLEAR EXCEPTION FLAGS IN STATUS WORD (WHICH IS IN MEMORY)
; RESTORE MODIFIED STATE IMAGE
;
        MOV     BYTE PTR [EBP-104], 0H
        FRSTOR  [EBP-108]
;
; DE-ALLOCATE STACK SPACE, RESTORE CPU REGISTERS
;
        MOV     ESP,EBP
        POP     EBP
;
; RETURN TO POINT OF INTERRUPTION
;
        IRET
REENTRANT       ENDP

Figure 7-5. Reentrant Exception Handler

7.3 FLOATING-POINT TO ASCII CONVERSION EXAMPLES

Numeric programs must typically format their results at some point for presentation and inspection by the program user. In many cases, numeric results are formatted as ASCII strings for printing or display. This example shows how floating-point values can be converted to decimal ASCII character strings. The function shown in Figure 7-6 can be invoked from PL/M-386, Pascal-386, FORTRAN-386, or ASM386 routines.

XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE FLOATING_TO_ASCII
OBJECT MODULE PLACED IN fpasc.obj
ASSEMBLER INVOKED BY: asm386 fpasc.asm

$title('Convert a floating point number to ASCII')

                public  floating_to_ascii
                extrn   get_power_10:near, tos_status:near
;
;    This subroutine will convert the floating point number in the
;  top of the NPX stack to an ASCII string and separate power of 10
;  scaling value (in binary). The maximum width of the ASCII string
;  formed is controlled by a parameter which must be > 1. Unnormal
;  values, denormal values, and pseudo zeroes will be correctly
;  converted. However, unnormals and pseudo zeros are no longer
;  supported formats on the 80387 (in conformance with the IEEE
;  floating point standard) and hence not generated internally.
;  A returned value will indicate how many binary bits of precision
;  were lost in an unnormal or denormal value. The magnitude (in terms
;  of binary power) of a pseudo zero will also be indicated.
;  Integers less than 10**18 in magnitude are accurately converted if
;  the destination ASCII string field is wide enough to hold all the
;  digits.
Otherwise the value is converted to scientific notation. 26 27 28 The status of the conversion is identified by the return value, it can be: 29 30 o 31 32 33 2 3 conversion complete, string_size is defined inval id argunents exact integer conversion, string size is defined indefinite - 34 4 • NAN (Not A Nl.Illber) 35 5 . NAN 36 6 + Infinity 37 7 8 - Infinity pseudo zero found, string_size is defined 38 1 39 40 41 42 43 44 45 46 47 48 49 50 The PLM/386 call ing convention is: floating to ascii: pr~cedure (nLllDer ,denormaLytr,string_ptr ,sizeytr, field size, powerJ)tr) word external; decla;e (denormal_ptr ,stringJ)tr ,powerytr ,size_ptr) pointer; declare field size word, string size based sizeJltr word; declsr; nunber real; declare denormal integer based denormalytr; Figure 7-6. Floating-Point to ASCII Conversion Routine 7-7 NUMERIC PROGRAMMING EXAMPLES LaC OBJ LINE SOURCE 51 52 53 54 55 declare power integer based powerytr; end floating_to_ascii; 57 58 The floating point value is expected to be on the top of the NPX stack. This subroutine expects 3 free entries on the NPX stack and wi II pop the passed value off when done. The generated ASCII string will have a leading 59 60 character either 1.1 or 1+1 indicating the sign of the vaLue. The ASCII decimal digits will 56 61 inmediately follow. The nllneric value of the 62 ASCII string is (ASCII STRING.)*10**POUER. If the given mll1ber was zero, the ASCI I string wi 11 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 contain a sign and a single zero chacter. The value string_size indicates the total length of the ASCI I string including the sign character. StringeD) will always hold the sign. It is possible for string_size to be Less than field_size. This occurs for zeroes or integer values. A pseudo zero will return a special return code. The denormal count wi 11 indicate the power of two originally assoclated with the value. 
The power of ten and ASCII string will be as if the value was an ordinary zero. Thh subroutine is accurate up to a maximum of 18 decimal digits for integers. Integer values will have a decimal power of zero associated with them. For non integers, the resul t wi Ll be accurate to within 2 decimaL digits of the 16th decimal placeCdouble precision). The exponentiate instruction is aLso used for scaling the value into the range acceptabl e for the BCD data type. The rounding mode in effect on entry to the subroutine is used for the conversion. The following registers are not transparent: 88 00000000 [] 00000004 [] 00000008 [] OOOOOOOC [] 00000010 [] 00000014 [] 00000018 [] 0000001C [] 89 90 91 92 93 94 95 96 97 98 99 100 101 eax ebx ecx edx esi edi eflags Define the stack Layout. ebp_save es_save returnytr power_ptr field_size sizeytr stringytr denormal_ptr equ equ equ equ equ equ equ equ dword ptr [ebp] ebp_save + size ebp_save es_save + size es_save return_ptr + size return_ptr powerytr + size power_ptr field_size + size field_size sizeytr + size size_ptr string_ptr + size string_ptr 102 0014 103 parms size 104 & 105 equ size powerJ'tr + 'size field_Size + size size_ptr + size stringJ'tr + size denormaL_ptr Figure 7-6. Floating-Point to ASCII Conversion Routine (Cont'd.) 7-8 NUMERIC PROGRAMMING EXAMPLES LOC OBJ 0012 0004 OOOA 0001 0004 0006 0003 0008 -0002 -0004 ·0006 -0008 0000 0002 "FF'FFC[] FFFFFFF2[] FFFFFFF2 [] FFFFFFF2[] oooe LINE 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 SOURCE Define constants used BCD_DIGITS \/oRO_SIZE BCD_SIZE MINUS NAN INFINITY INDEFINITE PSEUDO_ZERO INVALID ZERO DENORMAL UN NORMAL NORMAL EXACT equ equ equ equ equ equ equ equ equ equ equ equ equ equ 18 4 10 1 4 6 3 8 -2 ·4 ·6 -8 0 2 ; NLll'ber of digits in bcd_value Define return values The exact values chosen here are inportant. 
They must correspond to the possible return values and be in the same numeric order as tested by the program. Define layout of temporary storage area. power_two bed value fraction equ equ equ equ byte ptr bcd_value bed_value Local_size equ size power_two + size bcd_vaLue bcd=byte word ptr [ebp - WORD_SIZE] tbyte ptr power_two' BCD_SIZE Allocate stack space for the temporaries so the stack wi II be bi 9 enough stack stackseg (local_si ze+6) ; Allocate stack ; space for Locals +1 $eject Figure 7-6. Floating-Point to ASCII Conversion Routine (Cont'd.) 7-9 NUMERIC PROGRAMMING EXAMPLES LOC OBJ LINE 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 00000000 OAOO 00000002 00000003 00000004 00000005 00000006 00000007 00000008 00000009 OOOOOOOA OOOOOOOB OOOOOOOC 00000000 OODOGDOE OOOOOOOF 00000010 00000011 F8 04 F9 05 00 06 01 07 FC FE FD FE FA FE FB FE 00000012 00000012 E800000000 156 157 158 159 160 161 00000017 2EOFB68002000000 0000001F 3CFE 00000021 7527 162 163 164 165 166 167 00000023 C21400 168 00000026 169 170 171 172 00000026 ODDS 000001]28 EB02 0000002A 0000002A BOFE 0000002C 0000002C C9 173 174 175 176 177 178 179 180 181 182 SOURCE segment publ ic er extrn power ~ table:qword code Constants used by this fUnction. even dw const10 Optimize for 16 bits 10 Adjustment value for ; too big BCD Convert the C3,C2,C1,CO encoding from tos status into meaningful bH flags and values. status_table db UNNORMAL, NAN, UNNORMAL + MINUS, & NAN + MINUS, NORMAL, INFINITY, & NORMAL + MINUS, INFINITY + MINUS, & ZERO, INVALID, ZERO + MINUS, INVALID, & DENORMAL, INVALID, DENORMAL + MINUS, INVALID call tos status Look at status of SHO) Get descriptor from table movzx eax, status_table[eax] cmp aL,INVALID jne not_empty ST(O) is empty! ; Look for empty ST(O) Return the status vaLue. Remove infinity from stack and exit. found_inf1nl ty: fstp st(O) jmp short exit_proc OK to Leave fstp running String space is too small! Return inval id code. 
smal L string: mov exit_proc: leave al, INVALID ; Restore stack setup Figure 7-6. Floating-Point to ASCII Conversion Routine (Cont'd.) 7-10 NUMERIC PROGRAMMING EXAMPLES LaC OBJ LINE 00000020 07 0000002E C21400 183 184 185 186 187 188 00000031 00000031 DB?oF2 00000034 A801 00000036 9B 00000037 74F3 00000039 BBOOOOOOCO 189 190 191 192 193 194 195 196 197 198 SOURCE pop ret es parms_s;ze SHO) is NAN or indefinite. NAN or indefinite: - - fstp fraction test al,MINUS fwait jz exityroc 0000003E 2B5DF6 00000041 DB5DF2 00000044 75E6 00000046 B003 00000048 EBE2 0000004A 0000004A 06 00000048 C80COOOO 0000004F 8B4010 00000052 83F902 00000055 7CD3 00000057 49 00000058 83F912 0000005B 7605 00000050 B912DOOOOO 00000062 00000062 3C06 00000064 ?oCO 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 ; Remove value from stack for examination ; Look at sign bit Insure store is done ; Can't be indefinite if positive mov ebx,OCOOOOOOOH; Match against upper 32 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 Store the value in memory and look at the fract; on i field to separate indefinite from an ordinary NAN. ibits of fraction C_are bits 63-32 sub ebx, dword ptr fraction + 4 Bi ts 31-0 I1lJst be zero or jnz ebx, dword ptr fraction exityroc Set return value for indefinite value maval,INDEFINlTE jmp exit_proc Allocate stack space for local variables and establ ish parameter addressibi l ity. not_empty: push es enter local_size, Save working register Setup stack address; ng Check for enough string space mov ecx, fieLd_size c"" ecx,2 jl small_string dec ecx ; Adjust for sign character See if string is too large for BCD crq:> ecx,BCO_DIGITS jbe size_ok. Else set maximum string size mov ecx,BCD_OIGITS al, INFINITY Return status value for + or jge found_infinity i Look for infinity inf Figure 7-6. Floating-Point to ASCII Conversion Routine (Cont'd.) 
7-11 NUMERIC PROGRAMMING EXAMPLES LOC OBJ LINE 00000066 3C04 00000068 lOC7 0000006A 09El 0000006C 0000006E 00000071 00000074 00000077 0000007. 0000007C 0000007F 000000B2 00000084 3102 8B701C 668917 8B500C 668913 88C2 80E201 80C202 3CFC OF83BCOOOOOO 0000008A 00000080 0000008E 00000091 00000095 00000098 0000009. 0000009c OBlO ,2 9B 8.45F9 8040 F980 OB60F2 09F4 A880 7524 0000009E OOOOOOAO 000000A2 OOOOOOM 000000A7 000000A8 09E8 OEE9 09E4 9BO FEO 9E 7510 OOOOOOAA OOOOOOAC OOOOOOAF OOOOOOB 1 000000B3 000000B5 D9EC 80C206 OECA 09c9 OF1B E98COOOOOO OOOOOOBA OOOOOOBA 09F4 OOOOOOBC 09C9 OOOOOOBE 09EO OOOOOOCO 0 F1 F 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 SOURCE cmp at,NAN jge NAN_or _indefinite ; Look for NAN or INDEFINITE Set defaul t return values and check that the number is normallzed. tabs ; Use positive value only xor mov ; sign bit in at has true sign of vaLue edx,edx ; Form 0 constant 00; ,denormal_ptr; Zero denormal count mov me'll mov [edi], dx ebx,power ptf [ebx]. dx- Zero power of ten value mav dt, at and dl, 1 add d l, EXACT cmp at,ZERO jae convert_integer fstp fwait roov ; Test for zero Skip power code if value ; is zero fract i on al, bed byte + 7 byte pt'j: bcd_byte + 7, BOh fld fraction fxt,act test al, BOh jnz normal_value fld1 fsub ftst fstsw ax sahf jnz set_unnormat _count Found a pseudo zero fldtg2 add fmulp fxch fistp jmp ; Develop power of ten est imate dl, PSEUDO_ZERO • EXACT st(2). st Get power of ten word ptr [ebx] Set power of ten convert_integer set_unnormal _count: fxtract fxch fchs fistp word ptr [edi] Get original fraction. now normaL ized Get unnormal count Set unnormal count Calculate the decimaL magnitude associated with this nl.I1lber to within one order. This Figure 7-6. Floating-Point to ASCII Conversion Routine (Cont'd.) 
7-12 NUMERIC PROGRAMMING EXAMPLES LOC LINE OBJ 293 294 295 296 000000C2 000000C2 DB7DF2 ODOOOOC5 DF55 FC OOOOOOC8 D9EC OOOOOOCA DEC9 oooooocc DF1 B 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 SOURCE error wi II always be inevitable due to rounding and lost precision. As a result, we wi II del iberateLy fail to consider the LOG10 of the fractiOl'\' value in calculating the order. Since the fraction wiLL always LOG10 wi Lt not change the basic accuracy of the function. To get the decimal order of magnitude, simply ITlJLtiply the power of two by LOG10(2) and truncate the resuL t to an integer. be 1 <= F < 2. its normal_value: ; Save the fraction fieLd fstp fraction fist fldlg2 power_two for later ,use frrul fistp ; Save power of two ; Get LOG10(2) ; Power_two is now safe to use '; Form LOG10(of exponent of number) word ptr [ebx] ; Any rounding mode ; will work here 313 314 315 Check if the magnitude of the number rules out treating it as an integer. 316 317 ODOOOOCE 9B OOOOOOCF 668B33 00000002 29CE 318 319 320 321 322 323 324 325 326 00000004 771C 327 328 329 000000D6 OF45FC 00000009 80EAFE fild sub 334 000000E1 000000E3 ODODDOE5 000000E7 ODOOOOEA 337 335 336 338 339 340 341 342 OOOOOOEB 7559 OOOOOOEO 0008 ODOOOOEF 8DC2FE The number is between 1 and 10**(fieLd_size). Test if it is an integer. 330 331 333 343 344 345 346 347 ; YaH for power_ten to be val id fwait Get power of ten of value ITlOVSX s i, word pt r [ebx] ; Form seal ing factor sub esi, ecx ; necessa ry in ax adjust_result Jump if number will not fit ja 332 OOOOOODC DB60F2 ODOODODF D9FD DOD1 D9FC 08D9 9BO FED 9E CX has the maxinun number of decimal digits allowed. 
power_two Restore original number dl,NORMAL·EXACT Convert to exact return ; value fLd fraction ; Form full value, this fscale is safe here Copy vaLue for compare fst st(1) ; Test if its an integer frndint Compare vaLues fcomp Save status fstsw ax C3=1 impl ies it sahf an integer jnz convert _1 nteger fstp add st(O) dL,NORMAL-EXACT Remove non integer value Re'Store original return value Figure 7-6. Floating-Point to ASCII Conversion Routine (Cont'd.) 7-13 NUMERIC PROGRAMMING EXAMPLES Lac a9J LINE 348 349 350 351 352 SOURCE Scale the numl::>er to within the range aLLowed by the BCD format. The scal ;ng operation should produce a number within one decimal order of magnitude of the largest decimal nlJlTber representable within the given string width. 353 354 OOOOOOf2 000000F2 89C6 000000F4 668903 OOOOOOF? F708 000000F9 E800000000 OOOOOOFE 0960F2 00000101 OEC9 00000103 89Fl 355 356 357 358 359 360 361 362 363 364 365 366 The seal ing power of ten value is in S1. adjust_ result: mov mov eax,esi word ptr [ebx] ,ax neg eax ; Set initial power of ten return value Subtract one for each order of call getJ'Ower _10 magnitude the value ;s scaled by Seal ing factor is fld fmul fraction returned as exponent and fraction esi ,ecx 367 368 00000105 C1E603 00000108 OF45FC 00000109 OEC2 00000100 09FO 369 370 371 372 373 374 375 376 0000010F 0009 377 378 379 380 381 382 383 384 385 386 00000111 i Setup for powlO ; Get fraction ; Combine fractions Form power of ten of the maximum shl esi ,3 fild faddp fscale power two st(2),st fstp st(l) ; BCD value to fi t in the string ; Combine powers of two Form full value, exponent was safe ; Remove exponent Test the adjusted value against a table of exact powers of ten. The combined errors of the magnitude estimate and power function can resul t in a value one order of magnitude too small or too large to fit correctly in the BCD field. 
To handle this problem, pretest the adjusted value, if it is too small or large, then adjust 1t by ten and adjust the power of ten value. 387 388 389 390 00000111 00000118 0000011B 0000011c 2EOC9608000000 9BOFEO 9E 720F 0000011E 00000125 00000128 0000012B 2EOE3500000000 80E2FO 66Ff03 EB17 391 392 393 394 Compare against exact power entry. Use the next entry since cx has been decremented by one feam power_table (es;]+type power_table fstsw ax ; No wait is necessary sahf ; If C3 = CO = 0 then jb test_far_small too big 395 0000012D 00000120 2EOC9600000000 396 fidiv 397 and 398 399 inc jmp 400 401 402 const10 Else adjust value dl. not EXACT Remove exact flag word ptr [ebx] Adjust power of ten value short in_range Convert the value to a BCD ; integer test_for _smal t: feam power table[esiJ Test relative size Figure 7-6. Floating-Point to ASCII Conversion Routine (Cont'd.) 7-14 NUMERIC PROGRAMMING EXAMPLES OC OBJ LINE 0000134 980FEO SOURCE fstsw 403 ax No wait is necess ary :0000137 9E '0000138 720A 1f CO = 0 then steO) >= lower bound ; Convert the va 1ue sahf 404 405 406 jc in_range filTMJl const10 to a '000013A 2EDEOOOOOOOOOO :0000141 66FFOB -0000144 '0000144 09FC 407 408 409 410 411 412 413 414 415 10000146 10000146 OF75F2 )0000149 BE08000000 D000014E 66B9040F 00000152 BB01000000 00000157 887018 D000015A J000015C 0000015E 0000015F 00000161 00000164 8C08 8ECO FC B028 F6C201 7402 416 417 418 419 420 421 422 423 424 425 426 427 428 429 ; BCD integer dec in_range: ; Adjust value into range word ptr [ebxJ Adjust power of ten value frncHnt Assert: ; Form integer vaLue a <= IDS <= 999,999,999,999,999,999 The lOS number wi II be exactly representable in 18 digit BCD format. convert integer: -fbstp bcd_value ; Store as BCD format number White the store BCD runs, setup registers for the conversion to ASCI I. 
mov es;,BCO SIZE-2 mov cx,Of04h ebx,1 i Initial BCD index value 432 433 434 moy 435 jz Set shift count and mask Set initial size of ASCII ; field for sign edi ,string_ptr i Get address of start of ; ASCI I string ax.~ Co~ds toes es,ax ; Set autoincrement mode al,I+1 ; Clear sign fieLd dl,MINUS look for negative value posit ive_resul t mov al, 430 431 moy moy moy moy cld test 436 00000166 8020 00000168 00000168 AA 437 438 439 I. I POSt tive_resut t: stosb 440 00000169 80E2FE 0000016C 98 441 442 443 444 and fwait dl.not MINUS ; Bump string pointer past sign Turn off sign bit ; Wait for fbstp to finish Regi ster usage; 445 ah: 446 a1: dx: BCD byte value in use ASCI I character value Return vaLue 448 ch: BCD mask::: Oth 449 cl: 450 bx: 451 esi: di: ds,es: BCD shift count = 4 ASCII string field width BCD field index ASCII string field pointer ASCI I string segment base 447 452 453 454 455 Remove leading zeroes from the number. Figure 7-6. Floating-Point to ASCII Conversion Routine (Cont'd.) 7-15 NUMERIC PROGRAMMING EXAMPLES LOC OBJ 00000160 00000160 00000171 00000173 00000175 00000177 8A6435F2 88EO 02E8 240F 7517 LINE 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 00000179 88EO 00000178 240F 00000170 7519 0000017F 4E 00000180 79E8 00000182 00000184 00000185 00000186 00000188 00000188 0000018c 0000018E 00000190 00000190 00000192 00000193 00000195 00000197 00000198 00000198 0000019A 00000198 0000019C 0000019D 0000019F 0000019F 000001A2 000001A5 000001A? 
B030 AA 43 EB17 8A6435F2 88EO 02E8 0430 AA 88EO 240F 43 0430 AA 43 4E 79E9 887014 66891 F 8BC2 E980FEFFFF 000001AC 511 512 513 '/\'SSEMBl Y COMPL ErE I NO WARN I NGS I SOURCE sk ip_ teadi ng_zeroes: mav ah,bcd_byte[esl] ; Get BCD byte mov al,ah Copy value shr al ret Get high order digit and at,Ofh Set zero flag jnz enter_odd Exit loop if leading non zero found mov al,ah al,Ofh enter_even and jnz ; Get BCD byte again ; Get low order digit ; Exit loop if non zero digit found esi skip_leading_zeroes dec jns Decrement BCD index The significand was all zeroes. mav stasb at, '0' inc ebx short cxit_with_value jrnp Set initial zero ; Bump string length Now expand the BCD string into digH per byte values 0-9. mov mov ah,bcd byte[esi] Get BCD byte aL,ah - shr enter_odd: add stosh mav al,cl aL, Get high order digit 'a' Convert to ASCII Put digit into ASCII strl n9 area ; Get low order diglt al,ah al,Ofh and inc ; Bump fieLd size counter ebx enter_even: add stosb inc at, 'a' Convert to ASCI I Put di gft into ASCI I area Bump field size counter ; Go to next BCD byte ebx dec esi jns digit loop Conversion compLete. size and remainder. Set the string exit with_value: mav mov j~ edi ,size_ptr word ptr [ediJ,bx eax,edx ex; t_proc code Set return va 1ue ends end NO ERRORS. Figure 7-6. Floating-Point to ASCII Conversion Routine (Cont'd.) 7-16 NUMERIC PROGRAMMING EXAMPLES XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE GET POIIER 10 OBJECT MOOULE PLACED IN power10.obj ASSEMBLER INVOKED BY: asm386 power10.asm LOC OBJ LINE SOURCE +1 $title(Calculate the value of 10**ax) 3 This subroutine wi II calculate the 4 5 value of 10**eax. For values of o <= eax < 19, the resuLt wi 1t exact. All 80386 registers are transparent 6 7 8 and the value is returned on the TOS as two numbers, exponent in ST(l) and fraction in STeO). 
;  The exponent value can be larger than the largest
;  exponent of an extended real format number.
;  Three stack entries are used.
;
                name    get_power_10
                public  get_power_10,power_table

stack           stackseg

code            segment public er
;
;    Use exact values from 1.0 to 1e18.
;
                even                            ; Optimize 16 bit access
power_table     dq      1.0,1e1,1e2,1e3
                dq      1e4,1e5,1e6,1e7
                dq      1e8,1e9,1e10,1e11
                dq      1e12,1e13,1e14,1e15
                dq      1e16,1e17,1e18

get_power_10    proc
                cmp     eax,18                  ; Test for 0 <= ax < 19
                ja      out_of_range
                fld     power_table[eax*8]      ; Get exact value
                fxtract                         ; Separate power

Figure 7-6. Floating-Point to ASCII Conversion Routine (Cont'd.)

Shortness, speed, and accuracy were chosen rather than providing the maximum number of significant digits possible. An attempt is made to keep integers in their own domain to avoid unnecessary conversion errors.

Using the extended precision real number format, this routine achieves a worst case accuracy of three units in the 16th decimal position for a noninteger value or for integers greater than 10^18. This is double precision accuracy. With values having decimal exponents less than 100 in magnitude, the accuracy is one unit in the 17th decimal position.

Higher precision can be achieved with greater care in programming, larger program size, and lower performance.
7.3.1 Function Partitioning

Three separate modules implement the conversion. Most of the work of the conversion is done in the module FLOATING_TO_ASCII. The other modules are provided separately, because they have a more general use. One of them, GET_POWER_10, is also used by the ASCII to floating-point conversion routine. The other small module, TOS_STATUS, identifies what, if anything, is in the top of the numeric register stack.

7.3.2 Exception Considerations

Care is taken inside the function to avoid generating exceptions. Any possible numeric value is accepted. The only possible exception is insufficient space on the numeric register stack.

The value passed in the numeric stack is checked for existence, type (NaN or infinity), and status (denormal, zero, sign). The string size is tested for a minimum and maximum value. If the top of the register stack is empty, or the string size is too small, the function returns with an error code.

Overflow and underflow are avoided inside the function for very large or very small numbers.

7.3.3 Special Instructions

The functions demonstrate the operation of several numeric instructions, different data types, and precision control. Shown are instructions for automatic conversion to BCD, calculating the value of 10 raised to an integer value, establishing and maintaining concurrency, data synchronization, and use of directed rounding on the NPX.

Without the extended precision data type and built-in exponential function, the double precision accuracy of this function could not be attained with the size and speed of the shown example.

The function relies on the numeric BCD data type for conversion from binary floating-point to decimal. It is not difficult to unpack the BCD digits into separate ASCII decimal digits. The major work involves scaling the floating-point value to the comparatively limited range of BCD values.
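To make the unpacking step concrete, here is a minimal Python sketch. It is not the manual's routine: the function name is hypothetical, and it assumes the 10-byte packed BCD layout in which bytes 0 through 8 hold 18 digits (two per byte, least significant byte first, low-order digit in the low nibble) and bit 7 of byte 9 holds the sign.

```python
def unpack_bcd(packed):
    """Turn a 10-byte packed BCD image into a signed ASCII digit string."""
    if len(packed) != 10:
        raise ValueError("packed BCD image must be exactly 10 bytes")
    sign = "-" if packed[9] & 0x80 else "+"
    digits = []
    for byte in reversed(packed[:9]):                  # most significant byte first
        digits.append(chr(ord("0") + (byte >> 4)))     # high-order digit
        digits.append(chr(ord("0") + (byte & 0x0F)))   # low-order digit
    # Drop leading zeroes, but keep a single zero for a zero value.
    text = "".join(digits).lstrip("0") or "0"
    return sign + text
```

Like the assembly routine, the sketch always emits the sign as the first character and represents zero as a sign followed by a single zero character.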
To print a 9-digit result requires accurately scaling the given value to an integer between 10^8 and 10^9. For example, the number +0.123456789 requires a scaling factor of 10^9 to produce the value +123456789.0, which can be stored in 9 BCD digits. The scale factor must be an exact power of 10 to avoid changing any of the printed digit values.

These routines should convert exactly all values that are exactly representable in decimal in the field size given. Integer values that fit in the given string size are not scaled, but are stored directly in BCD form. Noninteger values exactly representable in decimal within the string size limits are also exactly converted. For example, 0.125 is exactly representable in binary or decimal. To convert this floating-point value to decimal, the scaling factor is 1000, resulting in 125. When scaling a value, the function must keep track of where the decimal point lies in the final decimal value.

7.3.4 Description of Operation

Converting a floating-point number to decimal ASCII takes three major steps: identifying the magnitude of the number, scaling it for the BCD data type, and converting the BCD data type to a decimal ASCII string.

Identifying the magnitude of the result requires finding the value X such that the number is represented by I × 10^X, where 1.0 <= I < 10.0. Scaling the number requires multiplying it by a scaling factor 10^S, so that the result is an integer requiring no more decimal digits than provided for in the ASCII string. Once scaled, the numeric rounding modes and BCD conversion put the number in a form easy to convert to decimal ASCII by host software.

Implementing each of these three steps requires attention to detail. To begin with, not all floating-point values have a numeric meaning. Values such as infinity, indefinite, or NaN may be encountered by the conversion routine. The conversion routine should recognize these values and identify them uniquely.
Special cases of numeric values also exist. Denormals have numeric values, but should be recognized because they indicate that precision was lost during some earlier calculations.

Once it has been determined that the number has a numeric value, and it is normalized (setting appropriate denormal flags, if necessary, to indicate this to the calling program), the value must be scaled to the BCD range.

7.3.5 Scaling the Value

To scale the number, its magnitude must be determined. It is sufficient to calculate the magnitude to an accuracy of 1 unit, or within a factor of 10 of the required value. After scaling the number, a check is made to see if the result falls in the range expected. If not, the result can be adjusted one decimal order of magnitude up or down. The adjustment test after the scaling is necessary due to inevitable inaccuracies in the scaling value.

Because the magnitude estimate for the scale factor need only be close, a fast technique is used. The magnitude is estimated by multiplying the power of 2, the unbiased floating-point exponent, associated with the number by log10(2). Rounding the result to an integer produces an estimate of sufficient accuracy. Ignoring the fraction value can introduce a maximum error of 0.32 in the result.

Using the magnitude of the value and size of the number string, the scaling factor can be calculated. Calculating the scaling factor is the most inaccurate operation of the conversion process. The relation 10^X = 2^(X × log2(10)) is used for this function. The exponentiate instruction F2XM1 is used.

Due to restrictions on the range of values allowed by the F2XM1 instruction, the power of 2 value is split into integer and fraction components. The relation 2^(I + F) = 2^I × 2^F allows using the FSCALE instruction to recombine the 2^F value, calculated through F2XM1, and the 2^I part.
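The two calculations just described can be sketched in Python. This is only an illustration in double precision rather than the NPX extended format; the function names are hypothetical, `math.expm1` stands in for F2XM1 (which yields 2^F - 1), and `math.ldexp` stands in for FSCALE.

```python
import math

LOG10_2 = math.log10(2.0)

def estimate_decimal_magnitude(value):
    """Estimate X in value = I * 10**X using only the binary exponent.

    The argument is assumed to be nonzero and finite.
    """
    _, exponent = math.frexp(abs(value))   # abs(value) = m * 2**exponent, 0.5 <= m < 1
    power_of_two = exponent - 1            # exponent for a fraction in 1 <= F < 2
    return round(power_of_two * LOG10_2)   # the fraction itself is ignored

def power_of_ten(x):
    """Compute 10**x as 2**(I + F) = 2**I * 2**F, with I an integer."""
    t = x * math.log2(10.0)
    i = math.floor(t)                                   # integer part I
    f = t - i                                           # fraction part, 0 <= F < 1
    two_to_f = 1.0 + math.expm1(f * math.log(2.0))      # 2**F
    return math.ldexp(two_to_f, i)                      # recombine with 2**I
```

Because the fraction is ignored and the product is rounded, the estimate can differ from the true decimal exponent by one, which is why the routine tests the scaled result afterwards.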
7.3.5.1 INACCURACY IN SCALING

The inaccuracy in calculating the scale factor arises because of the trailing zeros placed into the fraction value of the power of two when stripping off the integer valued bits. For each integer valued bit in the power of 2 value separated from the fraction bits, one bit of precision is lost in the fraction field due to the zero fill occurring in the least significant bits.

Up to 14 bits may be lost in the fraction because the largest allowed floating point exponent value is 2^14 - 1. These bits directly reduce the accuracy of the calculated scale factor, thereby reducing the accuracy of the scaled value. For numbers in the range of 10^±30, a maximum of 8 bits of precision are lost in the scaling process.

7.3.5.2 AVOIDING UNDERFLOW AND OVERFLOW

The fraction and exponent fields of the number are separated to avoid underflow and overflow in calculating the scaling values. For example, to scale 10^-4932 to 10^8 requires a scaling factor of 10^4940, which cannot be represented by the NPX.

By separating the exponent and fraction, the scaling operation involves adding the exponents separately from multiplying the fractions. The exponent arithmetic involves small integers, all easily represented by the NPX.

7.3.5.3 FINAL ADJUSTMENTS

It is possible that the power function (Get_Power_10) could produce a scaling value such that it forms a scaled result larger than the ASCII field could allow. For example, scaling 9.9999999999999999 × 10^4900 by 1.00000000000000010 × 10^-4883 produces 1.00000000000000009 × 10^18. The scale factor is within the accuracy of the NPX and the result is within the conversion accuracy, but it cannot be represented in BCD format. This is why there is a post-scaling test on the magnitude of the result. The result can be multiplied or divided by 10, depending on whether the result was too small or too large, respectively.
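The post-scaling test can be illustrated with exact rational arithmetic. The following Python sketch is a simplified model, not the routine of Figure 7-6: the function name and parameters are hypothetical, the value is assumed positive, the magnitude estimate is assumed to be off by at most one, and the final guard against a rounding carry is an addition of this sketch.

```python
from fractions import Fraction

def scale_to_field(value, digits, magnitude_estimate):
    """Scale a positive value to an integer of exactly `digits` digits.

    Returns (integer, power) such that value is close to integer * 10**power.
    """
    value = Fraction(value)
    power = magnitude_estimate - (digits - 1)   # decimal exponent of the last digit kept
    scaled = value / Fraction(10) ** power
    upper = 10 ** digits                        # first value too big for the field
    lower = 10 ** (digits - 1)                  # smallest value that fills the field
    if scaled >= upper:                         # estimate was one too low
        scaled /= 10
        power += 1
    elif scaled < lower:                        # estimate was one too high
        scaled *= 10
        power -= 1
    integer = round(scaled)
    if integer == upper:                        # rounding carried into an extra digit
        integer //= 10
        power += 1
    return integer, power
```

With a 3-digit field, the value 0.125 yields (125, -3) whether the estimate passed in is -1, 0, or -2, which is the effect the adjustment is meant to achieve.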
7.3.6 Output Format

For maximum flexibility in output formats, the position of the decimal point is indicated by a binary integer called the power value. If the power value is zero, the decimal point is assumed to be at the right of the rightmost digit. Power values greater than zero indicate how many trailing zeros are not shown. For each unit below zero, move the decimal point one place to the left in the string.

The last step of the conversion is storing the result in BCD and indicating where the decimal point lies. The BCD string is then unpacked into ASCII decimal characters. The ASCII sign is set corresponding to the sign of the original value.

7.4 TRIGONOMETRIC CALCULATION EXAMPLES (NOT TESTED)

In this example, the kinematics of a robot arm is modeled with the 4 × 4 homogeneous transformation matrices proposed by Denavit and Hartenberg [1, 2]. The translational and rotational relationships between adjacent links are described with these matrices using the D-H matrix method. For each link, there is a 4 × 4 homogeneous transformation matrix that represents the link's coordinate system ($L_i$) at the joint ($J_i$) with respect to the previous link's coordinate system ($J_{i-1}$, $L_{i-1}$). The following four geometric quantities completely describe the motion of any rigid joint/link pair ($J_i$, $L_i$), as Figure 7-7 illustrates.

$\theta_i$    The angular displacement of the $x_i$ axis from the $x_{i-1}$ axis by rotating around the $z_{i-1}$ axis (anticlockwise).

$d_i$    The distance from the origin of the $(i-1)$th coordinate system along the $z_{i-1}$ axis to the $x_i$ axis.

$a_i$    The distance of the origin of the $i$th coordinate system from the $z_{i-1}$ axis along the $-x_i$ axis.

$\alpha_i$    The angular displacement of the $z_i$ axis from the $z_{i-1}$ axis about the $x_i$ axis (anticlockwise).

1. J. Denavit and R.S. Hartenberg, "A Kinematic Notation for Lower-Pair Mechanisms Based on Matrices," J. Applied Mechanics, June 1955, pp. 215-221.
2. C.S. George Lee, "Robot Arm Kinematics, Dynamics, and Control," IEEE Computer, Dec. 1982.
Figure 7-7. Relationships between Adjacent Joints

The D-H transformation matrix $A_{i-1}^{i}$ for adjacent coordinate frames (from joint$_{i-1}$ to joint$_i$) is calculated as follows:

$$A_{i-1}^{i} = T_{z,d} \times T_{z,\theta} \times T_{x,a} \times T_{x,\alpha}$$

where

$T_{z,d}$ represents a translation along the $z_{i-1}$ axis
$T_{z,\theta}$ represents a rotation of angle $\theta$ about the $z_{i-1}$ axis
$T_{x,a}$ represents a translation along the $x_i$ axis
$T_{x,\alpha}$ represents a rotation of angle $\alpha$ about the $x_i$ axis

$$
A_{i-1}^{i} =
\begin{bmatrix}
\cos\theta_i & -\cos\alpha_i \sin\theta_i & \sin\alpha_i \sin\theta_i & a_i \cos\theta_i \\
\sin\theta_i & \cos\alpha_i \cos\theta_i & -\sin\alpha_i \cos\theta_i & a_i \sin\theta_i \\
0 & \sin\alpha_i & \cos\alpha_i & d_i \\
0 & 0 & 0 & 1
\end{bmatrix}
$$

The composite homogeneous matrix T, which represents the position and orientation of the joint/link pair with respect to the base system, is obtained by successively multiplying the D-H transformation matrices for adjacent coordinate frames.

The example in Figure 7-8 illustrates how the transformation process can be accomplished using the 80387. The program consists of two major procedures. The first procedure, TRANS_PROC, is used to calculate the elements in each D-H matrix, $A_{i-1}^{i}$. The second procedure, MATRIXMUL_PROC, finds the product of two successive D-H matrices.

XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE TOS_STATUS
OBJECT MODULE PLACED IN tos.obj
ASSEMBLER INVOKED BY: asm386 tos.asm

$title(Determine TOS register contents)
;
;       This subroutine will return a value from 0-15 in eax
;       corresponding to the contents of NPX TOS. All registers are
;       transparent and no errors are possible. The return value
;       corresponds to c3,c2,c1,c0 of FXAM instruction.
;
                name    tos_status
                public  tos_status
stack           stackseg
code            segment public er
tos_status      proc
                fxam                    ; Get status of TOS register
                fstsw   ax              ; Get current status
                mov     al,ah           ; Put bit 10-8 into bits 2-0
                and     eax,4007h       ; Mask out bits c3,c2,c1,c0
                shr     ah,3            ; Put bit c3 into bit 11
                or      al,ah           ; Put c3 into bit 3
                mov     ah,0            ; Clear return value
                ret
tos_status      endp
code            ends
                end

ASSEMBLY COMPLETE, NO WARNINGS, NO ERRORS.

Figure 7-8. Robot Arm Kinematics Example

                                        ; and fraction
                                        ; OK to leave fxtract running
                ret
;
;       Calculate the value using the exponentiate instruction.
;       The following relations are used:
;               10**x = 2**(log2(10)*x)
;               2**(I+F) = 2**I * 2**F
;       if st(1) = I and st(0) = 2**F then fscale produces 2**(I+F)
;
                fldl2t                          ; TOS = LOG2(10)
                enter   4,0
                mov     [ebp-4],eax             ; save power of 10 value, P
                fimul   dword ptr [ebp-4]       ; TOS,X = LOG2(10)*P = LOG2(10**P)

ALPHA_DEG       alp_deg<>
THETA_DEG       tht_deg<>
A_VECTOR        A_array<>
D_VECTOR        D_array<>
ZERO            dd      0
d180            dd      180
NUM_JOINT       equ     1
NUM_ROW         equ     4
NUM_COL         equ     4
REVERSE         db      1h
trans_data      ends

                assume  ds:trans_data, es:trans_data

; trans_code contains the procedures for calculating matrix elements
; and matrix multiplications

trans_code      segment er public

; create mnemonics for fsincos which is not
; yet available from ASM386 as of now

                codemacro fsincos
                dw      0fbd9h
                endm

trans_proc      proc far

; Calculate alpha and theta in radians from their values in degrees

                fldpi
                fdiv    d180
                fld     st                              ; Duplicate pi/180
                fmul    qword ptr ALPHA_DEG[ecx*8]
                fxch    st(1)
                fmul    qword ptr THETA_DEG[ecx*8]

; theta(radians) in ST and alpha(radians) in ST(1)

; Calculate matrix elements
;       a11 = cos theta
;       a12 = - cos alpha * sin theta
;       a13 = sin alpha * sin theta
;       a14 = A * cos theta
;       a21 = sin theta
;       a22 = cos alpha * cos theta
;       a23 = -sin alpha * cos theta
;       a24 = A * sin theta
;       a32 = sin alpha
;       a33 = cos alpha
;       a34 = D
;       a31 = a41 = a42 = a43 = 0.0
;       a44 = 1
; ebx contains the offset for the matrix

                fsincos                                 ; cos theta in ST
                                                        ; sin theta in ST(1)
                fld     st                              ; duplicate cos theta
                fst     [ebx].a11                       ; cos theta in a11
                fmul    qword ptr A_VECTOR[ecx*8]
                fstp    [ebx].a14                       ; A * cos theta in a14
                fxch    st(1)                           ; sin theta in ST
                fst     [ebx].a21                       ; sin theta in a21
                fld     st                              ; duplicate sin theta
                fmul    qword ptr A_VECTOR[ecx*8]
                fstp    [ebx].a24                       ; A * sin theta in a24
                fld     st(2)                           ; alpha in ST
                fsincos                                 ; cos alpha in ST
                                                        ; sin alpha in ST(1)
                                                        ; sin theta in ST(2)
                                                        ; cos theta in ST(3)
                fst     [ebx].a33                       ; cos alpha in a33
                fxch    st(1)                           ; sin alpha in ST
                fst     [ebx].a32                       ; sin alpha in a32
                fld     st(2)                           ; sin theta in ST
                                                        ; sin alpha in ST(1)
                fmul    st,st(1)                        ; sin alpha * sin theta
                fstp    [ebx].a13                       ; stored in a13
                fmul    st,st(3)                        ; cos theta * sin alpha
                fchs                                    ; -cos theta * sin alpha
                fstp    [ebx].a23                       ; stored in a23
                fld     st(2)                           ; cos theta in ST
                                                        ; cos alpha in ST(1)
                                                        ; sin theta in ST(2)
                                                        ; cos theta in ST(3)
                fmul    st,st(1)                        ; cos theta * cos alpha
                fstp    [ebx].a22                       ; stored in a22
                fmul    st,st(1)                        ; cos alpha * sin theta

; To take advantage of parallel operations between the CPU and NPX

                push    eax                             ; save eax

; also move D into a34 in a faster way

                mov     eax, dword ptr D_VECTOR[ecx*8]
                mov     dword ptr [ebx + 88], eax
                mov     eax, dword ptr D_VECTOR[ecx*8 + 4]
                mov     dword ptr [ebx + 92], eax
                pop     eax                             ; restore eax
                fchs                                    ; -cos alpha * sin theta
                fstp    [ebx].a12                       ; stored in a12
                                                        ; and all nonzero elements
                                                        ; have been calculated
                ret
trans_proc      endp

matrix_elem     proc far

; This procedure calculates the dot product of the ith row of the
; first matrix and the jth column of the second matrix:
;       Tij, where Tij = sum of Aik x Bkj over k
;
; parameters passed from the calling routine, matrix_row:
;       ESI = (i-1)*8
;       EDI = (j-1)*8
; local register, EBP = (k-1)*8

                push    ebp                     ; save ebp
                push    ecx                     ; ecx to be used as a tmp reg
                mov     ecx, esi                ; save it for later indexing

; locating the element in the first matrix, A

                imul    ecx, NUM_COL            ; ecx contains offset due to
                                                ; preceding rows; the offset is
                                                ; from the beginning of the matrix
                xor     ebp, ebp                ; clear ebp, which will be used as
                                                ; a temp reg to index (k) across
                                                ; the ith row of the first matrix
                                                ; as well as down the jth column
                                                ; of the second matrix

; clear Tij for accumulating Aik*Bkj

                mov     dword ptr [ecx][edi], ebp
                mov     dword ptr [ecx][edi+4], ebp
                push    ecx                     ; save on stack: esi * num_col =
                                                ; the offset of the beginning of
                                                ; the ith row from the beginning
                                                ; of the A matrix
NXT_k:
                add     ecx, ebp                ; get to the kth column entry of
                                                ; the ith row of the A matrix

; load Aik into 80387

                fld     qword ptr [eax][ecx]

; locating Bkj

                mov     ecx, ebp
                imul    ecx, NUM_ROW            ; ecx contains the offset of the
                                                ; beginning of the kth row from
                                                ; the beginning of the B matrix
                add     ecx, edi                ; get to the jth column entry
                                                ; of the kth row of the B matrix
                fmul    qword ptr [ebx][ecx]    ; Aik * Bkj
                pop     ecx                     ; esi * num_col in ecx again
                push    ecx                     ; also at top of program stack

; add to the result in the output matrix, Tij

                add     ecx, edi

; accumulating the sum of Aik * Bkj

                fadd    qword ptr [edx][ecx]
                fstp    qword ptr [edx][ecx]

; increment k by 1, i.e., ebp by 8

                add     ebp, 8

; Has k reached the width of the matrix yet?

                cmp     ebp, NUM_COL*8
                jl      NXT_k

; Restore registers

                pop     ecx                     ; clear esi*num_col from stack
                pop     ecx                     ; restore ecx
                pop     ebp                     ; restore ebp
                ret
matrix_elem     endp

matrix_row      proc far
                xor     edi, edi

; scan across a row

NXT_COL:
                call    matrix_elem
                add     edi, 8
                cmp     edi, NUM_COL*8
                jl      NXT_COL
                ret
matrix_row      endp

matrixmul_proc  proc far

; This procedure does the matrix multiplication by calling matrix_row
; to calculate entries in each row.
; The matrix multiplication is performed in the following manner,
;       Tij = Aik x Bkj
; where i and j denote the row and column respectively and k is the
; index for scanning across the ith row of the first matrix and the
; jth column of the second matrix.
                pop     edx                     ; offset Tmx in edx
                pop     ebx                     ; offset Bmx in ebx
                pop     eax                     ; offset Amx in eax

; setup esi and edi
; edi points to the column
; esi points to the row

                xor     esi, esi                ; clear esi
NXT_ROW:
                call    matrix_row
                add     esi, 8
                cmp     esi, NUM_ROW*8
                jl      NXT_ROW
                ret
matrixmul_proc  endp

trans_code      ends

;***************************************
;               Main program
;***************************************

main_code       segment er
START:
                mov     esp, stackstart trans_stack

; save all registers

                pushad

; ECX denotes the number of joints,
; where no of matrices = NUM_JOINT + 1
; Find the first matrix (from the base of the system to the first
; joint) and call it Bmx

                xor     ecx, ecx                ; 1st matrix
                mov     ebx, offset Bmx         ; is Bmx
                call    trans_proc
                inc     ecx

; From the 2nd matrix and on, it will be stored in Amx.
; The result from the first matrix mult. is stored in Tmx but will be
; accessed as Bmx in the next multiplication.
; As a matter of fact, the roles of Bmx and Tmx alternate in successive
; multiplications. This is achieved by reversing the order of the Bmx
; and Tmx pointers being passed onto the program stack. Thus, this is
; invisible to the matrix multiplication procedure.
; REVERSE serves as the indicator:
; REVERSE = 0 means that the result is to be placed in Tmx.

NXT_MATRIX:
                mov     ebx, offset Amx         ; find Amx
                call    trans_proc
                inc     ecx
                xor     REVERSE, 1h
                jnz     Bmx_as_Tmx

; no reversing. Bmx as the second input matrix while Tmx as the
; output matrix.

                push    offset Amx
                push    offset Bmx
                push    offset Tmx
                jmp     CONTINUE

; reversing. Tmx as the second input matrix while Bmx as the
; output matrix.

Bmx_as_Tmx:
                push    offset Amx
                push    offset Tmx              ; reversing the
                push    offset Bmx              ; pointers passed
CONTINUE:
                call    matrixmul_proc
                cmp     ecx, NUM_JOINT
                jle     NXT_MATRIX

; if REVERSE = 1 then the final answer will be in Bmx;
; otherwise, in Tmx.

                popad
main_code       ends
                end     START, ds:trans_data, ss:trans_stack

ASSEMBLY COMPLETE, NO WARNINGS, NO ERRORS.
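For readers who want to check the arithmetic of Figure 7-8 without an assembler, the same computation can be sketched in Python. This is a minimal sketch, not a translation of the listing: the names `dh_matrix`, `matmul4`, and `composite` are hypothetical, the matrix layout follows the D-H matrix given above, and the composite is formed in the conventional base-first order, which is an assumption.

```python
import math

def dh_matrix(theta_deg, alpha_deg, a, d):
    """Return the 4 x 4 D-H matrix for one joint/link pair as nested lists."""
    t = math.radians(theta_deg)       # degrees to radians, as trans_proc does
    al = math.radians(alpha_deg)
    ct, st = math.cos(t), math.sin(t)
    ca, sa = math.cos(al), math.sin(al)
    return [
        [ct, -ca * st,  sa * st, a * ct],
        [st,  ca * ct, -sa * ct, a * st],
        [0.0, sa,       ca,      d],
        [0.0, 0.0,      0.0,     1.0],
    ]

def matmul4(A, B):
    """T[i][j] = sum over k of A[i][k] * B[k][j]."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def composite(joints):
    """Multiply the D-H matrices of successive joints, base first.

    Each entry of joints is a (theta_deg, alpha_deg, a, d) tuple."""
    T = dh_matrix(*joints[0])
    for params in joints[1:]:
        T = matmul4(T, dh_matrix(*params))
    return T
```

As a usage example, `composite([(90.0, 0.0, 1.0, 0.0), (0.0, 0.0, 1.0, 0.0)])` describes two unit-length planar links with the first joint turned 90 degrees; the last column of the result places the end of the second link at approximately x = 0, y = 2.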
APPENDIX A
MACHINE INSTRUCTION ENCODING AND DECODING

1st Byte           2nd Byte       Bytes 3-7     ASM386 Instruction Format
Hex  Binary

D8   1101 1000     MOD 000 R/M    SIB, displ    FADD      single-real
D8   1101 1000     MOD 001 R/M    SIB, displ    FMUL      single-real
D8   1101 1000     MOD 010 R/M    SIB, displ    FCOM      single-real
D8   1101 1000     MOD 011 R/M    SIB, displ    FCOMP     single-real
D8   1101 1000     MOD 100 R/M    SIB, displ    FSUB      single-real
D8   1101 1000     MOD 101 R/M    SIB, displ    FSUBR     single-real
D8   1101 1000     MOD 110 R/M    SIB, displ    FDIV      single-real
D8   1101 1000     MOD 111 R/M    SIB, displ    FDIVR     single-real
D8   1101 1000     11000 REG                    FADD      ST,ST(i)
D8   1101 1000     11001 REG                    FMUL      ST,ST(i)
D8   1101 1000     11010 REG                    FCOM      ST(i)
D8   1101 1000     11011 REG                    FCOMP     ST(i)
D8   1101 1000     11100 REG                    FSUB      ST,ST(i)
D8   1101 1000     11101 REG                    FSUBR     ST,ST(i)
D8   1101 1000     11110 REG                    FDIV      ST,ST(i)
D8   1101 1000     11111 REG                    FDIVR     ST,ST(i)
D9   1101 1001     MOD 000 R/M    SIB, displ    FLD       single-real
D9   1101 1001     MOD 001 R/M                  reserved
D9   1101 1001     MOD 010 R/M    SIB, displ    FST       single-real
D9   1101 1001     MOD 011 R/M    SIB, displ    FSTP      single-real
D9   1101 1001     MOD 100 R/M    SIB, displ    FLDENV    14 or 28 bytes***
D9   1101 1001     MOD 101 R/M    SIB, displ    FLDCW     2 bytes
D9   1101 1001     MOD 110 R/M    SIB, displ    FSTENV    14 or 28 bytes***
D9   1101 1001     MOD 111 R/M    SIB, displ    FSTCW     2 bytes
D9   1101 1001     11000 REG                    FLD       ST(i)
D9   1101 1001     11001 REG                    FXCH      ST(i)
D9   1101 1001     1101 0000                    FNOP
D9   1101 1001     1101 0001                    reserved
D9   1101 1001     1101 001-                    reserved
D9   1101 1001     1101 01--                    reserved
D9   1101 1001     11011 REG                    reserved
D9   1101 1001     1110 0000                    FCHS
D9   1101 1001     1110 0001                    FABS
D9   1101 1001     1110 001-                    reserved
D9   1101 1001     1110 0100                    FTST
D9   1101 1001     1110 0101                    FXAM
D9   1101 1001     1110 011-                    reserved
D9   1101 1001     1110 1000                    FLD1
D9   1101 1001     1110 1001                    FLDL2T
D9   1101 1001     1110 1010                    FLDL2E
D9   1101 1001     1110 1011                    FLDPI
D9   1101 1001     1110 1100                    FLDLG2
D9   1101 1001     1110 1101                    FLDLN2
D9   1101 1001     1110 1110                    FLDZ
D9   1101 1001     1110 1111                    reserved
D9   1101 1001     1111 0000                    F2XM1
D9   1101 1001     1111 0001                    FYL2X
D9   1101 1001     1111 0010                    FPTAN
D9   1101 1001     1111 0011                    FPATAN
D9   1101 1001     1111 0100                    FXTRACT
D9   1101 1001     1111 0101                    FPREM1
D9   1101 1001     1111 0110                    FDECSTP
D9   1101 1001     1111 0111                    FINCSTP
D9   1101 1001     1111 1000                    FPREM
D9   1101 1001     1111 1001                    FYL2XP1
D9   1101 1001     1111 1010                    FSQRT
D9   1101 1001     1111 1011                    FSINCOS
D9   1101 1001     1111 1100                    FRNDINT
D9   1101 1001     1111 1101                    FSCALE
D9   1101 1001     1111 1110                    FSIN
D9   1101 1001     1111 1111                    FCOS
DA   1101 1010     MOD 000 R/M    SIB, displ    FIADD     short-integer
DA   1101 1010     MOD 001 R/M    SIB, displ    FIMUL     short-integer
DA   1101 1010     MOD 010 R/M    SIB, displ    FICOM     short-integer
DA   1101 1010     MOD 011 R/M    SIB, displ    FICOMP    short-integer
DA   1101 1010     MOD 100 R/M    SIB, displ    FISUB     short-integer
DA   1101 1010     MOD 101 R/M    SIB, displ    FISUBR    short-integer
DA   1101 1010     MOD 110 R/M    SIB, displ    FIDIV     short-integer
DA   1101 1010     MOD 111 R/M    SIB, displ    FIDIVR    short-integer
DA   1101 1010     110- ----                    reserved
DA   1101 1010     1110 0---                    reserved
DA   1101 1010     1110 1000                    reserved
DA   1101 1010     1110 1001                    FUCOMPP
DA   1101 1010     1110 101-                    reserved
DA   1101 1010     1110 11--                    reserved
DA   1101 1010     1111 ----                    reserved
DB   1101 1011     MOD 000 R/M    SIB, displ    FILD      short-integer
DB   1101 1011     MOD 001 R/M                  reserved
DB   1101 1011     MOD 010 R/M    SIB, displ    FIST      short-integer
DB   1101 1011     MOD 011 R/M    SIB, displ    FISTP     short-integer
DB   1101 1011     MOD 100 R/M                  reserved
DB   1101 1011     MOD 101 R/M    SIB, displ    FLD       extended-real
DB   1101 1011     MOD 110 R/M                  reserved
DB   1101 1011     MOD 111 R/M    SIB, displ    FSTP      extended-real
DB   1101 1011     110- ----                    reserved
DB   1101 1011     1110 0000                    **(1)
DB   1101 1011     1110 0001                    **(2)
DB   1101 1011     1110 0010                    FCLEX
DB   1101 1011     1110 0011                    FINIT
DB   1101 1011     1110 0100                    **(3)
DB   1101 1011     1110 0101                    reserved
DB   1101 1011     1110 011-                    reserved
DB   1101 1011     1110 1---                    reserved
DB   1101 1011     1111 ----                    reserved
DC   1101 1100     MOD 000 R/M    SIB, displ    FADD      double-real
DC   1101 1100     MOD 001 R/M    SIB, displ    FMUL      double-real
DC   1101 1100     MOD 010 R/M    SIB, displ    FCOM      double-real
DC   1101 1100     MOD 011 R/M    SIB, displ    FCOMP     double-real
DC   1101 1100     MOD 100 R/M    SIB, displ    FSUB      double-real
DC   1101 1100     MOD 101 R/M    SIB, displ    FSUBR     double-real
DC   1101 1100     MOD 110 R/M    SIB, displ    FDIV      double-real
DC   1101 1100     MOD 111 R/M    SIB, displ    FDIVR     double-real
DC   1101 1100     11000 REG                    FADD      ST(i),ST
DC   1101 1100     11001 REG                    FMUL      ST(i),ST
DC   1101 1100     11010 REG                    reserved
DC   1101 1100     11011 REG                    reserved
DC   1101 1100     11100 REG                    FSUBR     ST(i),ST
DC   1101 1100     11101 REG                    FSUB      ST(i),ST
DC   1101 1100     11110 REG                    FDIVR     ST(i),ST
DC   1101 1100     11111 REG                    FDIV      ST(i),ST
DD   1101 1101     MOD 000 R/M    SIB, displ    FLD       double-real
DD   1101 1101     MOD 001 R/M                  reserved
DD   1101 1101     MOD 010 R/M    SIB, displ    FST       double-real
DD   1101 1101     MOD 011 R/M    SIB, displ    FSTP      double-real
DD   1101 1101     MOD 100 R/M    SIB, displ    FRSTOR    94 or 108 bytes***
DD   1101 1101     MOD 101 R/M                  reserved
DD   1101 1101     MOD 110 R/M    SIB, displ    FSAVE     94 or 108 bytes***
DD   1101 1101     MOD 111 R/M    SIB, displ    FSTSW     2 bytes
DD   1101 1101     11000 REG                    FFREE     ST(i)
DD   1101 1101     11001 REG                    reserved
DD   1101 1101     11010 REG                    FST       ST(i)
DD   1101 1101     11011 REG                    FSTP      ST(i)
DD   1101 1101     11100 REG                    FUCOM     ST(i)
DD   1101 1101     11101 REG                    FUCOMP    ST(i)
DD   1101 1101     1111 ----                    reserved
DE   1101 1110     MOD 000 R/M    SIB, displ    FIADD     word-integer
DE   1101 1110     MOD 001 R/M    SIB, displ    FIMUL     word-integer
DE   1101 1110     MOD 010 R/M    SIB, displ    FICOM     word-integer
DE   1101 1110     MOD 011 R/M    SIB, displ    FICOMP    word-integer
DE   1101 1110     MOD 100 R/M    SIB, displ    FISUB     word-integer
DE   1101 1110     MOD 101 R/M    SIB, displ    FISUBR    word-integer
DE   1101 1110     MOD 110 R/M    SIB, displ    FIDIV     word-integer
DE   1101 1110     MOD 111 R/M    SIB, displ    FIDIVR    word-integer
DE   1101 1110     11000 REG                    FADDP     ST(i),ST
DE   1101 1110     11001 REG                    FMULP     ST(i),ST
DE   1101 1110     1101 0---                    reserved
DE   1101 1110     1101 1000                    reserved
DE   1101 1110     1101 1001                    FCOMPP
DE   1101 1110     1101 101-                    reserved
DE   1101 1110     1101 11--                    reserved
DE   1101 1110     11100 REG                    FSUBRP    ST(i),ST
DE   1101 1110     11101 REG                    FSUBP     ST(i),ST
DE   1101 1110     11110 REG                    FDIVRP    ST(i),ST
DE   1101 1110     11111 REG                    FDIVP     ST(i),ST
DF   1101 1111     MOD 000 R/M    SIB, displ    FILD      word-integer
DF   1101 1111     MOD 001 R/M                  reserved
DF   1101 1111     MOD 010 R/M    SIB, displ    FIST      word-integer
DF   1101 1111     MOD 011 R/M    SIB, displ    FISTP     word-integer
DF   1101 1111     MOD 100 R/M    SIB, displ    FBLD      packed-decimal
DF   1101 1111     MOD 101 R/M    SIB, displ    FILD      long-integer
DF   1101 1111     MOD 110 R/M    SIB, displ    FBSTP     packed-decimal
DF   1101 1111     MOD 111 R/M    SIB, displ    FISTP     long-integer
DF   1101 1111     11000 REG                    reserved
DF   1101 1111     11001 REG                    reserved
DF   1101 1111     11010 REG                    reserved
DF   1101 1111     11011 REG                    reserved
DF   1101 1111     1110 0000                    FSTSW     AX
DF   1101 1111     1110 0001                    reserved
DF   1101 1111     1110 001-                    reserved
DF   1101 1111     1110 01--                    reserved
DF   1101 1111     1110 1---                    reserved
DF   1101 1111     1111 ----                    reserved

**  The marked encodings can be generated by the language translators; however, the 80387 treats them as FNOP. They correspond to the following 8087 or 80287 instructions:
    (1) FENI
    (2) FDISI
    (3) FSETPM

*** The size of operand transferred depends on the 80386 operand-size attribute in effect for the instruction.

APPENDIX B
EXCEPTION SUMMARY

The following table lists the instruction mnemonics in alphabetical order. For each mnemonic, it summarizes the exceptions that the instruction may cause. When writing 80387 programs that may be used in an environment that employs numerics exception handlers, assembly-language programmers should be aware of the possible exceptions for each instruction in order to determine the need for exception synchronization. Chapter 4 explains the need for exception synchronization.
Mnemonic Instruction IS I 0 F2XM1 FABS FADD(P) FBLD FBSTP FCHS FCLEX FCOM(P)(P) FCOS FDECSTP FDIV(R)(P) FFREE FIADD FICOM(P) FIDIV FIDIVR FILD FIMUL FINCSTP FINIT FIST(P) FISUB(R) FLD extended or stack FLD single or double FLD1 FLDCW FLDENV FLDL2E FLDL2T FLDLG2 FLDLN2 FLDPI 2X-1 Absolute value Add real BCD load BCD store and pop Change sign Clear exceptions Compare real Cosine Decrement stack pointer Divide real Free register Integer add Integer compare Integer divide Integer divide reversed Integer load Integer multiply Increment stack pOinter Initialize processor Integer store Integer subtract Load real y Y Y Y y Y y y y y y y y y y y Y Y Y y Y y y Y y y Y y y y Y y y y Y y Y Load real Y Load + 1.0 Load Control word Load environment Load log2e Load log21O Loadlog1Q2 Load 10g.,2 Load ... Y y Y y y y Y y IS-Invalid operand due to stack overflow/underflow I-Invalid operand due to other cause D-Denormal operand Z-Zero-divide O-Overflow U-Underflow P-Inexact result (precision) B-1 Z 0 y U P y y y y Y Y y y Y Y Y y y y y y y y y y y y y y y y y Y y y y y Y y y y y y y y Y y y Y Y y y EXCEPTION SUMMARY Mnemonic FLDZ FMUL(P) FNOP FPATAN FPREM FPREM1 FPTAN FRNDINT FRSTOR FSAVE FSCALE FSIN FSINCOS FSQRT FST(P) stack or extended FST(P) single or double FSTCW FSTENV FSTSW(AX) FSU8(R)(P) FTST FUCOM(P)(P) FWAIT FXAM FXCH FXTRACT FYL2X FYL2XP1 Instruction Load + 0.0 Multiply real No operation Partial arctangent Partial remainder IEEE partial remainder Partial tangent Round to integer Restore state Save state Scale Sine Sine and cosine Square root Store real Store real Store control word Store Environment Store status word Subtract real Test Unordered compare real CPU Wait Examine Exchange registers Extract Y oloQ2X Y oloQ2(X + 1) IS I 0 Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Z U P Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y y y Y Y Y Y Y Y Y Y IS-Invalid operand due to stack overflow/underflow I-Invalid 
operand due to other cause D-Denormal operand Z-Zero-divide O-Overflow U-Underflow P-Inexact result (precision)

APPENDIX C
COMPATIBILITY BETWEEN THE 80387 AND THE 80287/8087

This appendix summarizes the differences between the 80387 and its predecessors, the 80287 and the 8087, and analyzes the impact of these differences on software that must be transported from the 80287 or 8087 to the 80387. Any migration from the 8087 directly to the 80387 must also take into account the additional differences between the 8087 and the 80387 as listed in Appendix D of this manual.

C.1 INITIALIZATION SEQUENCE

Issue: RESET, FINIT, and ERROR# Pin
80387 Behavior: After a hardware RESET, the ERROR# output is asserted to indicate that an 80387 is present. To accomplish this, the IE and ES bits of the status word are set, and the IM bit in the control word is reset. After FINIT, the status word and the control word have the same values as in an 80287/8087 after RESET.
8087/80287 Behavior: No difference between RESET and FINIT.
Impact on Software: 80387 initialization software must execute an FNINIT instruction to clear ERROR#. The FNINIT is not required for 80287/8087 software, though Intel documentation recommends its use (refer to the Numerics Supplement to the iAPX 286 Programmer's Reference Manual).
Reason for the Difference: Permits the 80386 to differentiate between the 80287 and the 80387.

C.2 DATA TYPES AND EXCEPTION HANDLING

Issue: NaN
80387 Behavior: The 80387 distinguishes between signaling NaNs and quiet NaNs. The 80387 only generates quiet NaNs. An invalid-operation exception is raised only upon encountering a signaling NaN (except for FCOM, FIST, and FBSTP, which also raise IE for quiet NaNs).
8087/80287 Behavior: The 80287/8087 only generates one kind of NaN (the equivalent of a quiet NaN) but raises an invalid-operation exception upon encountering any kind of NaN.
Impact on Software: Uninitialized memory locations that contain QNaNs should be changed to SNaNs to cause the 80387 to fault when uninitialized memory locations are referenced.
Reason for the Difference: IEEE Standard 754 compatibility.

Issue: Pseudozero, Pseudo-NaN, Pseudoinfinity, and Unnormal Formats
80387 Behavior: The 80387 neither generates nor supports these formats; it raises an invalid-operation exception whenever it encounters them in an arithmetic operation.
8087/80287 Behavior: The 80287/8087 defines and supports special handling for these formats.
Impact on Software: None. The 80387 does not generate these formats, and therefore will not encounter them unless a programmer deliberately enters them.
Reason for the Difference: IEEE Standard 754 compatibility.

Issue: Tag Word Bits for Unsupported Data Formats
80387 Behavior: The encoding in the tag word for the unsupported data formats mentioned in Section C.2.2 is "special data" (type 10).
8087/80287 Behavior: The encoding for pseudozero and unnormal is "valid" (type 00); the others are "special data" (type 10).
Impact on Software: The exception handler may need to be changed if programmers use such data types.
Reason for the Difference: IEEE Standard 754 compatibility.

Issue: Invalid-Operation Exception
80387 Behavior: No invalid-operation exception is raised upon encountering a denormal in FSQRT, FDIV, or FPREM or upon conversion to BCD or to integer. The operation proceeds by first normalizing the value.
8087/80287 Behavior: Upon encountering a denormal in FSQRT, FDIV, or FPREM or upon conversion to BCD or to integer, the invalid-operation exception is raised.
Impact on Software: None. Software on the 80387 will continue to execute in cases where the 80287/8087 would trap.
Reason for the Difference: Upgrade, to eliminate exception.

Issue: Denormal Exception
80387 Behavior: The denormal exception is raised in transcendental instructions and FXTRACT.
8087/80287 Behavior: The denormal exception is not raised in transcendental instructions and FXTRACT.
Impact on Software: The exception handler needs to be changed only if it gives special treatment to different opcodes.
Reason for the Difference: Performance enhancement for normal case.

Issue: Overflow Exception (overflow exception masked)
80387 Behavior: If the rounding mode is set to chop (toward zero), the result is the most positive or most negative number.
8087/80287 Behavior: The 80287/8087 does not signal the overflow exception when the masked response is not infinity; i.e., it signals overflow only when the rounding control is not set to round to zero. If rounding is set to chop (toward zero), the result is positive or negative infinity.
Impact on Software: Under the most common rounding modes, no impact. If rounding is toward zero (chop), a program on the 80387 produces under overflow conditions a result that is different in the least significant bit of the significand, compared to the result on the 80287.
Reason for the Difference: IEEE Standard 754 compatibility.

Issue: Overflow Exception (overflow exception not masked)
80387 Behavior: The precision exception is flagged. When the result is stored in the stack, the significand is rounded according to the precision control (PC) bit of the control word or according to the opcode.
8087/80287 Behavior: The precision exception is not flagged and the significand is not rounded.
Impact on Software: If the result is stored on the stack, a program on the 80387 produces a different result under overflow conditions than on the 80287/8087. The difference is apparent only to the exception handler.
Reason for the Difference: IEEE Standard 754 compatibility.

Issue: Underflow Exception
Two related events contribute to underflow:
1. The creation of a tiny result. A tiny number, because it is so small, may cause some other exception later (such as overflow upon division).
2. Loss of accuracy during the denormalization of a tiny number.
Which of these events triggers the underflow exception depends on whether the underflow exception is masked.

Issue: Underflow Exception (conditions for underflow; underflow exception masked)
80387 Behavior: When the underflow exception is masked, the underflow exception is signaled when both the result is tiny and denormalization results in a loss of accuracy.
8087/80287 Behavior: When the underflow exception is masked and rounding is toward zero, the underflow exception flag is raised on tininess, regardless of loss of accuracy.
Impact on Software: No impact. The underflow exception occurs less often when rounding is toward zero.
Reason for the Difference: IEEE Standard 754 compatibility.

Issue: Underflow Exception (response to underflow; underflow exception not masked)
80387 Behavior: When the underflow exception is unmasked and the instruction is supposed to store the result on the stack, the significand is rounded to the appropriate precision (according to the precision control (PC) bit of the control word, for those instructions controlled by PC, otherwise to extended precision).
8087/80287 Behavior: When the underflow exception is not masked and the destination is the stack, the significand is not rounded but rather is left as is.
Impact on Software: A program on the 80387 produces a different result during underflow conditions than on the 80287/8087 if the result is stored on the stack. The difference is only in the least significant bit of the significand and is apparent only to the exception handler.
Reason for the Difference: IEEE Standard 754 compatibility.

Issue: Exception Precedence
80387 Behavior: There is no difference in the precedence of the denormal exception, whether it be masked or not.
8087/80287 Behavior: When the denormal exception is not masked, it takes precedence over all other exceptions.
Impact on Software: None, but some unneeded normalization of denormal operands is prevented on the 80387.
Reason for the Difference: Operational improvement.

C.3 TAG, STATUS, AND CONTROL WORDS

Issue: Bits C3-C0 of Status Word
80387 Behavior: After FINIT, incomplete FPREM, and hardware reset, the 80387 sets these bits to zero.
8087/80287 Behavior: After FINIT, incomplete FPREM, and hardware reset, the 80287/8087 leaves these bits intact (they contain the prior value).
Impact on Software: None.
Reason for the Difference: Upgrade, to provide consistent state after reset.

Issue: Bit C2 of Status Word
80387 Behavior: Bit 10 (C2) serves as an incomplete bit for FPTAN.
8087/80287 Behavior: This bit is undefined for FPTAN.
Impact on Software: None. Programs don't check C2 after FPTAN.
Reason for the Difference: Upgrade to allow fast checking of operand range.

Issue: Infinity Control
80387 Behavior: Only affine closure is supported. Bit 12 remains programmable but has no effect on 80387 operation.
8087/80287 Behavior: Both affine and projective closures are supported. After RESET, the default value in the control word is projective.
Impact on Software: Software that requires projective infinity arithmetic may give different results.
Reason for the Difference: IEEE Standard 754 compatibility.

Issue: Status Word Bit 6 for Stack Fault
80387 Behavior: When an invalid-operation exception occurs due to stack overflow or underflow, not only is bit 0 (IE) of the status word set, but also bit 6 is set to indicate a stack fault and bit 9 (C1) specifies overflow or underflow. Bit 6 is called SF and serves to distinguish invalid exceptions caused by stack overflow/underflow from those caused by numeric operations.
8087/80287 Behavior: When an invalid-operation exception occurs due to stack overflow or underflow, only bit 0 (IE) of the status word is set. Bit 6 is RESERVED.
Impact on Software: None. Existing exception handlers need not change, but may be upgraded to take advantage of the additional information. Newly written handlers will be more effective.
Reason for the Difference: Upgrade and performance improvement.

Issue: Tag Word
80387 Behavior: When loading the tag word with an FLDENV or FRSTOR instruction, the only interpretations of tag values used by the 80387 are empty (value 11) and nonempty (values 00, 01, and 10). Subsequent operations on a nonempty register always examine the value in the register, not the value in its tag. The FSTENV and FSAVE instructions examine the nonempty registers and put the correct values in the tags before storing the tag word.
8087/80287 Behavior: The corresponding tag is checked before each register access to determine the class of operand in the register; the tag is updated after every change to a register so that the tag always reflects the most recent status of the register. Programmers can load a tag with a value that disagrees with the contents of a register (for example, the register contains valid contents, but the tag says special; the 80287/8087, in this case, honors the tag and does not examine the register).
Impact on Software: Software may not operate correctly if it uses FLDENV or FRSTOR to change tags to values (other than empty) that are different from actual register contents.
Reason for the Difference: Performance improvement.

C.4 INSTRUCTION SET

Issue: FBSTP, FDIV, FIST(P), FPREM, FSQRT
80387 Behavior: Operation on denormal operand is supported. An underflow exception can occur.
8087/80287 Behavior: Operation on denormal operand raises invalid-operation exception. Underflow is not possible.
Impact on Software: The exception handler for underflow may require change only if it gives different treatment to different opcodes. Possibly fewer invalid-operation exceptions will occur.
Reason for the Difference: IEEE Standard 754 compatibility.

Issue: FSCALE
80387 Behavior: The range of the scaling operand is not restricted. If 0 < |ST(1)| < 1, the scaling factor is zero; therefore, ST(0) remains unchanged. If the rounded result is not exact or if there was a loss of accuracy (masked underflow), the precision exception is signaled.
8087/80287 Behavior: The range of the scaling operand is restricted. If 0 < |ST(1)| < 1, the result is undefined and no exception is signaled.
Impact on Software: Different result when 0 < |ST(1)| < 1.
Reason for the Difference: Upgrade.

Issue: FPREM1
80387 Behavior: Performs partial remainder
8087/80287 Behavior: Does not exist.
Impact on Software: None.
Reason for the Difference: IEEE Standard 754 compatibility and upgrade.
The quotient bits are incorrect when performing a None. Software that works around the bug should not be affected. Upgrade. according to IEEE Standard 754 standard. FPREM FUCOM, FUCOMP, FUCOMPP FPTAN Reason for the Impact on Software Issue Bits CO, C3, Cl of the status word, correctly 64 N + M when reflect the three low-order reduction of bits of the quotient. N:2: 1 and Perform unordered Do not exist. None. IEEE Standard 754 compatibility. Range of operand is restricted (I ST(O) I < ,,/4); operand must be reduced to range using FPREM. None. Upgrade. M~l or M~2. compare according to IEEE Standard 754 standard. Range of operand is much less restricted ( I ST(O) I < 263); reduces operand internally using an internal ,,/4 constant that is more accurate. After a stack overflow After a stack overflow IEEE Standard 754 compatibility. when the invalid-operation when the invalid-operation exception is masked, both ST and ST(l) contain quiet NaNs. exception is masked, the original operand remains unchanged, but is pushed toST(l). FSIN, FCOS, FSINCOS Perform three common trigonometric functions. Do not exist. None. Upgrade. FPATAN Range of operands is unrestricted. I ST(O) I must be smaller than I ST(l) I. None. Upgrade. Wider range of operand The supported operand range is 0 :5 ST (0) :5 0.5. None. Upgrade. (-1 :5ST(O):5 +1). Does not report denormal exception because the Reports denormal None. Upgrade. exception. None. Software usually bypasses zero and co. IEEE 754 recommendation to fully support the 10gb F2XMl FLO extended~real instruction is not arithmetic. FXTRACT If the operand is zero, the reported and ST(l) is -co. II the operand is +co, no If the operand is zero, ST(l) is zero and no exception is reported. If the operand is + co, the exception is reported. invalid-operation exception zero-divide exception is function. is reported. FLO constant Rounding control is in Rounding control is not in effect. effect. 
Results are the same as for the 8087/80287 when rounding control is set to round to zero, round to -co, and (in the case of FLDL2T) round to nearest. Results are different by one in the least significant bit of the signilicand in round to + CXJ and round to nearest (excluding FLDL2T). FLDl and FLDZ are always the same. C-5 IEEE 754 recommendation. COMPATIBILITY BETWEEN THE 80387 AND THE 80287/8087 Difference D••crlptlon Realon Impact on Software Isaue lor the Difference 80387 Behavior 8087/80287 Behavior Loading a denormal causes the number to be converted to extended precision (because it is put on the stack). Loading a denormal causes the number to be converted to an unnormal. If the next instruction is FXTRACT or FXAM, the 80387 will give a different resu~ than the 80287/8087. IEEE Standard 754 compatibility. FLO .Inglel double preclalon When loading a signaling NaN, raises invalid exception. Does not raise an exception when loading a signaling NaN. The exception handler need to be updated to handle this condition. IEEE Standard 754 compatibility. FSETPM Treated as FNOP (no operation). Informs the 80287 that the system is in protected mode. None. The 80386 handles all addressing and exceptionpointer information, whether in protected mode or not. FXAM When encountering an empty register, the 80387 will not generate combinations of C3-CO equal to 1101 or 1111. May generate these combinations, among others. None. Upgrade, to provide repeatable results. All Tranlcendental Instructions May generate different resu~ in round-up bit of status word. Round-up bit of status word is undefined for these instructions. None. Upgrade, to signal rounding status. FLD Iinglel double precision C-6 Compatibility Between the 80387 and the 8087 D APPENDIX D COMPATIBILITY BETWEEN THE 80387 AND THE 8087 The 80386/80387 operating in real-address mode will execute 8087 programs without major modification. 
However, because of differences in the handling of numeric exceptions between the 80387 NPX and the 8087 NPX, exception-handling routines may need to be changed. This appendix summarizes the additional differences between the 80387 NPX and the 8087 NPX (other than those already included in Appendix B), and provides details showing how 8087 programs can be ported to the 80387.

1. The 80387 signals exceptions through a dedicated ERROR# line to the 80386; no interrupt controller is needed for this purpose. The 8087 requires an interrupt controller (8259A) to interrupt the CPU when an unmasked exception occurs. Therefore, any interrupt-controller-oriented instructions in numeric exception handlers for the 8087 should be deleted.

2. The 8087 instructions FENI/FNENI and FDISI/FNDISI perform no useful function in the 80387. If the 80387 encounters one of these opcodes in its instruction stream, the instruction will effectively be ignored; none of the 80387 internal states will be updated. While 8087 code containing these instructions may be executed on the 80387, it is unlikely that the exception-handling routines containing these instructions will be completely portable to the 80387.

3. In real mode and protected mode (not including virtual 8086 mode), interrupt vector 16 must point to the numeric exception handling routine. In virtual 8086 mode, the V86 monitor can be programmed to accommodate a different location of the interrupt vector for numeric exceptions.

4. The ESC instruction address saved in the 80386/80387 or 80386/80287 includes any leading prefixes before the ESC opcode. The corresponding address saved in the 8086/8087 does not include leading prefixes.

5. In protected mode (not including virtual 8086 mode), the format of the 80387's saved instruction and address pointers is different than for the 8087. The instruction opcode is not saved in protected mode; exception handlers will have to retrieve the opcode from memory if needed.

6. Interrupt 7 will occur in the 80386 when executing ESC instructions with either TS (task switched) or EM (emulation) of the 80386 MSW set (TS = 1 or EM = 1). If TS is set, then a WAIT instruction will also cause interrupt 7. An exception handler should be included in 80387 code to handle these situations.

7. Interrupt 9 will occur if the second or subsequent words of a floating-point operand fall outside a segment's size. Interrupt 13 will occur if the starting address of a numeric operand falls outside a segment's size. An exception handler should be included to report these programming errors.

8. Except for the processor control instructions, all of the 80387 numeric instructions are automatically synchronized by the 80386 CPU: the 80386 automatically waits until all operands have been transferred between the 80386 and the 80387 before executing the next ESC instruction. No explicit WAIT instructions are required to assure this synchronization. For the 8087 used with 8086 and 8088 processors, explicit WAITs are required before each numeric instruction to ensure synchronization. Although 8087 programs having explicit WAIT instructions will execute perfectly on the 80387 without reassembly, these WAIT instructions are unnecessary.

9. Since the 80387 does not require WAIT instructions before each numeric instruction, the ASM386 assembler does not automatically generate these WAIT instructions. The ASM86 assembler, however, automatically precedes every ESC instruction with a WAIT instruction. Although numeric routines generated using the ASM86 assembler will generally execute correctly on the 80386/80387, reassembly using ASM386 may result in a more compact code image and faster execution. The processor control instructions for the 80387 may be coded using either a WAIT or No-WAIT form of mnemonic. The WAIT forms of these instructions cause ASM386 to precede the ESC instruction with a CPU WAIT instruction, in the identical manner as does ASM86.

10. The address of a memory operand stored by FSAVE or FSTENV is undefined if the previous ESC instruction did not refer to memory.

11. Because the 80387 automatically normalizes denormal numbers when possible, an 8087 program that uses the denormal exception solely to normalize denormal operands can run on an 80387 by masking the denormal exception. The 8087 denormal exception handler would not be used by the 80387 in this case. A numerics program runs faster when the 80387 performs normalization of denormal operands. A program can detect at run-time whether it is running on an 80387 or 8087/80287 and disable the denormal exception when an 80387 is used.

APPENDIX E 80387 80-BIT CHMOS III NUMERIC PROCESSOR EXTENSION

This appendix is a copy of the 80387 Data Sheet, which is also available separately. (The AC specifications have been deliberately left out.) The specifications in data sheets are subject to change; consult the most recent data sheet for design-in information.
80387 80-BIT CHMOS III NUMERIC PROCESSOR EXTENSION

• High Performance 80-Bit Internal Architecture
• Implements ANSI/IEEE Standard 754-1985 for Binary Floating-Point Arithmetic
• Five to Six Times 8087/80287 Performance
• Upward Object-Code Compatible from 8087 and 80287
• Expands 80386 Data Types to Include 32-, 64-, 80-Bit Floating Point, 32-, 64-Bit Integers and 18-Digit BCD Operands
• Directly Extends 80386 Instruction Set to Include Trigonometric, Logarithmic, Exponential and Arithmetic Instructions for All Data Types
• Full-Range Transcendental Operations for SINE, COSINE, TANGENT, ARCTANGENT and LOGARITHM
• Built-In Exception Handling
• Operates Independently of Real, Protected and Virtual-8086 Modes of the 80386
• Eight 80-Bit Numeric Registers, Usable as Individually Addressable General Registers or as a Register Stack
• Available in 68-Pin PGA Package (See Packaging Spec: Order #231369)

The Intel 80387 is a high-performance numerics processor extension that extends the 80386 architecture with floating point, extended integer and BCD data types. The 80386/80387 computing system fully conforms to the ANSI/IEEE floating-point standard. Using a numerics oriented architecture, the 80387 adds over seventy mnemonics to the 80386/80387 instruction set, making the 80386/80387 a complete solution for high-performance numerics processing. The 80387 is implemented with 1.5 micron, high-speed CHMOS III technology and packaged in a 68-pin ceramic pin grid array (PGA) package. The 80386/80387 is upward object-code compatible from the 80386/80287, 80286/80287 and 8086/8087 computing systems.

Figure 0.1. 80387 Block Diagram (blocks shown: Bus Control Logic; Data Interface and Control Unit; Floating Point Unit)

Intel Corporation assumes no responsibility for the use of any circuitry other than circuitry embodied in an Intel product. No other circuit patent licenses are implied. Information contained herein supersedes previously published specifications on these devices from Intel.
January 1987
© Intel Corporation, 1987
Order Number: 231920-002

CONTENTS

1.0 Functional Description
2.0 Programming Interface
  2.1 Data Types
  2.2 Numeric Operands
  2.3 Register Set
    2.3.1 Data Registers
    2.3.2 Tag Word
    2.3.3 Status Word
    2.3.4 Instruction and Data Pointers
    2.3.5 Control Word
  2.4 Interrupt Description
  2.5 Exception Handling
  2.6 Initialization
  2.7 8087 and 80287 Compatibility
    2.7.1 General Differences
    2.7.2 Exceptions
3.0 Hardware Interface
  3.1 Signal Description
    3.1.1 80386 Clock 2 (386CLK2)
    3.1.2 80387 Clock 2 (387CLK2)
    3.1.3 80387 Clocking Mode (CKM)
    3.1.4 System Reset (RESETIN)
    3.1.5 Processor Extension Request (PEREQ)
    3.1.6 Busy Status (BUSY#)
    3.1.7 Error Status (ERROR#)
    3.1.8 Data Pins (D31-D0)
    3.1.9 Write/Read Bus Cycle (W/R#)
    3.1.10 Address Strobe (ADS#)
    3.1.11 Bus Ready Input (READY#)
    3.1.12 Ready Output (READYO#)
    3.1.13 Status Enable (STEN)
    3.1.14 NPX Select #1 (NPS1#)
    3.1.15 NPX Select #2 (NPS2)
    3.1.16 Command (CMD0#)
  3.2 Processor Architecture
    3.2.1 Bus Control Logic
    3.2.2 Data Interface and Control Unit
    3.2.3 Floating Point Unit
  3.3 System Configuration
    3.3.1 Bus Cycle Tracking
    3.3.2 80387 Addressing
    3.3.3 Function Select
    3.3.4 CPU/NPX Synchronization
    3.3.5 Synchronous or Asynchronous Modes
    3.3.6 Automatic Bus Cycle Termination
  3.4 Bus Operation
    3.4.1 Nonpipelined Bus Cycles
      3.4.1.1 Write Cycle
      3.4.1.2 Read Cycle
    3.4.2 Pipelined Bus Cycles
    3.4.3 Bus Cycles of Mixed Type
    3.4.4 BUSY# and PEREQ Timing Relationship
4.0 Mechanical Data
5.0 Electrical Data
  5.1 Absolute Maximum Ratings
  5.2 DC Characteristics
  5.3 AC Characteristics
6.0 80387 Extensions to the 80386 Instruction Set
Appendix A: Compatibility Between the 80287 NPX and the 8087

FIGURES

Figure 0.1 80387 Block Diagram
Figure 1.1 80386/80387 Register Set
Figure 2.1 80387 Tag Word
Figure 2.2 80387 Status Word
Figure 2.3 Protected Mode 80387 Instruction and Data Pointer Image in Memory, 32-Bit Format
Figure 2.4 Real Mode 80387 Instruction and Data Pointer Image in Memory, 32-Bit Format
Figure 2.5 Protected Mode 80387 Instruction and Data Pointer Image in Memory, 16-Bit Format
Figure 2.6 Real Mode 80387 Instruction and Data Pointer Image in Memory, 16-Bit Format
Figure 2.7 80387 Control Word
Figure 3.1 80387 Pin Configuration
Figure 3.2 80386/80387 System Configuration
Figure 3.3 Bus State Diagram
Figure 3.4 Nonpipelined Read and Write Cycles
Figure 3.5 Fastest Transitions to and from Pipelined Cycles
Figure 3.6 Pipelined Cycles with Wait States
Figure 3.7 STEN, BUSY# and PEREQ Timing Relationship
Figure 4.1 Package Description
Figure 5.1 386CLK2/387CLK2 Waveform
Figure 5.2 Output Signals
Figure 5.3 Input and I/O Signals
Figure 5.4 RESET Signal
Figure 5.5 Float from STEN
Figure 5.6 Other Parameters

TABLES

Table 2.1 80387 Data Type Representation in Memory
Table 2.2 Condition Code Interpretation
Table 2.3 Condition Code Interpretation after FPREM and FPREM1 Instructions
Table 2.4 Condition Code Resulting from Comparison
Table 2.5 Condition Code Defining Operand Class
Table 2.6 80386 Interrupt Vectors Reserved for NPX
Table 2.7 Exceptions
Table 3.1 80387 Pin Summary
Table 3.2 80387 Pin Cross-Reference
Table 3.3 Output Pin Status after Reset
Table 3.4 Bus Cycles Definition
Table 5.1 DC Specifications
Table 5.2 Timing Requirements
Table 5.3 Other Parameters

Figure 1.1. 80386/80387 Register Set (80386 registers: general registers EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP; segment registers CS, SS, DS, ES, FS, GS; EFLAGS. 80387 registers: data registers R0-R7, each with a sign bit, exponent and significand in bits 79-0 plus a two-bit tag field; 16-bit control register, status register and tag word; 48-bit instruction pointer and data pointer, held in the 80386)

1.0 FUNCTIONAL DESCRIPTION

The 80387 Numeric Processor Extension (NPX) provides arithmetic instructions for a variety of numeric data types in 80386/80387 systems. It also executes numerous built-in transcendental functions (e.g. tangent, sine, cosine, and log functions). The 80387 effectively extends the register and instruction set of an 80386 system for existing data types and adds several new data types as well. Figure 1.1 shows the model of registers visible to 80386/80387 programs. Essentially, the 80387 can be treated as an additional resource or an extension to the 80386. The 80386 together with an 80387 can be used as a single unified system, the 80386/80387.

In real-address mode and virtual-8086 mode, the 80386/80387 is completely upward compatible with software for 8086/8087, 80286/80287 real-address mode, and 80386/80287 real-address mode systems.

In protected mode, the 80386/80387 is completely upward compatible with software for 80286/80287 protected mode, and 80386/80287 protected mode systems.
The only difference of operation that may appear when 8086/8087 programs are ported to a protected-mode 80386/80387 system (not using virtual-8086 mode) is in the format of operands for the administrative instructions FLDENV, FSTENV, FRSTOR and FSAVE. These instructions are normally used only by exception handlers and operating systems, not by applications programs.

The 80387 works the same whether the 80386 is executing in real-address mode, protected mode, or virtual-8086 mode. All memory access is handled by the 80386; the 80387 merely operates on instructions and values passed to it by the 80386. Therefore, the 80387 is not sensitive to the processing mode of the 80386.

The 80387 contains three functional units that can operate in parallel to increase system performance. The 80386 can be transferring commands and data to the 80387 bus control logic for the next instruction while the 80387 floating-point unit is performing the current numeric instruction.

2.0 PROGRAMMING INTERFACE

The 80387 adds to an 80386 system additional data types, registers, instructions, and interrupts specifically designed to facilitate high-speed numerics processing. To use the 80387 requires no special programming tools, because all new instructions and data types are directly supported by the 80386 assembler and compilers for high-level languages. All 8086/8088 development tools that support the 8087 can also be used to develop software for the 80386/80387 in real-address mode or virtual-8086 mode. All 80286 development tools that support the 80287 can also be used to develop software for the 80386/80387.

2.1 Data Types

Table 2.1 lists the seven data types that the 80387 supports and presents the format for each type. Operands are stored in memory with the least significant digit at the lowest memory address. Programs retrieve these values by generating the lowest address.
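The byte ordering described above can be illustrated with a minimal sketch in Python. It is not part of the data sheet: the function names are hypothetical, and the packed BCD layout (18 digits, two per byte, sign in bit 79, unused bits stored as zero) is assumed from Table 2.1.

```python
import struct

def short_integer_bytes(value: int) -> bytes:
    """32-bit two's-complement short integer, least significant byte first."""
    return struct.pack("<i", value)

def packed_bcd_bytes(value: int) -> bytes:
    """10-byte packed BCD image: 18 decimal digits, two per byte, sign in bit 79."""
    if abs(value) >= 10**18:
        raise ValueError("packed BCD holds at most 18 decimal digits")
    digits = f"{abs(value):018d}"            # digits[0] is d17, digits[17] is d0
    image = bytearray(10)
    for i in range(9):                        # byte i holds d(2i+1) (high) and d(2i) (low)
        high = int(digits[17 - (2 * i + 1)])
        low = int(digits[17 - 2 * i])
        image[i] = (high << 4) | low
    image[9] = 0x80 if value < 0 else 0x00    # sign bit; the other bits are stored as zeros
    return bytes(image)

if __name__ == "__main__":
    print(short_integer_bytes(0x12345678).hex())
    print(packed_bcd_bytes(-123).hex())
```

In both cases index 0 of the returned bytes corresponds to the lowest memory address, which is the address a program would generate to retrieve the operand.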
For maximum system performance, all operands should start at physical-memory addresses evenly divisible by four (doubleword boundaries); operands may begin at any other addresses, but will require extra memory cycles to access the entire operand.

Internally, the 80387 holds all numbers in the extended-precision real format. Instructions that load operands from memory automatically convert operands represented in memory as 16-, 32-, or 64-bit integers, 32- or 64-bit floating-point numbers, or 18-digit packed BCD numbers into extended-precision real format. Instructions that store operands in memory perform the inverse type conversion.

All communication between the 80386 and the 80387 is transparent to applications software. The CPU automatically controls the 80387 whenever a numerics instruction is executed. All physical memory and virtual memory of the CPU are available for storage of the instructions and operands of programs that use the 80387. All memory addressing modes, including use of displacement, base register, index register, and scaling, are available for addressing numerics operands.

2.2 Numeric Operands

A typical NPX instruction accepts one or two operands and produces a single result. In two-operand instructions, one operand is the contents of an NPX register, while the other may be a memory location. The operands of some instructions are predefined; for example FSQRT always takes the square root of the number in the top stack element.

Section 6 at the end of this data sheet lists by class the instructions that the 80387 adds to the instruction set of an 80386 system.

Table 2.1. 80387 Data Type Representation in Memory

Data Format          Range        Precision   Layout (highest addressed byte first)
Word Integer         10^4         16 bits     two's complement, bits 15-0
Short Integer        10^9         32 bits     two's complement, bits 31-0
Long Integer         10^19        64 bits     two's complement, bits 63-0
Packed BCD           10^18        18 digits   S (bit 79), X (bits 78-72), magnitude d17 ... d0 (bits 71-0)
Single Precision     10^±38       24 bits     S (bit 31), biased exponent (bits 30-23), significand (bits 22-0)
Double Precision     10^±308      53 bits     S (bit 63), biased exponent (bits 62-52), significand (bits 51-0)
Extended Precision   10^±4932     64 bits     S (bit 79), biased exponent (bits 78-64), significand with integer bit I (bits 63-0)

NOTES:
(1) S = Sign bit (0 = positive, 1 = negative)
(2) dn = Decimal digit (two per byte)
(3) X = Bits have no significance; 80387 ignores when loading, zeros when storing
(4) The implicit binary point lies immediately to the right of the integer bit of the significand
(5) I = Integer bit of significand; stored in temporary real, implicit in single and double precision
(6) Exponent Bias (normalized values): Single: 127 (7FH); Double: 1023 (3FFH); Extended Real: 16383 (3FFFH)
(7) Packed BCD: (-1)^S (D17 ... D0)
(8) Real: (-1)^S (2^(E-BIAS)) (F0.F1 ...)

Figure 2.1. 80387 Tag Word (bits 15-0, from most significant to least significant: TAG(7), TAG(6), TAG(5), TAG(4), TAG(3), TAG(2), TAG(1), TAG(0))

NOTE: The index i of tag(i) is not top-relative. A program typically uses the "top" field of Status Word to determine which tag(i) field refers to logical top of stack.

TAG VALUES:
00 = Valid
01 = Zero
10 = QNaN, SNaN, Infinity, Denormal and Unsupported Formats
11 = Empty

2.3 Register Set

Figure 1.1 shows the 80387 register set. When an 80387 is present in a system, programmers may use these registers in addition to the registers normally available on the 80386.

2.3.1 DATA REGISTERS

80387 computations use the 80387's data registers. These eight 80-bit registers provide the equivalent capacity of twenty 32-bit registers. Each of the eight data registers in the 80387 is 80 bits wide and is divided into "fields" corresponding to the NPX's extended-precision real data type.

The 80387 register set can be accessed either as a stack, with instructions operating on the top one or two stack elements, or as a fixed register set, with instructions operating on explicitly designated registers. The TOP field in the status word identifies the current top-of-stack register. A "push" operation decrements TOP by one and loads a value into the new top register. A "pop" operation stores the value from the current top register and then increments TOP by one. Like 80386 stacks in memory, the 80387 register stack grows "down" toward lower-addressed registers.

Instructions may address the data registers either implicitly or explicitly. Many instructions operate on the register at the TOP of the stack. These instructions implicitly address the register at which TOP points. Other instructions allow the programmer to explicitly specify which register to use. This explicit register addressing is also relative to TOP.

2.3.2 TAG WORD

The tag word marks the content of each numeric data register, as Figure 2.1 shows. Each two-bit tag represents one of the eight numerics registers. The principal function of the tag word is to optimize the NPX's performance and stack handling by making it possible to distinguish between empty and nonempty register locations. It also enables exception handlers to check the contents of a stack location without the need to perform complex decoding of the actual data.

Figure 2.2. 80387 Status Word (bit assignments, from most significant to least significant)

Bit 15        80387 busy (B)
Bits 13-11    Top of stack pointer (TOP)
Bits 14, 10-8 Condition code (C3, C2, C1, C0)
Bit 7         Error summary status (ES)
Bit 6         Stack flag (SF)
Bit 5         Precision exception flag
Bit 4         Underflow exception flag
Bit 3         Overflow exception flag
Bit 2         Zero divide exception flag
Bit 1         Denormalized operand exception flag
Bit 0         Invalid operation exception flag

ES is set if any unmasked exception bit is set; cleared otherwise.
See Table 2.2 for interpretation of condition code.
TOP values: 000 = Register 0 is Top of Stack; 001 = Register 1 is Top of Stack; ...; 111 = Register 7 is Top of Stack.
For definitions of exceptions, refer to the section entitled "Exception Handling".

2.3.3 STATUS WORD

The 16-bit status word (in the status register) shown in Figure 2.2 reflects the overall state of the 80387. It may be read and inspected by CPU code.

Bit 15, the B-bit (busy bit) is included for 8087 compatibility only. It reflects the contents of the ES bit (bit 7 of the status word), not the status of the BUSY# output of 80387/80287.

Bits 13-11 (TOP) point to the 80387 register that is the current top-of-stack.

Bit 6 is the stack flag (SF). This bit is used to distinguish invalid operations due to stack overflow or underflow from other kinds of invalid operations. When SF is set, bit 9 (C1) distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0).

Figure 2.2 shows the six exception flags in bits 5-0 of the status word. Bits 5-0 are set to indicate that the 80387 has detected an exception while executing an instruction. A later section entitled "Exception Handling" explains how they are set and used.

Note that when a new value is loaded into the status word by the FLDENV or FRSTOR instruction, the value of ES (bit 7) and its reflection in the B-bit (bit 15) are not derived from the values loaded from memory but rather are dependent upon the values of the exception flags (bits 5-0) in the status word and their corresponding masks in the control word. If ES is set in such a case, the ERROR# output of the 80387 is activated immediately.

The four numeric condition code bits (C3-C0) are similar to the flags in a CPU; instructions that perform arithmetic operations update these bits to reflect the outcome. The effects of these instructions on the condition code are summarized in Tables 2.2 through 2.5.

Bit 7 is the error summary (ES) status bit.
This bit is set if any unmasked exception bit is set; it is clear otherwise. If this bit is set, the ERROR# signal is asserted. 8 il1tef 80387 Table 2.2. Condition Code Interpretation Instruction CO(S) FPREM, FPREM1 (see Table 2.3) Q2 FCOM, FCOMP, FCOMPP, FTST, FUCOM, FUCOMP, FUCOMPP, FICOM, FICOMP FXAM FCHS, FABS, FXCH, FINCTOP, FDECTOP, Constant loads, FXTRACT, FLD, FILD, FBLD, FSTP (ext real) FIST, FBSTP, FRNDINT, FST, FSTP, FADD, FMUL, FDIV, FDIVR, FSUB, FSUBR, FSCALE, FSQRT, FPATAN, F2XM1, FYL2X, FYL2XP1 FPTAN, FSIN FCOS, FSINCOS FLDENV, FRSTOR I C3(Z) Three least significant bits of quotient QO Result of comparison (see Table 2.4) C1 (A) C2(C) Q1 orO/U# Reduction 0= complete 1 = incomplete Zero orO/U# Operand is not comparable (Table 2.4) Operand class (see Table 2.5) Sign orO/U# Operand class (Table 2.5) UNDEFINED Zero or O/U# UNDEFINED UNDEFINED Roundup orO/U# UNDEFINED Roundup orO/U#, undefined if C2 = 1 UNDEFINED Reduction 0= complete 1 = incomplete Each bit loaded from memory FLDCW, FSTENV, FSTCW, FSTSW, FCLEX, FINIT, FSAVE UNDEFINED O/U# When both IE and SF bits of status word are set, indicating a stack exception, this bit distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0). Reduction If FPREM or FPREM1 produces a remainder that is less than the modulus, reduction is complete. When reduction is incomplete the value at the top of the stack is a partial remainder, which can be used as input to further reduction. For FPTAN, FSIN, FCOS, and FSINCOS, the reduction bit is set if the operand at the top of the stack is too large. In this case the original operand remains at the top of the stack. Roundup When the PE bit of the status word is set, this bit indicates whether the last rounding in the instruction was upward. UNDEFINED Do not rely on finding any specific value in these bits. 9 inter 80387 Table 2.3. 
Condition Code Interpretation after FPREM and FPREM1 Instructions Condition Code Interpretation after FPREM and FPREM1 C2 C3 C1 CO 1 X X X 01 00 02 o MOD8 0 0 1 1 0 0 1 1 0 1 0 1 0 1 0 1 0 0 0 0 1 1 1 1 0 1 2 3 4 5 6 7 0 Incomplete Reduction: further interation required for complete reduction Complete Reduction: CO, C3, C1 contain three least significant bits of quotient Table 2.4. Condition Code Resulting from Comparison Order C3 C2 CO TOP> Operand TOP < Operand TOP = Operand Unordered 0 0 1 1 0 0 0 1 0 1 0 1 Table 2.5. Condition Code Defining Operand Class C3 C2 C1 CO 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 10 Value at TOP + Unsupported + NaN - Unsupported - NaN + Normal + Infinity - Normal - Infinity +0 + Empty -0 - Empty + Denormal - Denormal inter 80387 the address of the instruction (including any prefixes that may be present), the address of the operand (if present), and the opcode. 2.3.4 INSTRUCTION AND DATA POINTERS Because the NPX operates in parallel with the CPU, any errors detected by the NPX may be reported after the CPU has executed the ESC instruction which caused it. To allow identification of the failing numeric instruction, the 80386/80387 contains two pointer registers that supply the address of the failing numeric instruction and the address of its numeric memory operand (if appropriate). The instruction and data pointers appear in one of four formats depending on the operating mode of the 80386 (protected mode or real-address mode) and depending on the operand-size attribute in effect (32-bit operand or 16-bit operand). When the 80386 is in virtual-8086 mode, the real-address mode formats are used. (See Figures 2.3 through 2.6.) The ESC instructions FLDENV, FSTENV, FSAVE, and FRSTOR are used to transfer these values between the 80386 registers and memory. Note that the value of the data pointer is undefined if the prior ESC instruction did not have a memory operand. 
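As an illustration of the memory images referenced above, the following Python sketch unpacks the 32-bit protected-mode format of Figure 2.3 from a byte string such as one written by FSTENV. It assumes little-endian doublewords at offsets 0H through 18H and treats the upper half of each word-sized field as reserved, as the figure indicates; the function name and dictionary keys are hypothetical and chosen only for this example.

```python
import struct


def parse_env32_protected(image: bytes) -> dict:
    """Unpack the 28-byte, 32-bit protected-mode image of Figure 2.3.

    Hypothetical helper: the name and the returned keys are illustrative.
    """
    if len(image) < 28:
        raise ValueError("the 32-bit format occupies 28 bytes (offsets 0H-1BH)")
    # Seven little-endian doublewords at offsets 0H, 4H, 8H, CH, 10H, 14H, 18H.
    cw, sw, tw, ip_off, cs_sel, op_off, op_sel = struct.unpack_from("<7I", image)
    low16 = 0xFFFF  # upper halves of the word-sized fields are reserved
    return {
        "control_word": cw & low16,
        "status_word": sw & low16,
        "tag_word": tw & low16,
        "ip_offset": ip_off,            # full 32-bit offset
        "cs_selector": cs_sel & low16,
        "operand_offset": op_off,       # undefined if no memory operand
        "operand_selector": op_sel & low16,
    }
```

The reserved halves are masked off rather than checked, because the text gives no guarantee about their contents.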
The instruction and data pointers are provided for user-written error handlers. These registers are actually located in the 80386, but appear to be located in the 80387 because they are accessed by the ESC instructions FLDENV, FSTENV, FSAVE, and FRSTOR. (In the 8086/8087 and 80286/80287, these registers are located in the NPX.) Whenever the 80386 decodes a new ESC instruction, it saves the instruction address, the operand address (if present), and the opcode in these registers.

Figure 2.3. Protected Mode 80387 Instruction and Data Pointer Image in Memory, 32-Bit Format

Offset  Bits 31-16   Bits 15-0
0H      RESERVED     CONTROL WORD
4H      RESERVED     STATUS WORD
8H      RESERVED     TAG WORD
CH      IP OFFSET (bits 31-0)
10H     RESERVED     CS SELECTOR
14H     DATA OPERAND OFFSET (bits 31-0)
18H     RESERVED     OPERAND SELECTOR

Figure 2.4. Real Mode 80387 Instruction and Data Pointer Image in Memory, 32-Bit Format

Offset  Contents, highest bits first
0H      RESERVED; CONTROL WORD
4H      RESERVED; STATUS WORD
8H      RESERVED; TAG WORD
CH      RESERVED; INSTRUCTION POINTER 15..0
10H     0000; INSTRUCTION POINTER 31..16; 0; OPCODE 10..0
14H     RESERVED; OPERAND POINTER 15..0
18H     0000; OPERAND POINTER 31..16; 0000 00000000

Figure 2.5. Protected Mode 80387 Instruction and Data Pointer Image in Memory, 16-Bit Format

Offset  Bits 15-0
0H      CONTROL WORD
2H      STATUS WORD
4H      TAG WORD
6H      IP OFFSET
8H      CS SELECTOR
AH      OPERAND OFFSET
CH      OPERAND SELECTOR

Figure 2.6. Real Mode 80387 Instruction and Data Pointer Image in Memory, 16-Bit Format (real-address mode and virtual-8086 mode)

Offset  Contents, highest bits first
0H      CONTROL WORD
2H      STATUS WORD
4H      TAG WORD
6H      INSTRUCTION POINTER 15..0
8H      IP 19..16; 0; OPCODE 10..0
AH      OPERAND POINTER 15..0
CH      DP 19..16; 0; 000 0000 0000

2.3.5 CONTROL WORD

The NPX provides several processing options that are selected by loading a control word from memory into the control register. Figure 2.7 shows the format and encoding of fields in the control word.

Figure 2.7. 80387 Control Word

Bit(s)   Field
15-13    RESERVED
12       RESERVED (*)
11-10    RC (rounding control)
9-8      PC (precision control)
7-6      RESERVED
5        Precision exception mask
4        Underflow exception mask
3        Overflow exception mask
2        Zero divide exception mask
1        Denormalized operand exception mask
0        Invalid operation exception mask
(*) "0" after RESET or FINIT; changeable upon loading the control word (CW). Programs must ignore this bit.

Rounding Control:
00 = Round to nearest or even
01 = Round down (toward -∞)
10 = Round up (toward +∞)
11 = Chop (truncate toward zero)

Precision Control:
00 = 24 bits (single precision)
01 = (reserved)
10 = 53 bits (double precision)
11 = 64 bits (extended precision)

The low-order byte of this control word configures the 80387 error and exception masking. Bits 5-0 of the control word contain individual masks for each of the six exceptions that the 80387 recognizes.

The high-order byte of the control word configures the 80387 operating mode, including precision and rounding.

• Bit 12 no longer defines infinity control and is a reserved bit. Only affine closure is supported for infinity arithmetic. The bit is initialized to zero after RESET or FINIT and is changeable upon loading the CW. Programs must ignore this bit.

• The precision control (PC) bits (bits 9-8) can be used to set the 80387 internal operating precision of the significand at less than the default of 64 bits (extended precision). This can be useful in providing compatibility with early generation arithmetic processors of smaller precision. PC affects only the instructions ADD, SUB, DIV, MUL, and SQRT. For all other instructions, either the precision is determined by the opcode or extended precision is used.

• The rounding control (RC) field affects only those instructions that perform rounding at the end of the operation (and thus can generate a precision exception); namely, FST, FSTP, FIST, all arithmetic instructions (except FPREM, FPREM1, FXTRACT, FABS, and FCHS), and all transcendental instructions.
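The status-word layout of Figure 2.2 and the control-word layout of Figure 2.7 lend themselves to mechanical decoding. The Python sketch below is one way to do it, not a definitive implementation: the function names, dictionary keys, and exception labels are hypothetical, and the bit positions are those given in the two figures and in Table 2.4.

```python
# Exception bits 0-5 share the same order in the status and control words.
EXCEPTIONS = ("invalid", "denormal", "zero_divide",
              "overflow", "underflow", "precision")
ROUNDING = {0b00: "nearest", 0b01: "down", 0b10: "up", 0b11: "chop"}
PRECISION_BITS = {0b00: 24, 0b01: None, 0b10: 53, 0b11: 64}  # 01 is reserved
COMPARISON = {  # keyed by (C3, C2, C0), per Table 2.4
    (0, 0, 0): "TOP > operand",
    (0, 0, 1): "TOP < operand",
    (1, 0, 0): "TOP = operand",
    (1, 1, 1): "unordered",
}


def decode_status_word(sw: int) -> dict:
    """Split a 16-bit status word into the fields of Figure 2.2."""
    return {
        "busy": bool(sw & 0x8000),           # bit 15, mirrors ES
        "top": (sw >> 11) & 0b111,           # bits 13-11
        "c3": (sw >> 14) & 1,
        "c2": (sw >> 10) & 1,
        "c1": (sw >> 9) & 1,
        "c0": (sw >> 8) & 1,
        "error_summary": bool(sw & 0x0080),  # bit 7
        "stack_flag": bool(sw & 0x0040),     # bit 6
        "flags": {n for b, n in enumerate(EXCEPTIONS) if sw & (1 << b)},
    }


def comparison_result(sw: int):
    """Interpret C3, C2, C0 after a compare; None if not listed in Table 2.4."""
    s = decode_status_word(sw)
    return COMPARISON.get((s["c3"], s["c2"], s["c0"]))


def decode_control_word(cw: int) -> dict:
    """Split a 16-bit control word into the fields of Figure 2.7."""
    return {
        "rounding": ROUNDING[(cw >> 10) & 0b11],          # RC, bits 11-10
        "precision_bits": PRECISION_BITS[(cw >> 8) & 0b11],  # PC, bits 9-8
        "masked": {n for b, n in enumerate(EXCEPTIONS) if cw & (1 << b)},
    }
```

A stack overflow, for example, would show up as `"invalid"` in `flags` together with `stack_flag` set and `c1` equal to 1.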
• The rounding control (RC) bits (bits 11-10) provide for directed rounding and true chop, as well as the unbiased round-to-nearest-even mode specified in the IEEE standard.

2.4 Interrupt Description

Several interrupts of the 80386 are used to report exceptional conditions while executing numeric programs in either real or protected mode. Table 2.6 shows these interrupts and their causes.

Table 2.6. 80386 Interrupt Vectors Reserved for NPX

Interrupt 7
    An ESC instruction was encountered when EM or TS of 80386 control register zero (CR0) was set. EM = 1 indicates that software emulation of the instruction is required. When TS is set, either an ESC or WAIT instruction causes interrupt 7. This indicates that the current NPX context may not belong to the current task.

Interrupt 9
    An operand of a coprocessor instruction wrapped around an addressing limit (0FFFFH for small segments, 0FFFFFFFFH for big segments, zero for expand-down segments) and spanned inaccessible addresses (see note a). The failing numerics instruction is not restartable. The address of the failing numerics instruction and data operand may be lost; an FSTENV does not return reliable addresses. As with the 80286/80287, the segment overrun exception should be handled by executing an FNINIT instruction (i.e. an FINIT without a preceding WAIT). The return address on the stack does not necessarily point to the failing instruction nor to the following instruction. The interrupt can be avoided by never allowing numeric data to start within 108 bytes of the end of a segment.

Interrupt 13
    The first word or doubleword of a numeric operand is not entirely within the limit of its segment. The return address pushed onto the stack of the exception handler points at the ESC instruction that caused the exception, including any prefixes. The 80387 has not executed this instruction; the instruction pointer and data pointer registers refer to a previous, correctly executed instruction.

Interrupt 16
    The previous numerics instruction caused an unmasked exception. The address of the faulty instruction and the address of its operand are stored in the instruction pointer and data pointer registers. Only ESC and WAIT instructions can cause this interrupt. The 80386 return address pushed onto the stack of the exception handler points to a WAIT or ESC instruction (including prefixes). This instruction can be restarted after clearing the exception condition in the NPX. FNINIT, FNCLEX, FNSTSW, FNSTENV, and FNSAVE cannot cause this interrupt.

a. An operand may wrap around an addressing limit when the segment limit is near an addressing limit and the operand is near the largest valid address in the segment. Because of the wrap-around, the beginning and ending addresses of such an operand will be at opposite ends of the segment. There are two ways that such an operand may also span inaccessible addresses: 1) if the segment limit is not equal to the addressing limit (e.g. addressing limit is FFFFH and segment limit is FFFDH) the operand will span addresses that are not within the segment (e.g. an 8-byte operand that starts at valid offset FFFC will span addresses FFFC-FFFF and 0000-0003; however, addresses FFFE and FFFF are not valid, because they exceed the limit); 2) if the operand begins and ends in present and accessible pages but intermediate bytes of the operand fall in a not-present page or a page to which the procedure does not have access rights.

2.5 Exception Handling

The 80387 detects six different exception conditions that can occur during instruction execution. Table 2.7 lists the exception conditions in order of precedence, showing for each the cause and the default action taken by the 80387 if the exception is masked by its corresponding mask bit in the control word.

Any exception that is not masked by the control word sets the corresponding exception flag of the status word, sets the ES bit of the status word, and asserts the ERROR# signal. When the CPU attempts to execute another ESC instruction or WAIT, exception 16 occurs. The exception condition must be resolved via an interrupt service routine. The 80386/80387 saves the address of the floating-point instruction that caused the exception and the address of any memory operand required by that instruction.

2.6 Initialization

80387 initialization software must execute an FNINIT instruction (i.e. an FINIT without a preceding WAIT) to clear ERROR#. The FNINIT is not required for the 80287, though Intel documentation recommends its use (refer to the Numerics Supplement to the iAPX 286 Programmer's Reference Manual). After a hardware RESET, the ERROR# output is asserted to indicate that an 80387 is present. To accomplish this, the IE and ES bits of the status word are set, and the IM bit in the control word is reset. After FNINIT, the status word and the control word have the same values as in an 80287 after RESET.

2.7 8087 and 80287 Compatibility

This section summarizes the differences between the 80387 and the 80287. Any migration from the 8087 directly to the 80387 must also take into account the differences between the 8087 and the 80287 as listed in Appendix A.

Many changes have been designed into the 80387 to directly support the IEEE standard in hardware. These changes result in increased performance by eliminating the need for software that supports the standard.

Operands for FSCALE and FPATAN are no longer restricted in range (except for ±∞); F2XM1 and FPTAN accept a wider range of operands.

The results of transcendental operations may be slightly different from those computed by the 80287.

In the case of FPTAN, the 80387 supplies a true tangent result in ST(1), and (always) a floating-point 1 in ST.

Rounding control is in effect for FLD constant.

Software cannot change entries of the tag word to values (other than empty) that do not reflect the actual register contents.
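Because the tag word is indexed by physical register while instructions address registers relative to TOP, inspecting a saved tag word usually involves a small translation step. The sketch below assumes the tag encoding of Figure 2.1 and that ST(i) refers to physical register (TOP + i) modulo 8, which follows from the push and pop behaviour described in Section 2.3.1; the function name and the short tag labels are hypothetical.

```python
TAG_MEANING = {
    0b00: "valid",
    0b01: "zero",
    0b10: "special",  # QNaN, SNaN, infinity, denormal, unsupported formats
    0b11: "empty",
}


def tags_by_stack_position(tag_word: int, top: int) -> list:
    """Return (stack name, physical register, tag meaning) for ST(0)..ST(7)."""
    result = []
    for i in range(8):
        physical = (top + i) % 8            # ST(i) is relative to TOP
        tag = (tag_word >> (2 * physical)) & 0b11
        result.append((f"ST({i})", physical, TAG_MEANING[tag]))
    return result
```

For instance, with TOP = 7 and a tag word of 3FFFH, only ST(0) (physical register 7) is reported as valid and the remaining seven entries are empty.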
2.7.1 GENERAL DIFFERENCES The 80387 supports only affine closure for infinity arithmetic, not projective closure. Bit 12 of the Control Word (CW) no longer defines infinity control. It is a reserved bit; but it is initialized to zero after RESET or FINIT and is changeable upon loading the CWo Programs must ignore this bit. After reset, FINIT, and incomplete FPREM, the 80387 resets to zero the condition code bits C3-CO of the status word. In conformance with the IEEE standard, the 80387 does not support the special data formats: pseudozero, pseudo-NaN, pseudoinfinity, and unnormal. Table 2.7. Exceptions Exception Default Action (if exception is masked) Cause Invalid Operation Operation on a signaling NaN, unsupported format, indeterminate form (0' 00, 0/0, (+ 00) + (- 00), etc.), or stack overflow/underflow (SF is also set). Result is a quiet NaN, integer indefinite, or BCD indefinite Denormalized Operand At least one of the operands is denormalized, i.e. it has the smallest exponent but a nonzero significand. Normal processing continues Zero Divisor The divisor is zero while the dividend is a noninfinite, nonzero number. Result is 00 Overflow The result is too large in magnitude to fit in the specified format. Result is largest finite value or 00 Underflow The true result is nonzero but too small to be represented in the specified format, and, if underflow exception is masked, denormalization causes loss of accuracy. Result is denormalized or zero Inexact Result (Precision) The true result is not exactly representable in the specified format (e.g. 1/3); the result is rounded according to the rounding mode. Normal processing continues 15 inter 80387 signal is at a low Voltage. When no # is present after the signal name, the signal is asserted when at the high voltage level. 2.7.2 EXCEPTIONS When the overflow or underflow exception is masked, one difference from the 80287 is in rounding when overflow or underflow occurs. 
The 80387 produces results that are consistent with the rounding mode. The other difference is that the 80387 sets its underflow flag only if there is also a loss of accuracy during denormalization. 3.1 Signal Description In the following signal descriptions, the 80387 pins are grouped by function as follows: 1. Execution control-386CLK2, 387ClK2, CKM, RESETIN 2. NPX handshake-PEREQ, BUSY#, ERROR# A number of differences exist due to changes in the IEEE standard and to functional improvements to the architecture of the 80387: 3. Bus interface pins-031-00, W/R#, AOS#, REAOY#, REAOYO# 1. Fewer invalid-operation exceptions due to denormal operands, because the instructions FSQRT, FOIV, FPREM and conversions to BCO or to integer normalize denormal operands before proceeding. 2. The FSQRT, FBSTP, and FPREM instructions may cause underflow, because they support denormal operands. 4. Chip/Port CMOO# Select-STEN, NPS1 #, NPS2, 5. Power supplies-Vee, Vss Table 3.1 lists every pin by its identifier, gives a brief description of its function, and lists some of its characteristics. All output signals are tristate; they leave floating state only when STEN is active. The output buffers of the bidirectional data pins 031-00 are also tristate; they leave floating state only in read cycles when the 80387 is selected (i.e. when STEN, NPS1 #, and NPS2 are all active). 3. The denormal exception can occur during the transcendental instructions and the FXTRACT instruction. 4. The denormal exception no longer takes prece- dence over all other exceptions. 5. When the operand is zero, the FXTRACT instruction reports a zero-divide exception and leaves - 00 in ST(1). Figure 3.1 and Table 3.2 together show the location of every pin in the pin grid array. 6. The status word has a new bit (SF) that signals when invalid-operation exceptions are due to stack underflow or overflow. 3.1.1 80386 CLOCK 2 (386CLK2) This input uses the 80386 CLK2 signal to time the bus control logic. 
Several other 80387 signals are referenced to the rising edge of this signal. When CKM = 1 (synchronous mode) this pin also clocks the data interface and control unit and the floatingpoint unit of the 80387. This pin requires MOS-Ievel input. The Signal on this pin is divided by two to produce the internal clock signal ClK. 7. FLO extended precision no longer reports den ormal exceptions, because the instruction is not numeric. 8. FLO single/double precision when the operand is denormal converts the number to extended precision and signals the denormalized operand exception. When loading a signaling NaN, FLO single/double precision signals an invalid-operation exception. 3.1.280387 CLOCK 2 (387CLK2) 9. The 80387 only generates quiet NaNs (as on the 80287); however, the 80387 distinguishes between quiet NaNs and signaling NaNs. Signaling NaNs trigger exceptions when they are used as operands; quiet NaNs do not (except for FCOM, FIST, and FBSTP which also raise IE for quiet NaNs). When CKM = 0 (asynchronous mode) this pin provides the clock for the data interface and control unit and the floating-point unit of the 80387. In this case, the ratio of the frequency of 387CLK2 to the frequency of 386CLK2 must lie within the range 10:16 to 16:10. When CKM = 1 (synchronous mode) this pin is ignored; 386ClK2 is used instead for the data interface and control unit and the floating-point unit. This pin requires TTL-level input. 3.0 HARDWARE INTERFACE In the following description of hardware interface, the # symbol at the end of a signal name indicates that the active or asserted state occurs when the 16 80387 Table 3.1. 
80387 Pin Summary Pin Name Active State Function 386CLK2 387CLK2 CKM RESETIN 80386 CLocK 2 80387 CLocK 2 80387 CLocKing Mode System reset PEREQ BUSY# ERROR# Processor Extension REQuest Busy status Error status 031-00 W/R# AOS# REAOY# REAOYO# Data pins Write/Read bus cycle ADdress Strobe Bus ready input Ready output STEN NPS1# NPS2 CMOO# STatus ENable NPX select # 1 NPX select #2 CoMmanD Input! Output Referenced To High I I I I 386CLK2 High 0 386CLK2/STEN Low Low 0 0 386CLK2/STEN 387CLK2/STEN High HilLa Low Low Low I/O I I I 0 386CLK2 386CLK2 386CLK2 386CLK2 386CLK2/STEN High Low High Low I I I I 386CLK2 386CLK2 386CLK2 386CLK2 I I Vee Vss NOTE: STEN is referenced to only when getting the output pins into or out of tristate mode. Table 3.2. 80387 Pin Cross-Reference A2 A3 A4 A5 A6 A7 A8 A9 A10 B1 B2 B3 B4 B5 B6 B7 B8 B9 B10 B11 C1 C2 C10 - - - - - - - - 09 011 012 014 Vee 016 018 Vee 021 08 Vss 010 Vee 013 015 VSS 017 019 020 022 07 06 023 C11 01 02 010 011 E1 E2 E10 E11 F1 F2 F10 F11 G1 G2 G10 G11 H1 H2 H10 H11 J1 J2 - - - - - 17 VSS 05 04 024 025 Vee VSS 026 027 Vee VSS Vee VSS 03 02 028 029 01 00 030 031 Vss Vee J10 J11 K1 K2 K3 K5 K5 K6 K7 K8 K9 K10 K11 L2 L3 L4 L5 L6 L7 L8 L9 L10 - - - - - - - VSS CKM PEREQ BUSY# Tie High W/R# Vee NPS2 AOS# REAOY# No Connect 386CLK2 387CLK2 ERROR# REAOYO# STEN VSS NPS1# - Vee - CMOO# Tie High RESETIN - 80387 3.1.5 PROCESSOR EXTENSION REQUEST (PEREQ) ABCDEFGHJKL * + + + + + + + + + + + + + 3 + + + + 4 + + + 5 + + + + + + + 6 + + 7 + + + + 8 + + + 9 + + 10 11 When active, this pin signals to the 80386 CPU that the 80387 is ready for data transfer to/from its data FIFO. When all data is written to or read from the data FIFO, PEREa is deactivated. This signal always goes inactive before BUSY # goes inactive. This signal is referenced to 386CLK2. It should be connected to the 80386 PEREa input. Refer to Figure 3.7 for the timing relationships between this and the BUSY# and ERROR# pins. 
+ + + + + + 2 80387 + + + + + + + + + + + + + + + + + + + + + + + 3.1.6 BUSY STATUS (BUSV#) When active, this pin signals to the 80386 CPU that the 80387 is currently executing an instruction. This signal is referenced to 386CLK2. It should be connected to the 80386 BUSY # pin. Refer to Figure 3.7 for the timing relationships between this and the PEREa and ERROR# pins. 231920-5 PIN SIDE VIEW *Pin 1 Figure 3.1. 80387 Pin Configuration 3.1.7 ERROR STATUS (ERROR#) This pin reflects the ES bits of the status register. When active, it indicates that an unmasked exception has occurred (except that, immediately after a reset, it indicates to the 80386 that an 80387 is present in the system). This signal can be changed to inactive state only by the following instructions (without a preceding WAIT): FNINIT, FNCLEX, FNSTENV, and FNSAVE. This signal is referenced to 387CLK2. It should be connected to the 80386 ERROR# pin. Refer to Figure 3.7 for the timing relationships between this and the PEREa and BUSY # pins. 3.1.380387 CLOCKING MODE (CKM) This pin is a strapping option. When it is strapped to Vee, the 80387 operates in synchronous mode; when strapped to Vss, the 80387 operates in asynchronous mode. These modes relate to clocking of the data interface and control unit and the floatingpoint unit only; the bus control logic always operates synchronously with respect to the 80386. 3.1.4 SYSTEM RESET (RESETIN) A LOW to HIGH transition on this pin causes the 80387 to terminate its present activity and to enter a dormant state. RESETIN must remain HIGH for at least 40 387CLK2 periods. The HIGH to LOW transitions of RESETIN must be synchronous with 386CLK2, so that the phase of the internal clock of the bus control logic (which is the 386CLK2 divided by 2) is the same as the phase of the internal clock of the 80386. After RESETIN goes LOW, at least 50 387CLK2 periods must pass before the first NPX instruction is written into the 80387. 
This pin should be connected to the 80386 RESET pin. Table 3.3 shows the status of other pins after a reset. 3.1.8 DATA PINS (031-00) These bidirectional pins are used to transfer data and opcodes between the 80386 and 80387. They are normally connected directly to the corresponding 80386 data pins. HIGH state indicates a value of one. DO is the least significant data bit. Timings are referenced to 386CLK2. 3.1.9 WRITE/READ BUS CYCLE (W/R#) This signal indicates to the 80387 whether the 80386 bus cycle in progress is a read or a write cycle. This pin should be connected directly to the 80386 W/R# pin. HIGH indicates a write cycle; LOW, a read cycle. This input is ignored if any of the signals STEN, NPS1 #, or NPS2 is inactive. Setup and hold times are referenced to 386CLK2. Table 3.3. Output Pin Status during Reset Pin Value Pin Name HIGH REAOYO#, BUSY# LOW PEREa, ERROR# Tri-State OFF 031-00 18 intJ 80387 3.1.10 ADDRESS STROBE (ADS#) 3.1.15 NPX SELECT #2 (NPS2) This input, in conjunction with the READY # input indicates when the 80387 bus·control logic may sample W/R# and the chip-select signals. Setup and hold times are referenced to 386ClK2. This pin should be connected to the 80386 ADS# pin. When active (along with STEN and NPS1 #) in the first period of an 80386 bus cycle, this signal indicates that the purpose of the bus cycle is to communicate with the 80387. This pin should be connected directly to the 80386 A31 pin, so that the 80387 is selected only when the 80386 uses one of the 1/0 addresses reserved for the 80387 (800000F8 or 800000FC). Setup and hold times are referenced to 386ClK2. 3.1.11 BUS READY INPUT (READY#) This input indicates to the 80387 when an 80386 bus cycle is to be terminated. It is used by the buscontrol logic to trace bus activities. Bus cycles can be extended indefinitely until terminated by READY #. This input should be connected to the same signal that drives the 80386 READ# input. Setup and hold times are referenced to 386ClK2. 
3.1.16 COMMAND (CMDO#) During a write cycle, this signal indicates whether an opcode (CMDO# active) or data (CMDO# inactive) is being sent to the 80387. During a read cycle, it indicates whether the control or status register (CMDO# active) or a data register (CMDO# inactive) is being read. CMDO# should be connected directly to the A2 output of the 80386. Setup and hold times are referenced to 386ClK2. 3.1.12 READY OUTPUT (READYO#) This pin is activated at such a time that write cycles are terminated after two clocks and read cycles after three clocks. I n configurations where no extra wait states are required, it can be used to directly drive the 80386 READY # input. Refer to section 3.4 "Bus Operation" for details. This pin is activated only during bus cycles that select the 80387. This signal is referenced to 386ClK2. 3.2 Processor Architecture As shown by the block diagram on the front page, the NPX is internally divided into three sections: the bus control logic (BCl), the data interface and control unit, and the floating point unit (FPU). The FPU (with the support of the control unit which contains the sequencer and other support units) executes all numerics instructions. The data interface and control unit is responsible for the data flow to and from the FPU and the control registers, for receiving the instructions, decoding them, and sequencing the microinstructions, and for handling some of the administrative instructions. The BCl is responsible for 80386 bus tracking and interface. The BCl is the only unit in the 80387 that must run synchronously with the 80386; the rest of the 80387 can run asynchronously with respect to the 80386. 3.1.13 STATUS ENABLE (STEN) This pin serves as a chip select for the 80387. When inactive, this pin forces BUSY #, PEREQ, ERROR #, and READYO# outputs into floating state. D31-DO are normally floating and leave floating state only if STEN is active and additional conditions are met. 
STEN also causes the chip to recognize its other chip-select inputs. STEN makes it easier to do onboard testing (using the overdrive method) of other chips in systems containing the 80387. STEN should be pulled up with a resistor so that it can be pulled down when testing. In boards that do not use onboard testing, STEN should be connected to VCC. Setup and hold times are relative to 386CLK2. Note that STEN must maintain the same setup and hold times as NPS1#, NPS2, and CMD0# (i.e. if STEN changes state during an 80387 bus cycle, it should change state during the same CLK period as the NPS1#, NPS2, and CMD0# signals).

3.2.1 BUS CONTROL LOGIC

The BCL communicates solely with the CPU using I/O bus cycles. The BCL appears to the CPU as a special peripheral device. It is special in two respects: the CPU initiates I/O automatically when it encounters ESC instructions, and the CPU uses reserved I/O addresses to communicate with the BCL. The BCL does not communicate directly with memory. The CPU performs all memory access, transferring input operands from memory to the 80387 and transferring outputs from the 80387 to memory.

3.1.14 NPX Select #1 (NPS1#)

When active (along with STEN and NPS2) in the first period of an 80386 bus cycle, this signal indicates that the purpose of the bus cycle is to communicate with the 80387. This pin should be connected directly to the 80386 M/IO# pin, so that the 80387 is selected only when the 80386 performs I/O cycles. Setup and hold times are referenced to 386CLK2.

3.2.2 DATA INTERFACE AND CONTROL UNIT

The data interface and control unit latches the data and, subject to BCL control, directs the data to the FIFO or the instruction decoder. The instruction decoder decodes the ESC instructions sent to it by the CPU and generates controls that direct the data flow in the FIFO. It also triggers the microinstruction sequencer that controls execution of each instruction. If the ESC instruction is FINIT, FCLEX, FSTSW, FSTSW AX, or FSTCW, the control unit executes it independently of the FPU and the sequencer. The data interface and control unit is the unit that generates the BUSY#, PEREQ, and ERROR# signals that synchronize 80387 activities with the 80386. It also supports the FPU in all operations that the FPU cannot perform alone (e.g. exception handling, transcendental operations, etc.).

3.2.3 FLOATING POINT UNIT

The FPU executes all instructions that involve the register stack, including arithmetic, logical, transcendental, constant, and data transfer instructions. The data path in the FPU is 84 bits wide (68 significant bits, 15 exponent bits, and a sign bit), which allows internal operand transfers to be performed at very high speeds.

3.3 System Configuration

As an extension to the 80386, the 80387 can be connected to the CPU as shown by Figure 3.2. A dedicated communication protocol makes possible high-speed transfer of opcodes and operands between the 80386 and 80387. The 80387 is designed so that no additional components are required for interface with the 80386. The 80387 shares the 32-bit wide local bus of the 80386, and most control pins of the 80387 are connected directly to pins of the 80386.

Figure 3.2. 80386/80387 System Configuration (block diagram: the 80386 and 80387 share D31-D0; 80386 A2, M/IO#, and A31 drive 80387 CMD0#, NPS1#, and NPS2; W/R#, ADS#, BUSY#, ERROR#, and PEREQ are connected pin to pin; a 32 MHz clock generator supplies CLK2; a separate 80387 clock generator and a wait state generator are optional)

Table 3.4. Bus Cycles Definition

STEN  NPS1#  NPS2  CMD0#  W/R#  Bus Cycle Type
 0     x      x     x      x    80387 not selected and all outputs in floating state
 1     1      x     x      x    80387 not selected
 1     x      0     x      x    80387 not selected
 1     0      1     0      0    CW or SW read from 80387
 1     0      1     0      1    Opcode write to 80387
 1     0      1     1      0    Data read from 80387
 1     0      1     1      1    Data write to 80387

The NPX uses the PEREQ pin of the 80386 CPU to signal that the NPX is ready for data transfer to or from its data FIFO. The NPX does not directly access memory; rather, the 80386 provides memory access services for the NPX. Thus, memory access on behalf of the NPX always obeys the rules applicable to the mode of the 80386, whether the 80386 is in real-address mode or protected mode.

3.3.1 BUS CYCLE TRACKING

The ADS# and READY# signals allow the 80387 to track the beginning and end of 80386 bus cycles, respectively. When ADS# is asserted at the same time as the 80387 chip-select inputs, the bus cycle is intended for the 80387. To signal the end of a bus cycle for the 80387, READY# may be asserted directly or indirectly by the 80387 or by other bus-control logic. Refer to Table 3.4 for the definition of the types of 80387 bus cycles.

Once the 80386 initiates an 80387 instruction that has operands, the 80386 waits for PEREQ signals that indicate when the 80387 is ready for operand transfer. Once all operands have been transferred (or if the instruction has no operands), the 80386 continues program execution while the 80387 executes the ESC instruction.

3.3.2 80387 ADDRESSING

The NPS1#, NPS2, and STEN signals allow the NPX to identify which bus cycles are intended for the NPX. The NPX responds only to I/O cycles when bit 31 of the I/O address is set. In other words, the NPX acts as an I/O device in a reserved I/O address space.

In 8086/8087 systems, WAIT instructions may be required to achieve synchronization of both commands and operands.
In 80286/80287 and 80386/80387 systems, WAIT instructions are required only for operand synchronization; namely, after NPX stores to memory (except FSTSW and FSTCW) or loads from memory. Used this way, WAIT ensures that the value has already been written or read by the NPX before the CPU reads or changes the value.

Because A31 is used to select the 80387 for data transfers, it is not possible for a program running on the 80386 to address the 80387 with an I/O instruction. Only ESC instructions cause the 80386 to communicate with the 80387. The 80386 BS16# input must be inactive during I/O cycles when A31 is active.

Once it has started to execute a numerics instruction and has transferred the operands from the 80386, the 80387 can process the instruction in parallel with and independent of the host CPU. When the NPX detects an exception, it asserts the ERROR# signal, which causes an 80386 interrupt.

3.3.3 FUNCTION SELECT

The CMD0# and W/R# signals identify the four kinds of bus cycle: control or status register read, data read, opcode write, data write.

3.3.4 CPU/NPX Synchronization

The pin pairs BUSY#, PEREQ, and ERROR# are used for various aspects of synchronization between the CPU and the NPX.

BUSY# is used to synchronize instruction transfer from the 80386 to the 80387. When the 80387 recognizes an ESC instruction, it asserts BUSY#. For most ESC instructions, the 80386 waits for the 80387 to deassert BUSY# before sending the new opcode.

3.3.5 SYNCHRONOUS OR ASYNCHRONOUS MODES

The internal logic of the 80387 (the FPU) can either operate directly from the CPU clock (synchronous mode) or from a separate clock (asynchronous mode). The two configurations are distinguished by the CKM pin. In either case, the bus control logic (BCL) of the 80387 is synchronized with the CPU clock. Use of asynchronous mode allows the 80386 and the FPU section of the 80387 to run at different speeds. In this case, the ratio of the frequency of 387CLK2 to the frequency of 386CLK2 must lie within the range 10:16 to 16:10. Use of synchronous mode eliminates one clock generator from the board design.

3.3.6 AUTOMATIC BUS CYCLE TERMINATION

In configurations where no extra wait states are required, READYO# can be used to drive the 80386 READY# input. If this pin is used, it should be connected to the logic that ORs all READY outputs from peripherals on the 80386 bus. READYO# is asserted by the 80387 only during I/O cycles that select the 80387. Refer to section 3.4 "Bus Operation" for details.

3.4 Bus Operation

Bus operation is described in terms of an abstract state machine. Figure 3.3 illustrates the states and state transitions for 80387 bus cycles:

• TI is the idle state. This is the state of the bus logic after RESET, the state to which bus logic returns after every nonpipelined bus cycle, and the state to which bus logic returns after a series of pipelined cycles.

• TRS is the READY#-sensitive state. Different types of bus cycle may require a minimum of one or two successive TRS states. The bus logic remains in TRS state until READY# is sensed, at which point the bus cycle terminates. Any number of wait states may be implemented by delaying READY#, thereby causing additional successive TRS states.

• TP is the first state for every pipelined bus cycle.

The READYO# output of the 80387 indicates when a bus cycle for the 80387 may be terminated if no extra wait states are required. For all write cycles (except those for the instructions FLDENV and FRSTOR), READYO# is always asserted in the first TRS state, regardless of the number of wait states. For all read cycles and for write cycles of FLDENV and FRSTOR, READYO# is always asserted in the second TRS state, regardless of the number of wait states. These rules apply to both pipelined and nonpipelined cycles.
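The bus states and the READYO# rules above can be illustrated with a small model. The following Python sketch is not part of the 80387 documentation: the names BusState, next_state, readyo_trs_state, min_clocks, and run are hypothetical, each signal is represented as a boolean that is True when the pin is asserted, every cycle is assumed to select the 80387, and the TRS-to-TP transition follows the pipelined-cycle description in section 3.4.2.

```python
from enum import Enum


class BusState(Enum):
    TI = "idle"
    TRS = "READY#-sensitive"
    TP = "first state of a pipelined cycle"


def next_state(state, ads, ready):
    """Return the bus state for the next CLK period.

    ads / ready are True when ADS# / READY# are asserted (assumption:
    active-true booleans stand in for the active-low pins).
    """
    if state is BusState.TI:
        # A nonpipelined cycle starts when ADS# is asserted from idle.
        return BusState.TRS if ads else BusState.TI
    if state is BusState.TP:
        # TP lasts one clock, then the logic returns to TRS.
        return BusState.TRS
    # state is TRS
    if not ready:
        return BusState.TRS          # wait state: stay READY#-sensitive
    # READY# ends the cycle; ADS# at the same time starts a pipelined cycle.
    return BusState.TP if ads else BusState.TI


def readyo_trs_state(is_write, is_fldenv_or_frstor=False):
    """Which TRS state (1 or 2) carries READYO#, per the rules in section 3.4."""
    if is_write and not is_fldenv_or_frstor:
        return 1
    return 2


def min_clocks(is_write, is_fldenv_or_frstor=False):
    """Shortest cycle when READYO# drives READY#: one TI or TP clock plus the TRS states."""
    return 1 + readyo_trs_state(is_write, is_fldenv_or_frstor)


def run(inputs, state=BusState.TI):
    """Apply a list of (ads, ready) samples and return the visited states."""
    trace = [state]
    for ads, ready in inputs:
        state = next_state(state, ads, ready)
        trace.append(state)
    return trace
```

For example, run([(True, False), (False, True)]) models a nonpipelined write with no wait states (TI, TRS, TI), and min_clocks reproduces the two-clock write and three-clock read stated for READYO#.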
Systems designers may use READYO# in one of three ways:

1. Leave it disconnected and use external logic to generate READY# signals. When choosing this option, 80387 requirements for wait states in read cycles and write cycles of FLDENV and FRSTOR must be obeyed.

2. Connect it (directly or through logic that ORs READY signals from other devices) to the READY# inputs of the 80386 and 80387.

3. Use it as one input to a wait-state generator.

With respect to the bus interface, the 80387 is fully synchronous with the 80386. Both operate at the same rate, because each generates its internal CLK signal by dividing 386CLK2 by two.

The 80386 initiates a new bus cycle by activating ADS#. The 80387 recognizes a bus cycle if, during the cycle in which ADS# is activated, STEN, NPS1#, and NPS2 are all activated. Proper operation is achieved if NPS1# is connected to the M/IO# output of the 80386, and NPS2 to the A31 output. The 80386's A31 output is guaranteed to be inactive in all bus cycles that do not address the 80387 (i.e. I/O cycles to other devices, interrupt acknowledge, and reserved types of bus cycles). System logic must not signal a 16-bit bus cycle via the 80386 BS16# input during I/O cycles when A31 is active.

During the CLK period in which ADS# is activated, the 80387 also examines the W/R# input signal to determine whether the cycle is a read or a write cycle and examines the CMD0# input to determine whether an opcode, operand, or control/status register transfer is to occur.

The 80387 supports both pipelined and nonpipelined bus cycles. A nonpipelined cycle is one for which the 80386 asserts ADS# when no other 80387 bus cycle is in progress. A pipelined bus cycle is one for which the 80386 asserts ADS# and provides valid next-address and control signals as early as the second CLK period after the ADS# assertion for the previous 80386 bus cycle. Pipelining increases the availability of the bus by at least one CLK period.
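The selection and function-select rules (STEN, NPS1#, NPS2, CMD0#, W/R#) amount to a small truth table, the one given as Table 3.4. The following Python sketch shows one way to express it; decode_bus_cycle is a hypothetical name, and the arguments are assumed to be the electrical pin levels (0 or 1) sampled in the CLK period in which ADS# is active, so a pin whose name ends in # is active at level 0.

```python
def decode_bus_cycle(sten, nps1_n, nps2, cmd0_n, wr_n):
    """Classify a bus cycle from sampled pin levels (0 or 1), following Table 3.4.

    nps1_n, cmd0_n, and wr_n are the levels on NPS1#, CMD0#, and W/R#.
    """
    if sten == 0:
        return "80387 not selected and all outputs in floating state"
    if nps1_n == 1 or nps2 == 0:
        # NPS1# must be low (I/O cycle) and NPS2 high (A31 set) to select the NPX.
        return "80387 not selected"
    cycle_types = {
        (0, 0): "CW or SW read from 80387",
        (0, 1): "Opcode write to 80387",
        (1, 0): "Data read from 80387",
        (1, 1): "Data write to 80387",
    }
    return cycle_types[(cmd0_n, wr_n)]
```

With NPS1# wired to M/IO#, NPS2 to A31, and CMD0# to A2 as recommended, decode_bus_cycle(1, 0, 1, 0, 1) classifies the cycle as an opcode write.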
The 80387 supports pipelined bus cycles in order to optimize address pipelining by the 80386 for memory cycles.

Figure 3.3. Bus State Diagram

The following sections illustrate different types of 80387 bus cycles.

Because different instructions have different amounts of overhead before, between, and after operand transfer cycles, it is not possible to represent in a few diagrams all of the combinations of successive operand transfer cycles. The following bus-cycle diagrams show memory cycles between 80387 operand-transfer cycles. Note however that, during the instructions FLDENV, FSTENV, FSAVE, and FRSTOR, some consecutive accesses to the NPX do not have intervening memory accesses. For the timing relationship between operand transfer cycles and opcode write or other overhead activities, see Figure 3.7.

3.4.1 NONPIPELINED BUS CYCLES

Figure 3.4 illustrates bus activity for consecutive nonpipelined bus cycles.

3.4.1.1 Write Cycle

At the second clock of the bus cycle, the 80387 enters the TRS (READY#-sensitive) state. During this state, the 80387 samples the READY# input and stays in this state as long as READY# is inactive.

In write cycles, the 80387 drives the READYO# signal for one CLK period beginning with the second CLK of the bus cycle; therefore, the fastest write cycle takes two CLK cycles (see cycle 2 of Figure 3.4). For the instructions FLDENV and FRSTOR, however, the 80387 forces a wait state by delaying the activation of READYO# to the second TRS cycle (not shown in Figure 3.4).

When READY# is asserted, the 80387 returns to the idle state, in which ADS# could be asserted again by the 80386 for the next cycle.

3.4.1.2 Read Cycle

At the second clock of the bus cycle, the 80387 enters the TRS state. See Figure 3.4. In this state, the 80387 samples the READY# input and stays in this state as long as READY# is inactive.

At the rising edge of CLK in the second clock period of the cycle, the 80387 starts to drive the D31-D0 outputs and continues to drive them as long as it stays in TRS state.

In read cycles that address the 80387, at least one wait state must be inserted to ensure that the 80386 latches the correct data. Since the 80387 starts driving the system data bus only at the rising edge of CLK in the second clock period of the bus cycle, not enough time is left for the data signals to propagate and be latched by the 80386 at the falling edge of the same clock period. The 80387 drives the READYO# signal for one CLK period in the third CLK of the bus cycle. Therefore, if the READYO# output is used to drive the 80386 READY# input, one wait state is inserted automatically.

Because one wait state is required for 80387 reads, the minimum is three CLK cycles per read, as cycle 3 of Figure 3.4 shows.

When READY# is asserted, the 80387 returns to the idle state, in which ADS# could be asserted again by the 80386 for the next cycle. The transition from TRS state to idle state causes the 80387 to put the tristate D31-D0 outputs into the floating state, allowing another device to drive the system data bus.

Figure 3.4. Nonpipelined Read and Write Cycles (timing diagram of 386CLK2, CLK, NPS2, NPS1#, M/IO#, W/R#, ADS#, READYO#, and D0-D31 over four cycles: cycle 1, nonpipelined memory read; cycle 2, nonpipelined NPX write; cycle 3, nonpipelined NPX read; cycle 4, nonpipelined memory write)

Cycles 1 & 2 represent part of the operand transfer cycle for instructions involving either 4-byte or 8-byte operand loads. Cycles 3 & 4 represent part of the operand transfer cycle for a store operation.
* Cycles 1 & 2 could repeat here, or TI states could occur for various non-operand transfer cycles and overhead.

3.4.2 PIPELINED BUS CYCLES

Because all the activities of the 80387 bus interface occur either during the TRS state or during the transitions to or from that state, the only difference between a pipelined and a nonpipelined cycle is the manner of changing from one state to another. The exact activities in each state are detailed in the previous section "Nonpipelined Bus Cycles".

When the 80386 asserts ADS# before the end of a bus cycle, both ADS# and READY# are active during a TRS state. This condition causes the 80387 to change to a different state named TP. The 80387 activities in the transition from a TRS state to a TP state are exactly the same as those in the transition from a TRS state to a TI state in nonpipelined cycles. TP state is metastable; therefore, one clock period later the 80387 returns to TRS state. In consecutive pipelined cycles, the 80387 bus logic uses only TRS and TP states.

Figure 3.5 shows the fastest transition into and out of the pipelined bus cycles. Cycle 1 in this figure represents a nonpipelined cycle. (Nonpipelined write cycles with only one TRS state (i.e. no wait states) are always followed by another nonpipelined cycle, because READY# is asserted before the earliest possible assertion of ADS# for the next cycle.)

Figure 3.6 shows the pipelined write and read cycles with one additional TRS state beyond the minimum required. Delaying the assertion of READY# requires external logic.

3.4.3 BUS CYCLES OF MIXED TYPE

When the 80387 bus logic is in the TRS state, it distinguishes between nonpipelined and pipelined cycles according to the behavior of ADS# and READY#. In a nonpipelined cycle, only READY# is activated, and the transition is from TRS to idle state. In a pipelined cycle, both READY# and ADS# are active, and the transition is first from TRS state to TP state then, after one clock period, back to TRS state.

3.4.4 BUSY# AND PEREQ TIMING RELATIONSHIP
Figure 3.7 shows the activation of BUSY# at the beginning of instruction execution and its deactivation after execution of the instruction is complete. PEREQ is activated in this interval. If ERROR# (not shown in the diagram) is ever asserted, it would occur at least six 386CLK2 periods after the deactivation of PEREQ and at least six 386CLK2 periods before the deactivation of BUSY#. Figure 3.7 also shows that STEN is activated at the beginning of a bus cycle.

Figure 3.5. Fastest Transitions to and from Pipelined Cycles (timing diagram of 386CLK2, CLK, NPS2, NPS1#, M/IO#, W/R#, ADS#, READYO#, READY#, and D0-D31 over four cycles: cycle 1, nonpipelined memory read; cycle 2, pipelined NPX write; cycle 3, pipelined memory read; cycle 4, nonpipelined NPX write)

Cycle 1 through cycle 4 represent the operand transfer cycle for an instruction involving a transfer of two 32-bit loads in total. The opcode write cycles and other overhead are not shown. Note that the next cycle will be a pipelined cycle if both READY# and ADS# are sampled active at the end of a TRS state of the current cycle.

Figure 3.6. Pipelined Cycles with Wait States (timing diagram: cycle 1, pipelined write; cycle 2, pipelined read)

NOTE:
1. Cycles between operand write to the NPX and storing result.

Figure 3.7. STEN, BUSY# and PEREQ Timing Relationship (timing diagram: opcode write followed by first operand write)

NOTES:
1. Instruction dependent.
2. PEREQ is an asynchronous input to the 80386; it may not be asserted (instruction dependent).
3. More operand transfers.
4. Memory read (operand) cycle is not shown.

4.0 MECHANICAL DATA

68 LEAD CERAMIC PIN GRID ARRAY PACKAGE, INTEL TYPE A

Family: Ceramic Pin Grid Array Package

          Millimeters                   Inches
Symbol    Min     Max     Notes         Min     Max     Notes
A         3.56    4.57                  0.140   0.180
A1        0.76    1.27    Solid Lid     0.030   0.050   Solid Lid
A1        0.41            EPROM Lid     0.016           EPROM Lid
A2        2.72    3.43    Solid Lid     0.107   0.135   Solid Lid
A2        3.43    4.32    EPROM Lid     0.135   0.170   EPROM Lid
A3        1.14    1.40                  0.045   0.055
B         0.43    0.51                  0.017   0.020
D         28.83   29.59                 1.135   1.165
D1        25.27   25.53                 0.995   1.005
e1        2.29    2.79                  0.090   0.110
L         2.29    3.30                  0.090   0.130
N         68                            68
S1        1.27    2.54                  0.050   0.100
ISSUE     IWS REV 7  3/26/86

Figure 4.1. Package Description

Consult the most recent 80387 data sheet for AC specifications.

Instruction OPA 1 MOD 11011 MF OPA MOD 3 11011 d P OPA 1 1 4 11011 0 0 1 1 1 1 5 11011 0 1 1 1 1 1 15-11 10 9 8 7 6 5 4 3 2 1 0 = Register stack element i 111 d = Destination 0-Destination is ST(0) 1-Destination is ST(i) = DISP OP I OP • • • = Eighth stack element

The instruction summaries that follow assume that the instruction has been prefetched, decoded, and is ready for execution; that bus cycles do not require wait states; that there are no local bus HOLD requests delaying processor access to the bus; and that no exceptions are detected during instruction execution. If the instruction has MOD and R/M fields that call for both base and index registers, add one clock.

11011 = DISP
SIB (Scale Index Base) byte and DISP (displacement) are optionally present in instructions that have MOD and R/M fields.
Their presence depends on the values of MOD and RIM, as for 80386 instructions. Pop 0-00 not pop stack 1-Pop stack after operation R XOR d R XOR d I I Programmer's Reference Manual) Memory Format 00-32-bit real 01-32-bit integer 10-64-bit real 11-16-bit integer = SIB MOD (Mode field) and RIM (Register/Memory specifier) have the same interpretation as the corresponding fields of 80386 instructions (refer to 80386 = ESC SIB RIM ST(i) OPB I RIM 000 = Stack top 001 = Second stack element OP = Instruction opcode, possible split into two fields OPA and OPB = OPB OPB ST(i) Instructions for the 80387 assume one of the five forms shown in the following table. In all cases, in· structions are at least two bytes long and begin with the bit pattern 11011 B, which identifies the ESCAPE class of instruction. Instructions that refer to memory operands specify addresses using the 80386 addressing modes. MF 1 I 11011 2 6.0 80387 EXTENSIONS TO THE 80386 INSTRUCTION SET P Optional Fields Second Byte First Byte O-Destination (op) Source 1-Source (op) Destination 33 inter b:\IIDW£OO©~ OOO[P©OO~b:\trO@oo 80387 80387 Extensions to the 80386 Instruction Set Instruction Optional Bytes 2-6 32-Blt Real 20 16-Bil Inleger DATA TRANSFER FLO ~ Load a Integer/real memory to ST(O) SIB/DISP Long integer memory to ST(O) SIB/DISP Extended real memory to ST(O) SIB/DISP 44 BCD memory to ST(O) SIB/DISP 266-275 ST(i) to ST(O) FST ~ ESC 001 11000ST(i) ESC 101 11010ST(i) 61-65 45 82-95 45 82-95 31 71-75 31 71-75 14 SIB/DISP ST(O) to integer/real memory 44 79-93 11 Store and Pop ~ SIB/DISP ST(O) to long integer memory SIB/DISP ST(O) to extended real SIB/DISP 53 ST(O) to BCD memory SIB/DISP 512-534 ST(O) to ST(i) 44 79-93 ST(O) to integer/real memory FXCH 25 56-67 Store ST(O) to ST(i) FSTP 45-52 80-97 ESC 101 11001 ST(i) 12 ESC 001 11001 ST(i) 18 Exchange ~ ST(i) and ST(O) COMPARISON FCOM ~ Compare Integer/real memory to ST(O) ST(i) to ST(O) FCOMP SIB/DISP ESC 000 11010ST(i) ESC 000 11011 ST(i) 26 ESCll0 
11011001 26 ESC 001 11100101 Integer/real memory to ST ST(i) to ST(O) ~ FXAM ~ ~ 24 SIBIDISP 26 56-63 Compare and pop twice ST(l) to ST(O) FTST 56-63 Compare and pop ~ FCOMPP 26 Test ST(O) Examine ST(O) CONSTANTS FLOZ ~ Load + 0.0 into ST(O) ESC 001 11101110 20 FLOI ~ Load + 1.0 into ST(O) ESC 001 11101000 24 ESC 001 11101011 40 ESC 001 11101001 40 FLOPI ~ FLOL2T Load pi into ST(O) ~ Load log2(10) into ST(O) Shaded areas indicate instructions not available in 8087/80287. NOTE: a. When loading single- or double-precision zero from memory, add 5 clocks. 34 inter 80387 80387 Extensions to the 80386 Instruction Set (Continued) Instruction Oplional 32-Bil Bytes 2-6 Real 16-Bil Inleger CONSTANTS (Continued) FLDL2E = Load log2(e) into ST(O) ESC 001 11101010 40 FLDLG2 = Load IOg10(2) into ST(O) ESC 001 11101100 41 FLDLN2 = Load log.(2) into ST(O) ESC 001 11101101 41 ARITHMETIC FADD = Add Integer/real memory with ST(O) SIB/DISP 24-32 FSUB 57-72 29-37 71-85 23-31 b STeil and ST(O) = Subtract SIB/DISP Integer/real memory with ST(O) 24-32 57-82 28-36 71-83c 26-34d STeil and ST(O) FMUL = Multiply Integer/real memory with ST(O) SIB/DISP 27-35 FDIV 61-82 32-57 76-87 29-57e STeil and ST(O) = Divide Integer/real memory with ST(O) SIB/DISP FSQRTi FSCALE = Square root = Scale ST(O) by ST(I) 89 120-127f ESC 001 11111010 122-129 ESC 001 11111101 67-86 ESC 001 11111100 66-80 70-76 FPREM = Partial remainder FRNDINT = Round ST(O) to integer FXTRACT oIST(O) 94 BSh STO) and ST(O) = Extract components ESC 001 11110100 FABS = Absolute value 01 ST(O) ESC 001 11100001 22 FCHS = Change sign of ST(O) ESC 001 11100000 24-25 Shaded areas indicate instructions not available in 8087/80287. NOTES: b. Add 3 clocks to the range when d = 1. c. Add 1 clock to each range when R = 1. d. Add 3 clocks to the range when d = O. e. typical = 52 (When d = 0, 46-54, typical f. Add 1 clock to the range when R = 1. g. 135-141 when R = 1. h. Add 3 clocks to the range when d = 1. i. ~O s ST(O) s + 00. = 49). 
35 136-1409 inter 80387 80387 Extensions to the 80386 Instruction Set (Continued) Instruction Optional Bytes 2-6 Clock Count Range TRANSCENDENTAL FeW:;;; P~~!l!$Thc:+L:eSpoil1: .( '.'1111,11t'1:;'·I· FPTANk ~ Partial tangent of ST(O) FPATAN ~ Partial arctangent I I I I ESC 001 ESC 001 11110010 11110011 iiSlNk'" SiriEi'ofS'F(oi ;; ;,.; ,. :.< ""l','; , . :1;.~C OO'f;.~.'I' ;'1 t#'j.t'~." . . ·.•. 1.2$ ..172l. I I ~.,.~~:~~~,~~~~~~~~~';:.f:·::eSp~l·;:t' i1j1;f'1~ft:' : F2XMl ~ 2ST(O) - 1 I ESC 001 I 1111 0000 I FYL2xm ~ ST(I) , IOg2(ST(0» I ESC 001 I 1111 0001 I FYL2XP1" ~ ST(I) 'log2(ST(0) + 1.0) I ESC 001 I 11111001 I 11. '.;.'" 191-497i ' .. 211-476 i 120-538 257-547 PROCESSOR CONTROL FINIT Initialize NPX ~ ~ FSTSW AX ~ FLDCW ESCOll 11100011 33 11100000 Store status word 13 Load control word SIB/DISP 19 FSTCW ~ Store control word SIB/DISP 15 FSTSW ~ Store status word SIB/DISP 15 FCLEX ~ 11100010 Clear exceptions 11 FSTENV ~ Store environment SIB/DISP FLDENV ~ Load environment SIB/DISP 71 SIB/DISP 375-376 SIB/DISP 308 FSAVE ~ Save state FRSTOR ~ Restore state FINCSTP ~ Increment stack pointer FDECSTP ~ Decrement stack pOinter FFREE ~ Free ST(i) FNOP ~ No operations 103-104 11110111 21 ESC 001 11110110 22 ESC 101 11000 ST(i) 18 ESC 001 11010000 12 Shaded areas indicate instructions not available in 8087/80287. NOTES: j. These timings hold for operands in the range Ixl needed to reduce the operand. k. 0 ,;: I ST(O) I < 263. I. -1.0 ,;: ST(O) ,;: 1.0. m.O ,;: ST(O) < "", - "" < ST(1) < + "". n. 0 ,;: IST(O)I < (2 - SQRT(2))/2, - 00 < ST(l) < 7T 14. For operands not in this range, up to 76 additional clocks may be < + 00. 36 inter 80387 6. Interrupt 7 will occur in the 80286 when executing ESC instructions with either TS (task switched) or EM (emulation) of the 80286 MSW set (TS = 1 or EM = 1). If TS is set, then a WAIT instruction will also cause interrupt 7. An exception handler should be included in 80286/80287 code to handle these situations. 
APPENDIX A
COMPATIBILITY BETWEEN THE 80287 AND THE 8087

The 80286/80287 operating in Real-Address mode will execute 8086/8087 programs without major modification. However, because of differences in the handling of numeric exceptions by the 80287 NPX and the 8087 NPX, exception-handling routines may need to be changed.

This appendix summarizes the differences between the 80287 NPX and the 8087 NPX, and provides details showing how 8086/8087 programs can be ported to the 80286/80287.

1. The NPX signals exceptions through a dedicated ERROR line to the 80286. The NPX error signal does not pass through an interrupt controller (the 8087 INT signal does). Therefore, any interrupt-controller-oriented instructions in numeric exception handlers for the 8086/8087 should be deleted.

2. The 8087 instructions FENI/FNENI and FDISI/FNDISI perform no useful function in the 80287. If the 80287 encounters one of these opcodes in its instruction stream, the instruction will effectively be ignored; none of the 80287 internal states will be updated. While 8086/8087 programs containing these instructions may be executed on the 80286/80287, it is unlikely that the exception-handling routines containing these instructions will be completely portable to the 80287.

3. Interrupt vector 16 must point to the numeric exception handling routine.

4. The ESC instruction address saved in the 80287 includes any leading prefixes before the ESC opcode. The corresponding address saved in the 8087 does not include leading prefixes.

5. In Protected-Address mode, the format of the 80287's saved instruction and address pointers is different from that of the 8087. The instruction opcode is not saved in Protected mode; exception handlers will have to retrieve the opcode from memory if needed.

7. Interrupt 9 will occur if the second or subsequent words of a floating-point operand fall outside a segment's size. Interrupt 13 will occur if the starting address of a numeric operand falls outside a segment's size. An exception handler should be included in 80286/80287 code to report these programming errors.

8. Except for the processor control instructions, all of the 80287 numeric instructions are automatically synchronized by the 80286 CPU: the 80286 automatically tests the BUSY line from the 80287 to ensure that the 80287 has completed its previous instruction before executing the next ESC instruction. No explicit WAIT instructions are required to assure this synchronization. For the 8087 used with 8086 and 8088 processors, explicit WAITs are required before each numeric instruction to ensure synchronization. Although 8086/8087 programs having explicit WAIT instructions will execute perfectly on the 80286/80287 without reassembly, these WAIT instructions are unnecessary.

9. Since the 80287 does not require WAIT instructions before each numeric instruction, the ASM286 assembler does not automatically generate these WAIT instructions. The ASM86 assembler, however, automatically precedes every ESC instruction with a WAIT instruction. Although numeric routines generated using the ASM86 assembler will generally execute correctly on the 80286/80287, reassembly using ASM286 may result in a more compact code image. The processor control instructions for the 80287 may be coded using either a WAIT or No-WAIT form of mnemonic. The WAIT forms of these instructions cause ASM286 to precede the ESC instruction with a CPU WAIT instruction, in the same manner as ASM86 does.

APPENDIX F
PC/AT*-COMPATIBLE 80387 CONNECTION

The PC/AT uses a nonstandard scheme to report 80287 exceptions to the 80286.
When replicating the PC/AT coprocessor interface in 80386-based systems, the PC/AT interface cannot be used in exactly the same way; however, this appendix outlines a similar interface that works on 80386/80387 systems and maintains compatibility with the nonstandard PC/AT scheme. Note that the interface outlined here does not represent a new interface standard; it needs to be incorporated in AT-compatible designs only because the 80286 and 80287 in the PC/AT are not connected according to the standards defined by Intel. The standard 80386/80387 connection recommended by Intel in the 80387 Data Sheet functions properly; the 80386 implementation has not been and will not be altered.

F.1 THE PC/AT INTERFACE

In the PC/AT, the ERROR# input to the 80286 is tied inactive (high) permanently. The ERROR# output of the 80287 is tied to an interrupt port (IRQ13). This interrupt replaces exception signaling via the 80286's ERROR# input.

To guarantee (in the case of an 80287 exception) that INTR 13 will be serviced prior to the execution of any further 80287 instructions, an edge-triggered flip-flop latches BUSY# using ERROR# as a clock. The output of this latch is ORed with the BUSY# output of the 80287 and drives the BUSY# input of the 80286. This PC/AT scheme effectively delays deactivation of BUSY# at the 80286 whenever an 80287 ERROR# is signaled. Since the 80286 BUSY# input remains active after an exception, the 80286 interrupt 13 handler is guaranteed to execute before any other 80287 instructions may begin. The interrupt 13 handler clears the BUSY# latch (via a write to a special I/O port), thus allowing execution of 80287 instructions to proceed. The interrupt 13 handler then branches to the NMI handler, where the user-defined numerics exception handler resides in PC-compatible systems.

The use of an interrupt guarantees that an exception from a coprocessor instruction will be detected.
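The BUSY# latch arrangement of F.1 can be modeled in a few lines. The following Python sketch is illustrative only: BusyLatch, sample, and clear are hypothetical names, signals are represented as booleans that are True when the pin is asserted (so the OR in the text becomes a logical or), and the I/O write that clears the latch is reduced to a method call.

```python
class BusyLatch:
    """Behavioral model of the PC/AT BUSY# latch (active-true booleans assumed)."""

    def __init__(self):
        self.latched = False      # flip-flop output
        self._prev_error = False  # last sampled ERROR# state, for edge detection

    def sample(self, npx_busy, npx_error):
        """Sample the NPX outputs and return the BUSY# state seen by the CPU."""
        # The flip-flop is clocked by the assertion edge of ERROR# and
        # captures the state of BUSY# at that moment.
        if npx_error and not self._prev_error:
            self.latched = npx_busy
        self._prev_error = npx_error
        # The latch output is ORed with the NPX BUSY# output.
        return npx_busy or self.latched

    def clear(self):
        """Stand-in for the exception handler's write to the special I/O port."""
        self.latched = False
```

In this model the CPU keeps seeing BUSY# asserted after the NPX itself has released it, until the handler calls clear(), which is the behavior the text relies on to hold off further coprocessor instructions.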
Latching BUSY# guarantees that any coprocessor instruction (except FINIT, FSETPM, and FCLEX) following the instruction that raised the exception will not be executed before the NMI handler is executed.

This PC/AT scheme approximates the exception reporting scheme between the 8087 and 8088 in the original PC.

F.2 HOW TO ACHIEVE THE SAME EFFECT IN AN 80386 SYSTEM

The 80386 can use a PC/AT-compatible interface to communicate with an 80387 provided that, when an NPX exception occurs, BUSY# active time is extended and PEREQ is reactivated only after 80387 BUSY# has gone inactive. The 80387 is left active (tying STEN high) at all times. Also, the 80386 and 80387 must be reset by the same RESET signal.

The reactivation of PEREQ for the 80386 is needed for store instructions (for example, FST mem) because the 80387 drops PEREQ once it signals an exception. While the 80386 has not yet recognized the occurrence of the exception, it still expects the data transfers to complete via PEREQ reactivation. It is permissible for the 80386 to receive undefined data during such I/O read cycles. Disabling the 80387 is not necessary, because the dummy data-transfer cycles directed to the 80387 when PEREQ is externally reactivated for the 80386 will not disturb the operation of the 80387. The interrupt 13 handler should remove the extension of BUSY# and reactivation of PEREQ via a write to PC/AT-compatible hardware at I/O port F0H.

GLOSSARY OF 80387 AND FLOATING-POINT TERMINOLOGY

This glossary defines many terms that have precise technical meanings as specified in the IEEE 754 Standard or as specified in this manual. Where these terms are used, they have been italicized to emphasize the precision of their meanings. In reading these definitions, you may therefore interpret any italicized terms or phrases as cross-references.

Base: (1) a term used in logarithms and exponentials.
In both contexts, it is a number that is being raised to a power. The two equations (y = log base b of x) and (bY = x) are the same. Base: (2) a number that defines the representation being used for a string of digits. Base 2 is the binary representation; base lOis the decimal representation; base 16 is the hexadecimal representation. In each case, the base is the factor of increased significance for each succeeding digit (working up from the bottom). Bias: a constant that is added to the true exponent of a real number to obtain the exponent field of that number's floating-point representation in the 80387. To obtain the true exponent, you must subtract the bias from the given exponent. For example, the single real format has a bias of 127 whenever the given exponent is nonzero. If the 8-bit exponent field contains 10000011, which is 131, the true exponent is 131-127, or +4. Biased Exponent: the exponent as it appears in a floating-point representation of a number. The biased exponent is interpreted as an unsigned, positive number. In the above example, 131 is the biased exponent. Binary Coded Decimal: a method of storing numbers that retains a base 10 representation. Each decimal digit occupies 4 full bits (one hexadecimal digit). The hexadecimal values A through F (1010 through 1111) are not used. The 80387 supports a packed decimal format that consists of 9 bytes of binary coded decimal (18 decimal digits) and one sign byte. Binary Point: an entity just like a decimal point, except that it exists in binary numbers. Each binary digit to the right of the binary point is multiplied by an increasing negative power of two. C3-CO: the four "condition code" bits of the 80387 status word. These bits are set to certain values by the compare, test, examine, and remainder functions of the 80387. Characteristic: a term used for some non-Intel computers, meaning the exponent field of a floating-point number. 
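The arithmetic in the Bias and Biased Exponent entries can be checked with a short sketch. The Python below is an illustration only and is not 80387 code; the helper names `single_fields` and `true_exponent` are hypothetical, and the sketch assumes the single format's field widths (1 sign bit, 8 exponent bits, 23 stored significand bits). It covers only the ordinary case the Bias entry describes, in which the biased exponent is neither zero nor all ones.

```python
import struct

SINGLE_BIAS = 127  # bias of the single format, as stated in the Bias entry


def single_fields(value):
    """Return (sign, biased_exponent, fraction) of value stored in single format."""
    # Pack as a 32-bit single, then reread the same 4 bytes as an unsigned integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF  # the 8-bit exponent field
    fraction = bits & 0x7FFFFF             # the 23 stored significand bits
    return sign, biased_exponent, fraction


def true_exponent(biased_exponent):
    """Subtract the bias (hypothetical helper; ordinary normal numbers only)."""
    if biased_exponent in (0, 0xFF):
        raise ValueError("zero and all-ones exponent fields are special cases")
    return biased_exponent - SINGLE_BIAS


sign, biased, fraction = single_fields(16.0)  # 16.0 is 1.0 x 2^4
print(format(biased, "08b"), biased, true_exponent(biased))  # 10000011 131 4
```

The value 16.0 is chosen so that the exponent field reproduces the glossary's own example: field 10000011, which is 131, and true exponent +4.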
Chop: to set one or more low-order bits of a real number to zero, yielding the nearest representable number in the direction of zero.

Condition Code: the four bits of the 80387 status word that indicate the results of the compare, test, examine, and remainder functions of the 80387.

Control Word: a 16-bit 80387 register that the user can set, to determine the modes of computation the 80387 will use and the exception interrupts that will be enabled.

Denormal: a special form of floating-point number. On the 80387, a denormal is defined as a number that has a biased exponent of zero. By providing a significand with leading zeros, the range of possible negative exponents can be extended by the number of bits in the significand. Each leading zero is a bit of lost accuracy, so the extended exponent range is obtained by reducing significance.

Double Extended: the Standard's term for the 80387's extended format, with more exponent and significand bits than the double format and an explicit integer bit in the significand.

Double Format: a floating-point format supported by the 80387 that consists of a sign, an 11-bit biased exponent, an implicit integer bit, and a 52-bit significand, for a total of 64 explicit bits.

Environment: the 14 or 28 (depending on addressing mode) bytes of 80387 registers affected by the FSTENV and FLDENV instructions. It encompasses the entire state of the 80387, except for the 8 registers of the 80387 stack. Included are the control word, status word, tag word, and the instruction, opcode, and operand information provided by interrupts.

Exception: any of the six conditions (invalid operand, denormal, numeric overflow, numeric underflow, zero-divide, and precision) detected by the 80387 that may be signaled by status flags or by traps.

Exception Pointers: the data maintained by the 80386 to help exception handlers identify the cause of an exception. This data consists of a pointer to the most recently executed ESC instruction and a pointer to the memory operand of this instruction, if it had a memory operand. An exception handler can use the FSTENV and FSAVE instructions to access these pointers.

Exponent: (1) any number that indicates the power to which another number is raised.

Exponent: (2) the field of a floating-point number that indicates the magnitude of the number. This would fall under the above more general definition (1), except that a bias sometimes needs to be subtracted to obtain the correct power.

Extended Format: the 80387's implementation of the Standard's double extended format. Extended format is the main floating-point format used by the 80387. It consists of a sign, a 15-bit biased exponent, and a significand with an explicit integer bit and 63 fractional-part bits.

Floating-Point: of or pertaining to a number that is expressed as a base, a sign, a significand, and a signed exponent. The value of the number is the signed product of its significand and the base raised to the power of the exponent. Floating-point representations are more versatile than integer representations in two ways. First, they include fractions. Second, their exponent parts allow a much wider range of magnitude than possible with fixed-length integer representations.

Gradual Underflow: a method of handling the underflow error condition that minimizes the loss of accuracy in the result. If there is a denormal number that represents the correct result, that denormal is returned. Thus, digits are lost only to the extent of denormalization. Most computers return zero when underflow occurs, losing all significant digits.

Implicit Integer Bit: a part of the significand in the single real and double real formats that is not explicitly given. In these formats, the entire given significand is considered to be to the right of the binary point. A single implicit integer bit to the left of the binary point is always one, except in one case. When the exponent is the minimum (biased exponent is zero), the implicit integer bit is zero.

Indefinite: a special value that is returned by functions when the inputs are such that no other sensible answer is possible. For each floating-point format there exists one quiet NaN that is designated as the indefinite value. For binary integer formats, the negative number furthest from zero is often considered the indefinite value. For the 80387 packed decimal format, the indefinite value contains all 1's in the sign byte and the uppermost digits byte.

Inexact: the Standard's term for the 80387's precision exception.

Infinity: a value that has greater magnitude than any integer or any real number. It is often useful to consider infinity as another number, subject to special rules of arithmetic. All three Intel floating-point formats provide representations for +∞ and -∞.

Integer: a number (positive, negative, or zero) that is finite and has no fractional part. Integer can also mean the computer representation for such a number: a sequence of data bytes, interpreted in a standard way. It is perfectly reasonable for integers to be represented in a floating-point format; this is what the 80387 does whenever an integer is pushed onto the 80387 stack.

Integer Bit: a part of the significand in floating-point formats. In these formats, the integer bit is the only part of the significand considered to be to the left of the binary point. The integer bit is always one, except in one case: when the exponent is the minimum (biased exponent is zero), the integer bit is zero. In the extended format the integer bit is explicit; in the single format and double format the integer bit is implicit; i.e., it is not actually stored in memory.

Invalid Operation: the exception condition for the 80387 that covers all cases not covered by other exceptions. Included are 80387 stack overflow and underflow, NaN inputs, illegal infinite inputs, out-of-range inputs, and inputs in unsupported formats.

Long Integer: an integer format supported by the 80387 that consists of a 64-bit two's complement quantity.

Long Real: an older term for the 80387's 64-bit double format.

Mantissa: a term used with some non-Intel computers for the significand of a floating-point number.

Masked: a term that applies to each of the six 80387 exceptions I, D, Z, O, U, P. An exception is masked if a corresponding bit in the 80387 control word is set to one. If an exception is masked, the 80387 will not generate an interrupt when the exception condition occurs; it will instead provide its own exception recovery.

Mode: one of the control word fields "rounding control" and "precision control" which programs can set, sense, save, and restore to control the execution of subsequent arithmetic operations.

NaN: an abbreviation for "Not a Number"; a floating-point quantity that does not represent any numeric or infinite quantity. NaNs should be returned by functions that encounter serious errors. If created during a sequence of calculations, they are transmitted to the final answer and can contain information about where the error occurred.

Normal: the representation of a number in a floating-point format in which the significand has an integer bit of one (either explicit or implicit).

Normalize: to convert a denormal representation of a number to a normal representation.

NPX: Numeric Processor Extension. This is the 80387, 80287, or 8087.

Overflow: an exception condition in which the correct answer is finite, but has magnitude too great to be represented in the destination format. This kind of overflow (also called numeric overflow) is not to be confused with stack overflow.

Packed Decimal: an integer format supported by the 80387. A packed decimal number is a 10-byte quantity, with nine bytes of 18 binary coded decimal digits and one byte for the sign.

Pop: to remove from a stack the last item that was placed on the stack.

Precision: the effective number of bits in the significand of the floating-point representation of a number.

Precision Control: an option, programmed through the 80387 control word, that allows all 80387 arithmetic to be performed with reduced precision. Because no speed advantage results from this option, its only use is for strict compatibility with the Standard and with other computer systems.

Precision Exception: an 80387 exception condition that results when a calculation does not return an exact answer. This exception is usually masked and ignored; it is used only in extremely critical applications, when the user must know if the results are exact. The precision exception is called inexact in the Standard.

Pseudozero: one of a set of special values of the extended real format. The set consists of numbers with a zero significand and an exponent that is neither all zeros nor all ones. Pseudozeros are not created by the 80387 but are handled correctly when encountered as operands.

Quiet NaN: a NaN in which the most significant bit of the fractional part of the significand is one. By convention, these NaNs can undergo certain operations without causing an exception.

Real: any finite value (negative, positive, or zero) that can be represented by a (possibly infinite) decimal expansion. Reals can be represented as the points of a line marked off like a ruler. The term real can also refer to a floating-point number that represents a real value.

Short Integer: an integer format supported by the 80387 that consists of a 32-bit two's complement quantity. Short integer is not the shortest 80387 integer format; the 16-bit word integer is.

Short Real: an older term for the 80387's 32-bit single format.
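The Packed Decimal and Binary Coded Decimal entries describe a 10-byte layout that is easy to illustrate in software. The following Python is a minimal sketch and does not show how the 80387 performs the conversion. The function names are hypothetical. The glossary states only the split into nine digit bytes and one sign byte; the byte order (least significant digit pair in the first byte) and the use of the top bit of the last byte as the sign are assumptions of this sketch.

```python
def to_packed_decimal(n):
    """Encode an integer as 9 BCD bytes (18 digits) plus 1 sign byte."""
    if abs(n) >= 10**18:
        raise ValueError("packed decimal holds at most 18 decimal digits")
    digits = f"{abs(n):018d}"
    # Reading a decimal digit pair as hexadecimal yields its BCD byte ("34" -> 0x34).
    # Assumed order: least significant pair first.
    body = bytes(int(digits[i:i + 2], 16) for i in range(16, -2, -2))
    sign = 0x80 if n < 0 else 0x00  # assumed: top bit of the sign byte marks negative
    return body + bytes([sign])


def from_packed_decimal(raw):
    """Decode the 10-byte layout produced by to_packed_decimal."""
    if len(raw) != 10:
        raise ValueError("packed decimal is a 10-byte quantity")
    value = 0
    for byte in reversed(raw[:9]):       # most significant digit pair first
        high, low = byte >> 4, byte & 0x0F
        if high > 9 or low > 9:          # hexadecimal values A through F are not used
            raise ValueError("not a BCD digit")
        value = value * 100 + high * 10 + low
    return -value if raw[9] & 0x80 else value


print(to_packed_decimal(-1234).hex())  # 34120000000000000080
```

Each digit occupies four bits, so no byte of the digit field ever contains a nibble above 9; the decoder rejects such bytes instead of guessing.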
Signaling NaN: a NaN that causes an invalid-operation exception whenever it enters into a calculation or comparison, even a nonordered comparison.

Significand: the part of a floating-point number that consists of the most significant nonzero bits of the number, if the number were written out in an unlimited binary format. The significand is composed of an integer bit and a fraction. The integer bit is implicit in the single format and double format. The significand is considered to have a binary point after the integer bit; the binary point is then moved according to the value of the exponent.

Single Extended: a floating-point format, required by the Standard, that provides greater precision than single; it also provides an explicit integer bit in the significand. The 80387's extended format meets the single extended requirement as well as the double extended requirement.

Single Format: a floating-point format supported by the 80387, which consists of a sign, an 8-bit biased exponent, an implicit integer bit, and a 23-bit significand, for a total of 32 explicit bits.

Stack Fault: a special case of the invalid-operation exception which is indicated by a one in the SF bit of the status word. This condition usually results from stack underflow or overflow.

Standard: "IEEE Standard for Binary Floating-Point Arithmetic," ANSI/IEEE Std 754-1985.

Status Word: a 16-bit 80387 register that can be manually set, but which is usually controlled by side effects of 80387 instructions. It contains condition codes, the 80387 stack pointer, busy and interrupt bits, and exception flags.

Tag Word: a 16-bit 80387 register that is automatically maintained by the 80387. For each space in the 80387 stack, it tells if the space is occupied by a number; if so, it gives information about what kind of number.

Temporary Real: an older term for the 80387's 80-bit extended format.
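The fields named in the Status Word entry can be separated with a few shifts and masks. The Python below is a hedged sketch, not 80387 code: the function name `decode_status_word` is hypothetical, and the bit positions are assumptions of this sketch because the glossary does not list them (exception flags I, D, Z, O, U, P in bits 0 through 5, SF in bit 6, the error summary or "interrupt" bit in bit 7, C0 through C2 in bits 8 through 10, TOP in bits 11 through 13, C3 in bit 14, and busy in bit 15).

```python
EXCEPTION_FLAGS = ("I", "D", "Z", "O", "U", "P")  # assumed to occupy bits 0-5 in this order


def decode_status_word(sw):
    """Split a 16-bit status word value into the fields the glossary names."""
    def bit(n):
        return (sw >> n) & 1

    return {
        "exceptions": [name for n, name in enumerate(EXCEPTION_FLAGS) if bit(n)],
        "stack_fault": bool(bit(6)),                           # SF bit
        "error_summary": bool(bit(7)),                         # the "interrupt" bit
        "condition_code": (bit(14), bit(10), bit(9), bit(8)),  # (C3, C2, C1, C0)
        "top": (sw >> 11) & 0b111,                             # three-bit stack pointer
        "busy": bool(bit(15)),
    }


fields = decode_status_word(0x7841)
print(fields["exceptions"], fields["stack_fault"], fields["top"], fields["condition_code"])
```

In this example the value 0x7841 stands for an invalid-operation flag with SF set (a stack fault), TOP equal to 7, and C3 set.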
Tiny: of or pertaining to a floating-point number that is so close to zero that its exponent is smaller than the smallest exponent that can be represented in the destination format.

TOP: the three-bit field of the status word that indicates which 80387 register is the current top of stack.

Transcendental: one of a class of functions for which polynomial formulas are always approximate, never exact for more than isolated values. The 80387 supports trigonometric, exponential, and logarithmic functions; all are transcendental.

Two's Complement: a method of representing integers. If the uppermost bit is zero, the number is considered positive, with the value given by the rest of the bits. If the uppermost bit is one, the number is negative, with the value obtained by subtracting 2^(bit count) from all the given bits. For example, the 8-bit number 11111100 is -4, obtained by subtracting 2^8 from 252.

Unbiased Exponent: the true value that tells how far and in which direction to move the binary point of the significand of a floating-point number. For example, if a single-format exponent is 131, we subtract the bias 127 to obtain the unbiased exponent +4. Thus, the real number being represented is the significand with the binary point shifted 4 bits to the right.

Underflow: an exception condition in which the correct answer is nonzero, but has a magnitude too small to be represented as a normal number in the destination floating-point format. The Standard specifies that an attempt be made to represent the number as a denormal. This denormalization may result in a loss of significant bits from the significand. This kind of underflow (also called numeric underflow) is not to be confused with stack underflow.

Unmasked: a term that applies to each of the six 80387 exceptions: I, D, Z, O, U, P. An exception is unmasked if a corresponding bit in the 80387 control word is set to zero. If an exception is unmasked, the 80387 will generate an interrupt when the exception condition occurs. You can provide an interrupt routine that customizes your exception recovery.

Unnormal: an extended real representation in which the explicit integer bit of the significand is zero and the exponent is nonzero. Unnormal values are not supported by the 80387; they cause the invalid-operation exception when encountered as operands.

Unsupported Format: any number representation that is not recognized by the 80387. This includes several formats that are recognized by the 8087 and 80287; namely: pseudo-NaN, pseudoinfinity, and unnormal.

Word Integer: an integer format supported by both the 80386 and the 80387 that consists of a 16-bit two's complement quantity.

Zero Divide: an exception condition in which the inputs are finite, but the correct answer, even with an unlimited exponent, has infinite magnitude.
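The rule given in the Masked and Unmasked glossary entries (a control word bit of one masks an exception, a bit of zero unmasks it) can be sketched as follows. This Python is an illustration only. The function names are hypothetical, and two details are assumptions of the sketch and are not stated in the glossary: that the six mask bits occupy control word bits 0 through 5 in the order I, D, Z, O, U, P, and that 037FH is the control word value with all six exceptions masked.

```python
EXCEPTIONS = ("I", "D", "Z", "O", "U", "P")  # assumed mask bits 0-5 of the control word
ALL_MASKED = 0x037F                          # assumed value with every exception masked


def masked_exceptions(control_word):
    """Exceptions whose mask bit is one: the 80387 supplies its own recovery."""
    return [name for bit, name in enumerate(EXCEPTIONS) if (control_word >> bit) & 1]


def unmasked_exceptions(control_word):
    """Exceptions whose mask bit is zero: the 80387 generates an interrupt."""
    return [name for bit, name in enumerate(EXCEPTIONS) if not (control_word >> bit) & 1]


# Clear bit 2 to unmask the zero-divide exception only.
control_word = ALL_MASKED & ~(1 << 2)
print(hex(control_word), unmasked_exceptions(control_word))  # 0x37b ['Z']
```

A program that unmasks an exception this way is expected to supply the interrupt routine mentioned in the Unmasked entry.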
Palazzo E/4 20090 Assago (Milano) Tel: (02)824-4071 TlX 3412861NTMIL Intel Semiconductor GmbH Verkaufsbuero Hannover Hohenz:ollernstrasse 5 3000 Hannover 1 Tel· (511) 34-40-81 TLX 923625 INTH D NETHERLANDS Intel Semiconductor GmbH Verkaufsbuero Stuttgart Bruckstrasse 61 7012 Fellbach Tel: (711) 58-00-82 TLX· 7254826 INTS D Intel Semiconductor (Nederland) B V • Alexanderpoort BUIlding Marten Meesweg 93 3068 Rotterdam Tel· (10) 21-23-77 TLX· 22283 NORWAY Intel Corporation, S.A R.l Immeuble BBC 4 Quai des Etroits SWEDEN Intel Sweden A.S: Dalvagen 24 8-171 36 Solna Tel· (8) 734-0100 TLX: 12261 SWITZERLAND Intel Semiconductor A.G • Talackerstrasse 17 8152 Glattbrugg CH-8065 Zurich Tel: (01) 829-2977 TLX· 57989 ICH CH UNITED KINGDOM Intel Corporation (U K) Ltd • Pipers Way SWlndon, Wiltshire SNI lRJ Tel: (0793) 696-000 TLX 444447 INT SWN ~.~~ ~~~9al A/s Hvamvelen 4 N-2013. SkJetten Tel. (2) 742-420 TLX· 78018 ~:~~~)Llf2.4089 TLX.305153 "Field Application Location EUROPEAN DISTRIBUTORS/ REPRESENTATIVES AUSTRIA FRANCE (Cont'd) ITALY (Cont'd) Bacher Elektromcs Ges m.b H Rotenmuehlgasse 26 A-1120Wlen Tel: (222) 835-6460 TUC·131532 Tekelec Alrtronlc C,te des Bruyeres Aue Carle Vernel BP 2 92310 Sevres Tel: (1)45-34-75-35 TLX· 204552 Intesl Mllanollon E5 BELGIUM WEST GERMANY ~:~c~~e~:~~ ~/Guerre, 94 Bruxelles 1120 Tel: (02)216-01-60 TlX· 64475 Electromc 2000 Vertriebs AG ~h~~t~~~~~negn 1~ Tel. (OS9) 42-00-10 TLX 522561 ElEC D BENELUX Koning en Hartman Electrotechmek B V Postbus 125 2600 AC Delft Tel: (15) 609-90S TLX: 38250 ~~h~rs1r~~~H84 6277 Bad Camberg Tel (064) 34-231 TLX 415257-0JERM D FINLAND Oy Fintronic AS Melkonkatu 24A SF-0021O Helsinki 2t Tel: (0) 692-60-22 TLX: 124224 FTRON SF Metrologle GmbH Rhelnstr 94-96 6100 Darmstadt Tel: (06151) 33661 TLX: 176151820 Proeleclron Vertnebs AG Max-Planck-5trasse 1-3 6072 Orelelch Tel (06103) 3040 TLX: 417972 ITT-MultlKomponent Bahnhofstrasse 44 NORWAY Nordlsk Electronlk AjS Postboks 130 N-1364 Hvalstad Tel (2)846-210 TLX. 
77546 NENAS N ~~~~o~~:r;'~~~~ SPAtN In ~1~Jr~i~8gI1;ngel Tel (t}419-54-00 TWX· 27461 A.T.D Electronlca S A PI e.udad dp Vlena 6 28040 Madrid Tel (1) 234-4000 TWX· 42477 SWEDEN Accent Electronic Components Ltd i~~~:oHrfh~s~e~~b~~~ ~Q~ England Tel: (0462) 686666 TLX 626923 By tech Ltd Unit 2 Western Centre Western Industrial Estate Bracknell. Berkshire AG12 1RW England Tel (0344) 482211 TLK 848215 Comway Mlcrosystems Ltd. John Scott House, Market St Bracknell, Berkshire AJt2 lOP England Tel: (0344) 55333 TLX· 847201 IBA MICrocomputers Ltd Unit 2 Western Centre Western lndustnal Estate Bracknell. Berkshire RG12 lAW England Tel: (0344) 466-555 TLX· 849381 Jermyn Industnes Vestry Estate, Olford Road Sevenoaks, Kent TN14 5EU England Tel: (0732) 450144 TLX.95142 Rapid Silicon Rapid House, Denmark 5t. High Wycombe, Bucks HP11 2ER England Tel· (0494) 442266 TLX 837931 TLX 7264399 MUKO D ISRAel Eastromcs Ltd. 11 Rosanis Sireet PO. Box 39300 Tel Aviv 61392 Tel. (3) 47-51-51 TLX: 342610 DATIX IL or 33638 AONIX IL Metrologie Tour d'Asnieres 4, Avenue Laurent Cely 92606 Asnieres Tel: (1) 47-90-62-40 TLX: 611448 Lasl Elettromca S P.A Vlale Fulvlo Testl. t26 20092 Clnlsello Balsamo Tel. (02) 244-0012. 244-0212 TlX 352040 PORTUGAL FRANCE Generim Zone d·Activile de Courtaboeuf Avenue de la Baltlque 91943 Les UliS Cedex Tel: (1) 69-07-78-78 TLX.691700 TLX 311351 Dltram Avemda Marques de Tomar. 46A l1sboa P-1000 Tel. (351-1) 734-834 TWX· (0404) 14182 DENMARK ITT MultlKomponent Naverland 29 DK-2S00 Gloslrup Tel. (02) 456-66-45 TlX: 33355 InCG OK f~?~g2~82j?0 I UNITED KINGDOM ITALY Eledra Compopentl S.P A I/"- Glacol"ftO Welt, 37 20143 Milano Tel: (02) 82821 TLX: 332332 Nordisk Eleklromk AB Box 1409 5-17127 Solna Tel: (8) 734·97-70 TLX· 10547 Rapid Systems Rapid House. 
Denmark SI High Wycombe, Bucks HP11 2EA England Tel: (0494) 450244 TL.X: 837931 SWITZERLAND lndustrade AG Hertlstrasse 31 CH-8304 Wallisellen Tet: (01) 830-5040 TLX· 56788 Micro Marketing Glenageary Office Park Gtenageary, Co. Dublin Ireland Tel: (0001) 856288 TLX.31584 YUGOSLAVIA H.R. Mlcroelectromcs Corp 2005 De La Cruz Blvd., Ste. 223 Santa Clara, CA 95050 U.S.A. Tel: (408) 98S-0286 TLX: 387452 CG-3/17/87 intJ INTERNATIONAL SALES OFFICES AUSTRALIA JAPAN JAPAN (Cont'd) KOREA Intel Australia?ty Ltd_' S~trur:n Building Intel Japan K.K 5·6 Tokodal Toyosato-machi ~:~~~~g:i~~i ~~saShl-KOSU91 Bldg. Intel TechnOlogy ASia Ltd. ~ro::~~~t~~w~~~56 i!~~O~91}~~7~~;~ki-ken 300-26 i~~WO~jl-2744 TLX: 03656-160 FAX. (2) 923-2632 Intel PRC Corporation Inlel Japan K.K." Dailchl MltsUgl Bldg. 1-8889 Fuchu-cho Fuchu-shl, Tokyo 183 Tel: (04) 23-60-7871 j~~ G~~it.1e~ ~~~cS~:~t Intel Japan K.K· CHINA Beijing, PRC Tel: (1) 500-4850 TLX: 22947 INTEL eN FAX' (1) 500-2953 Flower-HIli Shln-machl Bldg 1-23-9 Shlnmachl ~:f.a(t3)a.;~622~~~YO 154 Intel Japan K.K: HONG KONG ~_69~~X~~~dg Intel Semiconductor Ltd • 1701-3 Connaugh\ Centre 1 Connaught Road Tel: (5) 844-4555 TWX: 63869 ISLHK HX FAX' (5) 294-589 Kumagaya, Saltama 360 TeJ. (04) 85-24-6871 g15-20 Shinmaruko, Nakahara-ku KBwasaki-shi, Kanagawa 211 Tel: (04) 47-33-7011 Intel Japan K.K Nlhon Seimel Bldg 1-12 Asahl-cho ~~~:U{&r)K:;_~~~~~1 ~43 Intel Japan K.K: Ryokuchl-Station Bldg 2-4-1 Terauchi Toyonaka, Osaka 560 Tel. (06) 863-1091 Intel Japan K.K Shinmaru Bldg 1-5-1 Marunouchi Chlyoda-ku. TOkyo 100 Tel' (03j201-3621 Inlel Japan K K ~11s~~~a~~~~h~~I~i:~?~~_Shl ~~~Y~~O~~~~)Po~~~~'ElUngpo-ku 580U1150 ~~.(~9~~-1~~~LKO FAX: (2) 784-8096 SINGAPORE Intel Singapore Technology, Ltd 1-1 Thomson Road #21 -06 GoldhlH Square Singapore 1130 Tel: 250-7811 Tl.X: 39921 INTEL FAX: 250-9256 TAIWAN Intel Technology (Far East) Ltd. Taiwan Branch lO/F., No. 205, Tun Hua N. RU<:I.d Taipei, R.O.C. Tet. 
(02) 716-9660 Tl.X: 13159 tNTELTWN FAX: (02) 717-2455 Shlzuaka-ken411 Tel' (05) 59~72-2141 -Field Application Location INTERNATIONAL DISTRIBUTORS/REPRESENTATIVES ARGENTINA CHINA (Cont'd) VLC S.R.L Bartalome Mitre 1711 Schmidt & Co Ltd 18/F Great Eagle Centre 23 Harbour Road Wanchal, Hong Kong Tel. 852-5-833-0222 TWX. 74766 SCHMC HX FAX 852-5-891·8754 3 Piso 1037 Buenos Aires Tel: 54-1~49-2092 TLX' 17575 EDARG-AA AUSTRALIA Total Electromcs Private Bag 250 9 Harker Street Burwood, Vlctona 3125 Tel: 61-3-288-4044 TLX: AA 31261 Total Electronics P.O. Box 139 Artamon, N.S.W. 2064 Tel: 61-02-438-1855 TLX: 26297 BRAZIL Elebra Mlcroelectronica S/A Geraldo Flauslno Gomes. 78 9 Andar 04575 - Sao Paulo - S.P Tel: 55-11-534-9600 TLX: 3911125131 ELBR SR FAX. 55-11-534-9424 JAPAN (Conl'd) Northrup Instruments & Systems Ltd. ~~g.'~(g:~~~~ :eo:~arket Auckland 1 Tel: 64-9-501-219, 501-801 TLX: 21570 THERMAL Okaya Kokl 2-4-18 Sakae INDIA ~:tO~2_'2~?2J:1sh' 460 Mlcromc DeVices Arun Complex No 65 OV.G. Road Basavanagudl FAX ~:I~~~I~~2:gg0~~;1 TLX: 0645-8332 MD BG IN 052~204-2901 Ryoyo Electro Corp Konwa Bldg 1-12-22 TsuklJi ~~I~~;_~4~~;6f,'04 FAX' 03·546-5044 Micronic Devices 403, Gagan Deep 12, RSJendra Place New Delhi 110 008 Tel' 91-58-97-71 TLX: 03163235 MOND IN Mlcronic DeVices No. 516 5th Floor Swastik Chambers KOREA J-Tek Corporation 6th Floor, Government PenSIon Bldg $~~n~~~~o~g~;~u Seoul 150 Tel: 82-2-782-8039 TLX. 25299 KODlGIT FAX. 82·2-764-8391 CHILE ~~~b~~~~8r~!1 Road Tel: 91-52-39-63 TLX: 9531 171447 MDEV IN DIN Instruments JAPAN Suecia 2323 Casilta 6055, Correa 22 Santiago Tel' 56~2-225-8139 TLX: 440422 RUDY CZ Asahi ElectrOniCS Co Ltd KMM Bldg. 2-14-1 Asano Kokurakita-ku ~~1~'69U;~~1~~~~~2 MEXICO CHINA FAX. 093-551-7861 Dicopei S A Tochtli 368 Fracc Ind San AntoniO Azcapotzalco C.P. 02760-Mexico, O.F. Tel: 52-5-561·3211 TLX: 1773790 DICOME C. Itoh Techno-Science Co., Ltd. 
C.ltoh Bld~, 2-5-1 Klta-Aoyama ~~~~~~97_4~Oo 107 FAX: 03-497-4969 NEW ZEALAND Sam sung SemIconductor & Telecommunications Co . Ltd 150. 2-KA. Tafpyung-ro. Chung.ku Seoul 100 Tel: 82-2-751-3987 TLX: 27970 KORSST FAX: 82-2-753-0967 Northrup Instruments & Systems Ltd. P.O. Box 2406 ~~~I~_t.r_~:~.~~~a TLX: NZ3380 FAX' 64-4-857276 SINGAPORE Francotone Electronics Pte Ltd 1? Harvey Road #04-01 Smgapore 1336 Tel: 283-0888, 289-1618 TWX' 56541 FRELS FAX' 2895327 SOUTH AFRICA Electronic Building Elements, Pty. !...td P.O. Box 4609 Pine Square. 18th Street Hazelwood, Pretoria 0001 Tel: 27-12-469921 Tl.X: 3-227786 SA TAIWAN Mitac Corporation No: 585, Ming Shen East Rd TaIpei, R.O.C Te(' 886-2-501-8231 FAX. 886-2-501-4265 VENEZUELA P. Benavides SIA Avilanes a Rio Resldencia.s Kamarata locales 4 A 17 La Candelaria. Caracas Tel: 58-2-571-0396 TLX: 28450 PBVEN VC FAX: 58-2-572-3321 "Field Application Location CG-3/17/87 inter DOMESTIC SERVICE OFFICES ALABAMA CONNECTICUT MICHIGAN PENNSYLVANIA Intel Corp 5015 Bradford Drive, #2 Huntsville 35805 Intel Corp 26 Mill Plain Road Intel Corp. 7071 Orchard Lake Road Suitfi 100 West Bloomfield 48033 Intel Corp. 201 Penn Center Boulevard Suite 301 W Tel: (205) 830-4010 ~:1~~~~)~:~-1130 ARIZONA FLORIDA Intel Corp. 11225 N. 2Bth Dr #D214 Phoenix 85029 Tel: (602) 869-4980 Intel Corp 1500 N.w. 62, SUIte 104 Ft. Lauderdale 33309 Tel; (305) 771-0600 TWX: 510-956-9407 Intel Corp. 500 E. Fry Blvd., Suite M-15 SIerra Vista 85635 TAl: (602) 459-501 0 Intel Corp. 242 N. Westmante Drive Suite 105 ARKANSAS ~~~71b5\e8~~~~~:832714 Intel Corp P.O. Box 206 Ulm 72170 Tel. (501)241-3264 CALIFORNIA Intel Corp 21515 Vanowen Suite 116 ~:,~(~f8r~~:~~g~ Intel Corp. 2250 E. Imperial Highway SUite 218 [I Segundo 90245 Tel: 1-800-468-3548 GEORGIA Intel Corp. 3280 POinte Parkway Suite 200 Norcross 30092 Tel: (404)441-1171 Tel: (313) 851-8905 ~~.s~rjlr3~~~~O MISSOURI TEXAS Intel Corp 4203 Earth City Expressway Intel Corp. 313 E. 
Anderson Lane Suite 314 Austin 78752 Suite 143 Earth City 63045 Tel: (314) 291-2015 Tel: (512) 454..J628 TWX: 910-674-1347 NEW JERSEY Intel Corp. 385 Sylvan Avenue Englewood Cliffs 07632 Tel: (201) 567-0821 TWX: 710-991-8593 Intel Corp. Raritan Plaza III Raritan Center Edison 08817 Tel: (201) 225-3000 Intel Corp. 12300 Ford Road Suite 380 Dallas 75234 Tel: (214) 241-2820 TWX: 910-860-5617 Intel Corp. 8815 Dyer St., Suite 225 El Paso 79904 Tel: (915) 751-0186 VIRGINIA ILLINOIS NORTH CAROLINA Intel Corp. 300 N. Martingale Rd. Suite 300 Schaumburg 60194 Tel: (312) 310-5733 Intel Corp. 2306 W. Meadowview Road Suite 206 Greensboro 27407 Tel: (919) 294-1541 Intel Corp. 1603 Santa Rosa Rd. #109 Richmond 23288 Tel: (804) 282-5668 WASHINGTON INDIANA Intel Corp. 8777 Purdue Rd., #125 Indianapolis 46268 Tel: (317) 875-0623 Intel Corp. 2700 Tryc1iff Rd., Suite 102 ~:II.ei§~ 9F7~~~8022 OHIO Intel Corp. 110 110th Avenue N.E. Suite 510 Bellevue 98004 Tel: 1-800-468-3548 TWX: 910-443-3002 KANSAS Intel Corp. 2000 E. 4th Street Suite 110 Santa Ana 92705 Tel: (714) 835-5789 TWX: 910-595-2475 Intel Corp. 2700 San Tomas Expressway Santa Clara 95051 Tel: (408) 970-1740 Intel Corp. 8400 W. 110th Street Suite 170 Overland Park 66210 Tel: (913) 345-2727 Intel Corp. Chagrin-Brainard Bldg. Suite 305 ~?~eia~~a2~f2~oulevard Tel: (216) 464-6915 TWX: 810-427-9298 Intel Corp. 6500 Poe Dayton 45414 Tel: (513) 890-5350 Intel Corp. 4350 Executive Drive Suite 150 i:~I{~ci6) 2~~:~~ 45 OREGON ~:~ (~I~)04~~~;~80 MARYLAND Intel Corp. 650 South Cherry Suite 915 Denver 80222 Tel: (303) 321-8086 TWX: 910-931-2289 Intel Corp. 450 N. Sunnyslope Road Suite 130 Brookfield 53005 Tel: (414) 784-8087 KENTUCKY Intel Corp. 3525 Tatescreek Road, #51 COLORADO WISCONSIN Intel Corp. 5th Floor 7833 Walker Drive Greenbelt 20770 Tel: (301) 441-1020 MASSACHUSETTS Intel Corp. 15254 N.W. Greenbrier Beaverton 01886 Tel: (503) 645-8051 TWX: 910-467-8741 Intel Corp. 5200 N.E. Elam Young Parkway Hillsboro 97123 Tel: (503) 681-8080 Intel Corp.
3 Carlisle Road, Westford 01886. Tel: (617) 692-1060

CANADA
Intel Corp., 190 Attwell Drive, Suite 103, Rexdale, Ontario, Canada K2H 8A2. Tel: (416) 675-2105
Intel Corp., 620 St. Jean Blvd., Pointe Claire, Quebec, Canada H9R 3K2. Tel: (514) 694-9130
Intel Corp., 2650 Queensview Drive, #250, Ottawa, Ontario, Canada K2B SH6. Tel: (613) 829-9714

CUSTOMER TRAINING CENTERS

CALIFORNIA
2700 San Tomas Expressway, Santa Clara 95051. Tel: (408) 970-1700
ILLINOIS
[address illegible in source] #300. Tel: (312) 310-5700
MASSACHUSETTS
3 Carlisle Road, Westford 01886. Tel: (617) 692-1000
MARYLAND
7833 Walker Dr., 4th Floor, Greenbelt 20770. Tel: (301) 220-3380

SYSTEMS ENGINEERING OFFICES

CALIFORNIA
2700 San Tomas Expressway, Santa Clara 95051. Tel: (408) 986-8086
ILLINOIS
[address illegible in source] #300. Tel: (312) 310-8031
MASSACHUSETTS
3 Carlisle Road, Westford 01886. Tel: (617) 692-3222
NEW YORK
300 Motor Parkway, Hauppauge 11788. Tel: (516) 231-3300

CG-3/17/87
Source Exif Data:
File Type : PDF
File Type Extension : pdf
MIME Type : application/pdf
PDF Version : 1.3
Linearized : No
XMP Toolkit : Adobe XMP Core 4.2.1-c043 52.372728, 2009/01/18-15:56:37
Create Date : 2012:08:12 08:17:51-08:00
Modify Date : 2012:08:12 10:02:29-07:00
Metadata Date : 2012:08:12 10:02:29-07:00
Producer : Adobe Acrobat 9.51 Paper Capture Plug-in
Format : application/pdf
Document ID : uuid:6db89b5d-2457-4098-b28b-674bdb75b039
Instance ID : uuid:cc4c496c-7238-4fc2-b6c0-7d58112f1a90
Page Layout : SinglePage
Page Mode : UseNone
Page Count : 258