241731 002_Intel486_Microprocessors_and_Related_Products_Jan95 002 Intel486 Microprocessors And Related Products Jan95
User Manual: 241731-002_Intel486_Microprocessors_and_Related_Products_Jan95
Page Count: 978
LITERATURE

For additional information on Intel products in the U.S. or Canada, call Intel's Literature Center at (800) 548-4725 or write to:

Intel Literature
P.O. Box 7641
Mt. Prospect, IL 60056-7641

To order literature outside of the U.S. and Canada contact your local international sales office.

CURRENT DATABOOKS

Product line databooks contain datasheets, application notes, article reprints, and other design information. Databooks can be ordered in the U.S. and Canada by calling TAB/McGraw-Hill at 1-800-822-8158; outside of the U.S. and Canada contact your local international sales office.

Title                                             Intel Order Number   ISBN
Automotive Products                               231792               N/A
Embedded Applications (2 vol. set)                270648               1-55512-242-6
Embedded Microcontrollers                         270646               1-55512-230-2
Embedded Microprocessors                          272396               1-55512-231-0
Flash Memory (2 vol. set)                         210830               1-55512-232-9
Intel486™ Microprocessors and Related Products    241731               1-55512-235-3
i960® Processors and Related Products             272084               1-55512-234-5
Military and Special Products                     210461               N/A
Networking                                        297360               1-55512-236-1
OEM Boards, Systems and Software                  280407               1-55512-237-X
Packaging                                         240800               1-55512-238-8
Pentium™ Processors and Related Products          241732               1-55512-239-6
Peripheral Components                             296467               1-55512-240-X

A complete set of this information is available on CD-ROM through Intel's Data on Demand program, order number 240897. For information about Intel's Data on Demand ask for item number 240952.

intel®

24-HOUR AUTOMATED TECHNICAL SUPPORT*

Intel's Application Bulletin Board System (BBS) and FaxBack System are at your service, 24-hours a day, at no charge, and the information is updated frequently.

FaxBack SYSTEM

Technical and product information are available 24-hours a day!
Order documents containing:
• Product Announcements
• Product Literature
• Intel Device Characteristics
• Design/Application Recommendations
• Stepping/Change Notifications
• Quality and Reliability Information

Information on the following subjects is also available:
• Microcontroller and Flash
• OEM Branded Systems
• Multibus/BBS Listing
• Multimedia
• Development Tools
• Quality and Reliability/Change Notification
• Microprocessor/PCI/Peripheral
• Intel Architecture Lab

To use FaxBack for Intel components and systems, dial (800) 628-2283 or (916) 356-3105 (U.S. and Canada) or +44(0) 1793-496646 (Europe) and follow the automated voice-prompt menu. Document orders will be faxed to the fax number you specify. Catalogs are updated twice a month, so call for the latest information!

BULLETIN BOARD SYSTEM

Intel's Application Bulletin Board System (BBS) enables file retrieval 24-hours a day. The following can be located on the BBS:
• Product/Technical Documentation
• Firmware Upgrades
• Quality and Reliability Data
• Software Drivers
• Tool Information
• Software/Application Utilities

To use the Intel Application BBS (components and systems), dial (916) 356-3600 for download access (U.S. and Canada) or +44(0) 1793-496340 (Europe). The BBS will support 1200-19200 baud rate modems. Typical modem configuration: 9600 baud rate, No Parity, 8 Data Bits, 1 Stop Bit.
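The typical modem configuration quoted above (9600 baud, no parity, 8 data bits, 1 stop bit) determines how many bits travel on the line for each character of a downloaded file. As a rough illustration only, the sketch below estimates the best-case character rate and download time under that framing. It assumes standard asynchronous framing with one start bit per character and treats the quoted rate as bits per second; the function names and the 64-Kbyte file size are hypothetical, and real transfers will be slower because of file-transfer protocol overhead.

```python
def bits_per_frame(data_bits: int = 8, parity: bool = False, stop_bits: int = 1) -> int:
    """Bits on the line per character: 1 start bit + data + optional parity + stop."""
    return 1 + data_bits + (1 if parity else 0) + stop_bits


def characters_per_second(line_rate_bps: int, **framing) -> float:
    """Best-case character rate for a given line rate and framing (assumed model)."""
    return line_rate_bps / bits_per_frame(**framing)


def transfer_seconds(file_bytes: int, line_rate_bps: int, **framing) -> float:
    """Best-case time to move file_bytes 8-bit characters, ignoring protocol overhead."""
    return file_bytes / characters_per_second(line_rate_bps, **framing)


# N-8-1 at 9600 bits per second: 10 bits per character.
print(bits_per_frame())                      # 10
print(characters_per_second(9600))           # 960.0
# Hypothetical 64-Kbyte file (65536 bytes).
print(round(transfer_seconds(65536, 9600), 1))  # 68.3
```

Under these assumptions each byte costs ten bit times, so the usable rate is one tenth of the line rate.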
A directory listing of BBS files is also available through FaxBack or our 800 BBS (800-897-2536).

Retail Products

Information on Intel's retail products (Coprocessors and wireless, video, personal conferencing and network products) is available through the following services:

Internet:      ftp.intel.com (143.185.65.2)
CompuServe:    GO INTELFORUM (modem settings: E-7-1, up to 14.4 Kbps)

Country          BBS (N-8-1, up to 14.4 Kbps)    FaxBack
North America    (503) 264-7999
Europe           +44 1 793-432955                +44 1 793-432509
Australia        +61 2 975-3066                  +61 2 975-3922
Taiwan           +886 2 718-6422                 +886 2 514-0815
Singapore        +65 256-4776                    +65 256-5350
Hong Kong        +852 530-4116                   +852 844-4448
Korea            +822 784-3430                   +822 767-2594

*Support services provided courtesy of Intel Application Support (800) 525-3019 or (503) 264-6835

Intel486™ Microprocessor and Related Products

Microprocessors, PCIsets, Peripheral Components

1995

Information in this document is provided solely to enable use of Intel products. Intel assumes no liability whatsoever, including infringement of any patent or copyright, for sale and use of Intel products except as provided in Intel's Terms and Conditions of Sale for such products.

Intel Corporation makes no warranty for the use of its products and assumes no responsibility for any errors which may appear in this document nor does it make a commitment to update the information contained herein.

Intel retains the right to make changes to these specifications at any time, without notice.

Contact your local Intel sales office or your distributor to obtain the latest specifications before placing your product order.

MDS is an ordering code only and is not used as a product name or trademark of Intel Corporation.

Intel Corporation and Intel's FASTPATH are not affiliated with Kinetics, a division of Excelan, Inc. or its FASTPATH trademark or products.

*Other brands and names are the property of their respective owners.
Additional copies of this document or other Intel literature may be obtained from:

Intel Corporation
Literature Sales
P.O. Box 7641
Mt. Prospect, IL 60056-7641

or call 1-800-879-4683

©INTEL CORPORATION, 1995

DATASHEET DESIGNATIONS

Intel uses various datasheet markings to designate each phase of the document as it relates to the product. The markings appear in the lower inside corner of each datasheet page. Following are the definitions of each marking:

Datasheet Marking       Description
Product Preview         Contains information on products in the design phase of development. Do not finalize a design with this information. Revised information will be published when the product becomes available.
Advanced Information    Contains information on products being sampled or in the initial production phase of development.*
Preliminary             Contains preliminary information on new products in production.*
No Marking              Contains information on products in full production.*

* Specifications within these datasheets are subject to change without notice. Verify with your local Intel sales office that you have the latest datasheet before finalizing a design.

intel®

Overview
Intel486™ Microprocessor
Intel OverDrive™ Processors
Peripheral Components
Flash Memory Components
Intel486™ Microprocessor SmartDie™ Products

CONTENTS

Table of Contents

Alphanumeric Index

CHAPTER 1
Overview
  Intel486™ Microprocessor

CHAPTER 2
Intel486™ Microprocessor
  DATA SHEETS
    Intel486 Processor Family
    82420 PCIset
    82420EX PCIset
  APPLICATION NOTES
    AP-469 Cache and Memory Design Considerations for the IntelDX2™ Microprocessor
    AP-485 Intel Processor Identification with the CPUID Instruction
    AP-496 Migrating from the Intel486 SL Microprocessor to the SL Enhanced Intel486 Microprocessor
    AP-497 Managing Power with the SL Enhanced Intel486 Microprocessor
    AP-498 Thermal Design for High Performance Notebooks
    AP-504 Clock Throttling the SL Enhanced Intel486 Processor in a Networked Environment
    AP-505 Picking Up the Pace: Designing the IntelDX4™ Processor into Intel486 Processor-Based Desktop Systems

CHAPTER 3
Intel OverDrive™ Processors
  Intel OverDrive™ Processors

CHAPTER 4
Peripheral Components
  DESKTOP AND MOBILE PERIPHERAL DATA SHEETS
    82091AA Advanced Integrated Peripheral (AIP)
    82078 CHMOS Single-Chip Floppy Disk Controller
    82078 44 Pin CHMOS Single-Chip Floppy Disk Controller
    82078 64 Pin CHMOS Single-Chip Floppy Disk Controller
    82077SL CHMOS Single-Chip Floppy Disk Controller
    82595 ISA/PCMCIA High Integration Ethernet Controller
    82593 CSMA/CD Core LAN Controller
    82503 Dual Serial Transceiver (DST)
    Ethernet LAN Card Product Brief
    DataFax 14.4 Card Product Brief
    FaxModem 24/96 Card Product Brief
  DESKTOP PERIPHERAL DATA SHEETS
    82489DX Advanced Programmable Interrupt Controller
    UPI-41AH/42AH Universal Peripheral Interface 8-Bit Slave Microcontroller
    UPI-C42/UPI-L42 Universal Peripheral Interface CHMOS 8-Bit Slave Microcontroller
  MOBILE PERIPHERAL DATA SHEET
    8XC51SL/Low Voltage 8XC51SL Keyboard Controller
  APPLICATION NOTES
    AP-366 89C124FX Data/Fax Modem Chip Set - Reduction of Power Consumption
    AP-358 Intel 82077SL for Super Dense Floppies

CHAPTER 5
Flash Memory Components
    28F001BX-T/28F001BX-B 1M (128K x 8) CMOS Flash Memory
    28F200BX-T/B, 28F002BX-T/B 2 Mbit (128K x 16, 256K x 8) Boot Block Flash Memory Family
    28F200BL-T/B, 28F002BL-T/B 2-Mbit (128K x 16, 256K x 8) Low Power Boot Block Flash Memory Family
    28F400BX-T/B, 28F004BX-T/B 4 Mbit (256K x 16, 512K x 8) Boot Block Flash Memory Family
    28F008SA 8-Mbit (1-Mbit x 8) FlashFile Memory Extended Temperature Specifications Included
    28F016SA 16-Mbit (1-Mbit x 16, 2-Mbit x 8) FlashFile Memory

CHAPTER 6
Intel486™ Microprocessor SmartDie™ Products
    SL Enhanced Intel486 DX2 Microprocessor SmartDie™ Product Specification
    SL Enhanced Intel486 SX Microprocessor SmartDie Product Specification
ALPHANUMERIC INDEX

28F001BX-T/28F001BX-B 1M (128K x 8) CMOS Flash Memory
28F008SA 8-Mbit (1-Mbit x 8) FlashFile Memory Extended Temperature Specifications Included
28F016SA 16-Mbit (1-Mbit x 16, 2-Mbit x 8) FlashFile Memory
28F200BL-T/B, 28F002BL-T/B 2-Mbit (128K x 16, 256K x 8) Low Power Boot Block Flash Memory Family
28F200BX-T/B, 28F002BX-T/B 2 Mbit (128K x 16, 256K x 8) Boot Block Flash Memory Family
28F400BX-T/B, 28F004BX-T/B 4 Mbit (256K x 16, 512K x 8) Boot Block Flash Memory Family
82077SL CHMOS Single-Chip Floppy Disk Controller
82078 44 Pin CHMOS Single-Chip Floppy Disk Controller
82078 64 Pin CHMOS Single-Chip Floppy Disk Controller
82078 CHMOS Single-Chip Floppy Disk Controller
82091AA Advanced Integrated Peripheral (AIP)
82420 PCIset
82420EX PCIset
82489DX Advanced Programmable Interrupt Controller
82503 Dual Serial Transceiver (DST)
82593 CSMA/CD Core LAN Controller
82595 ISA/PCMCIA High Integration Ethernet Controller
8XC51SL/Low Voltage 8XC51SL Keyboard Controller
AP-358 Intel 82077SL for Super Dense Floppies
AP-366 89C124FX Data/Fax Modem Chip Set - Reduction of Power Consumption
AP-469 Cache and Memory Design Considerations for the IntelDX2™ Microprocessor
AP-485 Intel Processor Identification with the CPUID Instruction
AP-496 Migrating from the Intel486™ SL Microprocessor to the SL Enhanced Intel486 Microprocessor
AP-497 Managing Power with the SL Enhanced Intel486 Microprocessor
AP-498 Thermal Design for High Performance Notebooks
AP-504 Clock Throttling the SL Enhanced Intel486 Processor in a Networked Environment
AP-505 Picking Up the Pace: Designing the IntelDX4™ Processor into Intel486 Processor-Based Desktop Systems
DataFax 14.4 Card Product Brief
Ethernet LAN Card Product Brief
FaxModem 24/96 Card Product Brief
Intel OverDrive™ Processors
Intel486 Microprocessor
Intel486 Processor Family
SL Enhanced Intel486 DX2 Microprocessor SmartDie™ Product Specification
SL Enhanced Intel486 SX Microprocessor SmartDie Product Specification
UPI-41AH/42AH Universal Peripheral Interface 8-Bit Slave Microcontroller
UPI-C42/UPI-L42 Universal Peripheral Interface CHMOS 8-Bit Slave Microcontroller

1
Overview

Intel486™ MICROPROCESSOR

INTRODUCTION

Intel microprocessors and peripherals provide a broad range of time-saving, energy-efficient, high-performance solutions to designers of both mobile and desktop microprocessor-based systems.

Intel's microprocessor/peripheral interface delivers time and performance advantages to the designers of microprocessor-based systems, meeting their demand for greater performance, lower power consumption and a wider variety of built-in features for their customers.

HIGH-PERFORMANCE ENTRY-LEVEL SYSTEMS

Intel offers an entire product line of entry-level microprocessors for desktop and mobile systems, ranging from the 25 MHz version of the Intel486™ SX processor to the high performance 100 MHz IntelDX4 processor. The IntelDX4 processor is the world's fastest 486, providing a new level of affordable computing power to the business desktop and unparalleled performance to mobile computers. Intel couples superior performance with sophisticated energy-efficient SL technology to meet the requirements of the EPA's Energy Star guidelines.

Intel offers a wide variety of off-the-shelf components to fulfill the requirements of system designers while simplifying the implementation of their designs. Off-the-shelf system solutions greatly decrease the potential risk of costly project delays due to component incompatibility. These system solutions greatly reduce the amount of time required to design, debug, manufacture and test microprocessor-based systems.

INCREASED RELIABILITY

High reliability is a tangible goal that translates to higher reliability for your product, reduced downtime, and reduced repair costs.
As more and more functions are integrated into fewer components, the resulting system requires less power, produces less heat, and requires fewer mechanical connections, again resulting in greater system reliability.

LOWER COSTS

Using proven, reliable off-the-shelf components will reduce design costs, manufacturing costs, and time to market while increasing project viability and product reliability.

REDUCED TIME TO MARKET

Intel's universal motherboard for Intel486 microprocessor-based desktop systems and scalable architecture for mobile systems reduce the development effort required to produce an entire product line of high-performance entry-level systems, thereby reducing time to market.

December 1994
Order Number: 241818-002

intel®

2
Intel486™ Microprocessor

INTEL486™ PROCESSOR FAMILY

• IntelDX4™ Processor
  - Up to 100-MHz Operation
  - Speed-Multiplying Technology
  - 32-Bit Architecture
  - 16K-Byte On-Chip Cache
  - Integrated Floating-Point Unit
  - 3.3V Core Operation with 5V Tolerant I/O Buffers
  - SL Technology
  - Static Design
  - IEEE 1149.1 Boundary Scan Compatibility
  - Binary Compatible with Large Software Base

• Write-Back Enhanced IntelDX2™ Processor
  - Speed-Multiplying Technology
  - 32-Bit Architecture
  - 8K-Byte On-Chip Write-Back Cache
  - Integrated Floating-Point Unit
  - SL Technology
  - Static Design
  - IEEE 1149.1 Boundary Scan Compatibility
  - Binary Compatible with Large Software Base

• IntelDX2™ Processor
  - Speed-Multiplying Technology
  - 32-Bit Architecture
  - 8K-Byte On-Chip Cache
  - Integrated Floating-Point Unit
  - SL Technology
  - Static Design
  - IEEE 1149.1 Boundary Scan Compatibility
  - Binary Compatible with Large Software Base

• IntelSX2™ Processor
  - Speed-Multiplying Technology
  - 32-Bit Architecture
  - 8K-Byte On-Chip Cache
  - SL Technology
  - Static Design
  - IEEE 1149.1 Boundary Scan Compatibility
  - Binary Compatible with Large Software Base

• Intel486™ DX Processor
  - 32-Bit Architecture
  - 8K-Byte On-Chip Cache
  - Integrated Floating-Point Unit
  - SL Technology
  - Static Design
  - IEEE 1149.1 Boundary Scan Compatibility
  - Binary Compatible with Large Software Base

• Intel486 SX Processor
  - 32-Bit Architecture
  - 8K-Byte On-Chip Cache
  - SL Technology
  - Static Design
  - IEEE 1149.1 Boundary Scan Compatibility
  - Binary Compatible with Large Software Base

*Other brands and names are the property of their respective owners.

December 1994
Order Number: 242202-001

DATA SHEET DESIGNATIONS

Intel uses various data sheet markings to designate each phase of the document as it relates to the product. The marking appears in the lower, inside corner of the data sheet. The following is the definition of these markings:

Data Sheet Marking      Description
Product Preview         Contains information on products in the design phase of development. Do not finalize a design with this information. Revised information will be published when the product becomes available.
Advance Information     Contains information on products being sampled or in the initial production phase of development.†
Preliminary             Contains preliminary information on new products in production.†
No Marking              Contains information on products in full production.†
t Speclflcallons within these data sheets are subject to change without notice. Verify with your local Intel sales office that you have the latest datasheet before finalizing a design. 2-2 I INTEL486™ PROCESSOR FAMILY CONTENTS PAGE CONTENTS PAGE 1.0 INTRODUCTION ..................... 2-9 4.4.2 SEGMENT REGISTER USAGE ......................... 2-69 1.1 Processor Features ................ 2-9 4.51/0 Space ........................ 2-69 1.2 Intel486™ Processor Product Family ............................. 2-11 4.6 Addressing Modes ................ 2-70 2.0 HOW TO USE THIS DOCUMENT ... 2-12 4.6.1 ADDRESSING MODES OVERVIEW ..................... 2-70 2.1 Introduction ...................... 2-12 2.2 Section Contents and Processor Specific Information ................ 2-12 2.3 Documents Replaced by This Data Sheet .............................. 2-15 4.6.2 REGISTER AND IMMEDIATE MODES ......................... 2-70 4.6.3 32-BIT MEMORY ADDRESSING MODES ......... 2-70 3.0 PIN DESCRIPTION ................. 2-15 4.6.4 DIFFERENCES BETWEEN 16- AND 32-BIT ADDRESSES ... 2-71 3.1 Pin Assignments .................. 2-15 4.7 Data Formats ..................... 2-72 3.2 Quick Pin Reference .............. 2-33 4.7.1 DATA TyPES ................ 2-72 4.0 ARCHITECTURAL OVERVIEW ..... 2-43 4.7.2 LITTLE ENDIAN VS. BIG ENDIAN DATA FORMATS ...... 2-76 4.1 Introduction ...................... 2-43 4.1.1 INTEL486 DX,INTELDX2™, AND INTELDX4TM PROCESSOR ON-CHIP FLOATING POINT UNIT ............................ 2-44 4.1.2 UPGRADE POWER DOWN MODE .......................... 2-44 4.2 Register Set ...................... 2-44 4.2.1 FLOATING POINT REGISTERS .................... 2-45 4.8.1 INTERRUPTS AND EXCEPTIONS ................... 2-76 4.8.2 INTERRUPT PROCESSING .................. 2-77 4.8.3 MASKABLE INTERRUPT .... 2-77 4.8.4 NON-MASKABLE INTERRUPT ...... ~ ............. 2-79 4.8.5 SOFTWARE INTERRUPTS .. 2-79 4.2.2 BASE ARCHITECTURE REGISTERS .................... 
2-45 4.8.6 INTERRUPT AND EXCEPTION PRIORITIES ....... 2-79 4.2.3 SYSTEM LEVEL REGISTERS .................... 2-50 4.8.7 INSTRUCTION RESTART ... 2-81 4.2.4 FLOATING POINT REGISTERS .................... 2-56 4.8.8 DOUBLE FAULT ............. 2-81 4.8.9 FLOATING POINT INTERRUPT VECTORS ......... 2-81 4.2.5 DEBUG AND TEST REGISTERS .................... 2-65 5.0 REAL MODE ARCHITECTURE ..... 2-82 4.2.6 REGISTER ACCESSIBILITY ................ 2-65 5.2 Memory Addressing .............. 2-82 4.2.7 COMPATIBILITy ............. 2-66 5.3 Reserved Locations .............. 2-83 4.3 Instruction Set .................... 2-67 5.4 Interrupts ......................... 2-83 4.3.1 FLOATING POINT INSTRUCTIONS ................ 2-67 5.5 Shutdown and Halt ............... 2-83 4.4 Memory Organization ............. 2-67 4.4.1 ADDRESS SPACES ......... 2-68 I 4.8 Interrupts ......................... 2-76 5.1 Introduction ...................... 2-82 6.0 PROTECTED MODE ARCHITECTURE ..................... 2-84 6.1 Addressing Mechanism ........... 2-84 6.2 Segmentation .................... 2-85 2-3 CONTENTS PAGE PAGE 7.1.1 WRITE-BACK ENHANCED INTELDX2 PROCESSOR CACHE ........................ 2-115 6.2.3 DESCRIPTOR TABLES ...... 2-86 7.1.2 INTELDX4 PROCESSOR CACHE ........................ 2-115 7.2 Cache Control ................... 2-115 6.2.4 DESCRIPTORS .............. 2-87 6.3 Protection ........................ 2-95 6.3.1 PROTECTION CONCEPTS .. 2-95 6.3.2 RULES OF PRIVILEGE ...... 2-96 6.3.3 PRIVILEGE LEVELS ......... 2-96 6.3.4 PRIVILEGE LEVEL TRANSFERS ................... 2-99 6.3.5 CALL GATES ............... 2-100 6.3.6 TASK SWiTCHING ......... 2-100 6.3.7 INITIALIZATION AND TRANSITION TO PROTECTED MODE ..................... : ... 2-102 6.4 Paging .......................... 2-102 6.4.1 PAGING CONCEPTS ....... 2-102 6.4.2 PAGING ORGANIZATION .. 2-103 6.4.3 PAGE LEVEL PROTECTION (R/W, U/S BITS) .............. 2-105 6.4.4 PAGE CACHEABILITY (PWT AND PCD BITS) ............... 
2-106 6.4.5 TRANSLATION LOOKASIDE BUFFER ....................... 2-106 6.4.6 PAGING OPERATION ...... 2-107 6.4.7 OPERATING SYSTEM RESPONSIBILITIES ........... 2-107 6.5 Virtual 8086 Environment ........ 2-108 6.5.1 EXECUTING 8086 PROGRAMS ................... 2-108 6.5.2 VIRTUAL 8086 MODE ADDRESSING MECHANISM ... 2-108 6.5.3 PAGING IN VIRTUAL MODE ......................... 2-108 6.5.4 PROTECTION AND 110 PERMISSION BITMAP ......... 2-109 6.5.5INTERRUPTHANDLING ... 2-110 6.5.6 ENTERING AND LEAVING VIRTUAL 8086 MODE ......... 2-111 7.0 ON-CHIP CACHE .................. 2-114 7.1 Cache Organization ............. 2-114 2-4 CONTENTS 6.2.1 SEGMENTATION INTRODUCTION ................ 2-85 6.2.2 TERMINOLOGY ............. 2-85 7.2.1 WRITE-BACK ENHANCED INTELDX2 PROCESSOR CACHE CONTROL AND OPERATING MODES ....................... 2-117 7.3 Cache Line Fills ................. 2-117 7.4 Cache Line Invalidations ......... 2-118 . 7.4.1 WRITE-BACK ENHANCED INTELDX2 PROCESSOR SNOOP CYCLES AND WRITEBACK MODE INVALIDATION .. 2-118 7.5 Cache Repla,cement ............. 2-118. 7.6 Page Cacheability ............... 2-119 7.6.1 WRITE-BACK ENHANCED INTELDX2 PROCESSOR PAGE CACHEABILITY ................ 2-120 7.7 Cache Flushing .................. 2-120 7.7.1 WRITE-BACK ENHANCED INTELDX2 PROCESSOR CACHE FLUSHING .................... 2-121 7.8 Write-Back Enhanced IntelDX2 Processor Write-Back Cache Architecture ...................... 2-121 7.8.1 WRITE-BACK CACHE COHERENCY PROTOCOL ..... 2-122 7.8.2 DETECTING ON-CHIP WRITE-BACK CACHE OF THE WRITE-BACK ENHANCED INTELDX2 PROCESSOR ...... 2-124 8.0 SYSTEM MANAGEMENT MODE (SMM) ARCHITECTURES ........... 2-125 8.1 SMM Overview .................. 2-125 8.2 Terminology ..................... 2-125 8.3 System Management Interrupt Processing ................ ; ....... 2-125 8.3.1 SYSTEM MANAGEMENT INTERRUPT (SMI#) ........... 2-126 8.3.2 SMI # ACTIVE (SMIACT#) .................... 2-127 8.3.3 SMRAM .............. '...... 2-129 8.3.4 EXIT FROM SMM .......... 
2-131 8.4 System Management Mode Programming Model .............. 2-131 I CONTENTS PAGE 8.4.1 ENTERING SYSTEM MANAGEMENT MODE ........ 2-131 8.4.2 PROCESSOR ENVIRONMENT ............... 2-132 8.4.3 EXECUTING SYSTEM MANAGEMENT MODE HANDLER ..................... 2-133 8.5 SMM Features .................. 2-134 8.5.1 SMM REVISION IDENTIFIER ................... 2-134 PAGE 9.2.4 DATA LINES (D31-DO) ..... 2-146 9.2.5 PARITY .................... 2-146 9.2.6 BUS CYCLE DEFINITION ... 2-146 9.2.7 BUS CONTROL ............ 2-147 9.2.8 BURST CONTROL ......... 2-148 9.2.9 INTERRUPT SIGNALS ..... 2-148 9.2.10 BUS ARBITRATION SIGNALS ...................... 2-150 9.2.11 CACHE INVALIDATION ... 2-151 8.5.2 AUTO HALT RESTART ..... 2-134 9.2.12 CACHE CONTROL ........ 2-151 8.5.3 1/0 INSTRUCTION RESTART ..................... 2-135 9.2.13 PAGE CACHEABILITY (PWT, PC D) .................... 2-152 8.5.4 SMM BASE RELOCATION .. 2-135 9.2.14 UPGRADE PRESENT (UP#) ......................... 2-152 8.6 SMM System Design Considerations .................... 2-136 8.6.1 SMRAM INTERFACE ....... 2-136 8.6.2 CACHE FLUSHES .......... 2-137 8.6.3 A20M# PIN AND 5MBASE RELOCATION ................. 2-140 8.6.4 PROCESSOR RESET DURING SMM ................. 2-141 8.6.5 SMM AND SECOND LEVEL WRITE BUFFERS .............. 2-141 8.6.6 NESTED SMI#s AND 1/0 RESTART ..................... 2-134 8.7 SMM Software Considerations ... 2-141 8.7.1 SMM CODE CONSIDERATIONS ............ 2-141 8.7.2 EXCEPTION HANDLING ... 2-142 8.7.3 HALT DURING SMM ....... 2-142 8.7.4 RELOCATING SMRAM TO AN ADDRESS ABOVE ONE MEGABYTE ................... 2-142 9.0 HARDWARE INTERFACE ......... 9.1 Introduction ..................... 9.2 Signal Descriptions .............. 9.2.1 CLOCK (CLK) .............. 9.2.2 INTELDX4 PROCESSOR CLOCK MULTIPLIER SELECTABLE INPUT (CLKMUL) ..................... 9.2.3 ADDRESS BUS (A31-A2, BEO#-BE3#) ................. 
I CONTENTS 2-143 2·143 2-143 2-143 2-143 9.2.15 NUMERIC ERROR REPORTING (FERR #, IGNNE#) ...................... 2-152 9.2.16 BUS SIZE CONTROL (BS16#, BS8#) ............... 2-153 9.2.17 ADDRESS BIT 20 MASK (A20M#) ...................... 2-153 9.2.18 WRITE-BACK ENHANCED INTELDX2 PROCESSOR SIGNALS AND OTHER ENHANCED BUS FEATURES .. 2-153 9.2.19 INTELDX4 PROCESSOR VOLTAGE DETECT SENSE OUTPUT (VOLDET) ........ , ... 2-156 9.2.20 BOUNDARY SCAN TEST SIGNALS ...................... 2-156 9.3 Interrupt and Non-Maskable Interrupt Interface ...... '" ........ 2-157 9.3.1 INTERRUPT LOGiC ........ 2-157 9.3.~ NMI LOGiC.; ............... 2-158 9.3.3 SMI# LOGiC ............... 2,158 9.3.4 STPCLK# LOGIC .......... 2-158 9.4 Write Buffers .................... 2-159 9.4.1 WRITE BUFFERS AND 1/0 CYCLES ....................... 2-160 9.4.2 WRITE BUFFERS IMPLICATIONS ON LOCKED BUS CyCLES .................. 2-160 2-145 2-5 CONTENTS PAGE 9.5 Reset and Initialization ........... 2-160 PAGE 10.2.8 INVALIDATE CyCLES ..... 2-197 9.5.1 FLOATING POINT REGISTER VALUES ........... 2-160 10.2.9 BUS HOLD ................ 2-199 9.5.2 PIN STATE DURING RESET .................. .'..... 2-161 9.6 Clock Control .................... 2-164 10.2.11 SPECIAL BUS CYCLES .. 2-203 9.6.1 STOP GRANT BUS CYCLE ........................ 2-164 9.6.2 PIN STATE DURING STOP GRANT ........................ 2-164 9.6.3 WRITE-BACK ENHANCED INTELDX2 PIN STATE DURING STOP GRANT SPECiFiCS ..... 2-164 9.6.4 CLOCK CONTROL STATE DIAGRAM ... ; .................. 2-166 9.6.5 WRITE- BACK ENHANCED INTELDX2 PROCESSOR CLOCK CONTROL STATE DIAGRAM .. 2-169 9.6.6 STOP CLOCK SNOOP STATE (CACHE INVALIDATIONS) .............. 2-171 9.6. 7 SUPPLY CURRENT MODEL FOR STOP CLOCK MODES AND TRANSITIONS ................. 2-171 10.0 BUS OPERATION ................ 2-173 10.1 Data Transfer Mechanism ...... 2-173 10.1.1 MEMORY AND I/O SPACES ....................... 2-173 10.1.2.DYNAMIC DATA BUS SIZING ................. : ...... 
10.1.3 INTERFACING WITH 8-, 16- AND 32-BIT MEMORIES
10.1.4 DYNAMIC BUS SIZING DURING CACHE LINE FILLS
10.1.5 OPERAND ALIGNMENT
10.2 Bus Functional Description
10.2.1 NON-CACHEABLE NON-BURST SINGLE CYCLE
10.2.2 MULTIPLE AND BURST CYCLE BUS TRANSFERS
10.2.3 CACHEABLE CYCLES
10.2.4 BURST MODE DETAILS
10.2.5 8- AND 16-BIT CYCLES
10.2.6 LOCKED CYCLES
10.2.7 PSEUDO-LOCKED CYCLES
10.2.10 INTERRUPT ACKNOWLEDGE
10.2.12 BUS CYCLE RESTART
10.2.13 BUS STATES
10.2.14 FLOATING POINT ERROR HANDLING FOR THE INTEL486 DX, INTELDX2, AND INTELDX4 PROCESSORS
10.2.15 INTEL486 DX, INTELDX2, AND INTELDX4 PROCESSORS FLOATING POINT ERROR HANDLING IN AT-COMPATIBLE SYSTEMS
10.3 Enhanced Bus Mode Operation (Write-Back Mode) for the Write-Back Enhanced IntelDX2 Processor
10.3.1 SUMMARY OF BUS DIFFERENCES
10.3.2 BURST CYCLES
10.3.3 CACHE CONSISTENCY CYCLES
10.3.4 LOCKED CYCLES
10.3.5 FLUSH OPERATION
10.3.6 PSEUDO LOCKED CYCLES
11.0 TESTABILITY
11.1 Built-In Self Test (BIST)
11.2 On-Chip Cache Testing
11.2.1 CACHE TESTING REGISTERS TR3, TR4 AND TR5
11.2.2 CACHE TESTING REGISTERS FOR THE INTELDX4 PROCESSOR
11.2.3 CACHE TESTABILITY WRITE
11.2.4 CACHE TESTABILITY READ
11.2.5 FLUSH CACHE
11.2.6 ADDITIONAL CACHE TESTING FEATURES FOR ENHANCED BUS (WRITE-BACK) MODE
11.3 Translation Lookaside Buffer (TLB) Testing
11.3.1 TRANSLATION LOOKASIDE BUFFER ORGANIZATION
11.3.2 TLB TEST REGISTERS TR6 AND TR7
11.3.3 TLB WRITE TEST
11.3.4 TLB LOOKUP TEST
11.4 Tri-State Output Test Mode
11.5 Intel486 Processor Boundary Scan (JTAG)
11.5.1 BOUNDARY SCAN ARCHITECTURE
11.5.2 DATA REGISTERS
11.5.3 INSTRUCTION REGISTER
11.5.4 TEST ACCESS PORT (TAP) CONTROLLER
11.5.5 BOUNDARY SCAN REGISTER BITS AND BIT ORDERS
11.5.6 TAP CONTROLLER INITIALIZATION
11.5.7 BOUNDARY SCAN DESCRIPTION LANGUAGE (BSDL) FILES
12.0 DEBUGGING SUPPORT
12.1 Breakpoint Instruction
12.2 Single-Step Trap
12.3 Debug Registers
12.3.1 LINEAR ADDRESS BREAKPOINT REGISTERS (DR0-DR3)
12.3.2 DEBUG CONTROL REGISTER (DR7)
12.3.3 DEBUG STATUS REGISTER (DR6)
12.3.4 USE OF RESUME FLAG (RF) IN FLAG REGISTER
13.0 INSTRUCTION SET SUMMARY
13.1 Instruction Encoding
13.1.1 OVERVIEW
13.1.2 32-BIT EXTENSIONS OF THE INSTRUCTION SET
13.1.3 ENCODING OF INTEGER INSTRUCTION FIELDS
13.1.4 ENCODING OF FLOATING POINT INSTRUCTION FIELDS
13.2 Clock Count Summary
13.2.1 INSTRUCTION CLOCK COUNT ASSUMPTIONS
14.0 DIFFERENCES BETWEEN INTEL486 PROCESSORS AND INTEL386 PROCESSORS
14.1 Differences between the Intel386 Processor with an Intel387™ Math CoProcessor and Intel486 DX, IntelDX2 and IntelDX4 Processors
15.0 DIFFERENCES BETWEEN THE PGA, SQFP AND PQFP VERSIONS OF THE INTEL486 SX AND INTEL486 DX PROCESSORS
15.1 2X Clock Mode
15.1.1 PIN ASSIGNMENTS
15.1.2 QUICK PIN REFERENCE
15.1.3 CLOCK CONTROL
15.1.4 DC SPECIFICATIONS FOR 2X CLOCK OPTION
15.1.5 AC SPECIFICATIONS FOR 2X CLOCK OPTION
16.0 OverDrive™ PROCESSOR SOCKET
16.1 OverDrive Processor Socket Overview
16.2 OverDrive Processor Circuit Design
16.2.1 BACKWARD COMPATIBILITY
16.3 Socket Layout
16.3.1 MECHANICAL DESIGN CONSIDERATIONS
16.3.2 DESIGN RECOMMENDATIONS
16.3.3 ZIF SOCKET VENDORS
16.4 Thermal Design Considerations
16.4.1 ACTIVE HEATSINK THERMAL DESIGN
16.4.2 PASSIVE HEATSINK THERMAL DESIGN
16.5 BIOS and Software
16.5.1 OVERDRIVE PROCESSOR DETECTION
16.5.2 TIMING DEPENDENT LOOPS
16.6 Test Requirements
16.7 OverDrive Processor Socket Pinout
16.7.1 PIN DESCRIPTION
16.7.2 RESERVED PIN SPECIFICATION
16.7.3 INC "INTERNAL NO CONNECT" PIN SPECIFICATIONS
16.7.4 SHARED WRITE-BACK PINS
16.7.5 PINOUT
16.8 3.3V Socket Specification
16.9 DC/AC Specifications
17.0 ELECTRICAL DATA
17.1 Power and Grounding
17.1.1 POWER CONNECTIONS
17.1.2 INTEL486 PROCESSOR POWER DECOUPLING RECOMMENDATIONS
17.1.3 VCC5 AND VCC POWER SUPPLY REQUIREMENTS FOR THE INTELDX4 PROCESSOR
17.1.4 SYSTEM CLOCK RECOMMENDATIONS
17.1.5 OTHER CONNECTION RECOMMENDATIONS
17.2 Maximum Ratings
17.3 DC Specifications
17.3.1 3.3V DC CHARACTERISTICS
17.3.2 5V DC CHARACTERISTICS
17.3.3 EXTERNAL RESISTORS RECOMMENDED TO MINIMIZE LEAKAGE CURRENTS FOR THE WRITE-BACK ENHANCED INTELDX2 PROCESSOR
17.4 AC Specifications
17.4.1 3.3V AC CHARACTERISTICS
17.4.2 5V AC CHARACTERISTICS
17.5 Capacitive Derating Curves
18.0 MECHANICAL DATA
18.1 Intel486 Processor Package Dimensions
18.1.1 168-PIN PGA PACKAGE
18.1.2 208-LEAD SQFP PACKAGE
18.1.3 196-LEAD PQFP PACKAGE
18.2 Package Thermal Specifications
18.2.1 168-PIN PGA PACKAGE THERMAL CHARACTERISTICS FOR 3.3V INTELDX4 PROCESSOR
18.2.2 168-PIN PGA PACKAGE THERMAL CHARACTERISTICS FOR 5V INTEL486 PROCESSORS
18.2.3 THERMAL SPECIFICATIONS FOR 208-LEAD SQFP PACKAGE
18.2.4 THERMAL SPECIFICATIONS FOR 196-LEAD PQFP PACKAGE
APPENDIX A ADVANCED FEATURES
APPENDIX B FEATURE DETERMINATION
APPENDIX C I/O BUFFER MODELS
APPENDIX D BSDL LISTINGS
APPENDIX E SYSTEM DESIGN NOTES

Intel486™ PROCESSOR FAMILY

1.0 INTRODUCTION

The Intel486 processor family enables a range of low-cost, high-performance entry-level system designs capable of running the entire installed base of DOS, Windows, OS/2, and UNIX applications written for the Intel architecture. This family includes the IntelDX4 processor, the fastest Intel486 processor (up to 50% faster than an IntelDX2 processor). The IntelDX4 processor integrates a 16-Kbyte unified cache and floating point hardware on-chip for improved performance.
The IntelDX2 processor integrates an 8-Kbyte unified cache and floating point hardware on-chip. The IntelDX2 processor is also available with a write-back on-chip cache for improved entry-level performance. The IntelDX4 and IntelDX2 processors use Intel's speed-multiplying technology, allowing the processor core to operate at frequencies higher than the external memory bus. The Intel486 DX processor offers the features of the IntelDX2 processors without speed-multiplying. The Intel486 SX processor offers the features of the Intel486 DX processor without floating point hardware, and the IntelSX2 processor adds speed-multiplying to the Intel486 SX processor.

The entire Intel486 processor family incorporates energy-efficient "SL Technology" for mobile and desktop computing. SL Technology enables desktop system designs that exceed the Environmental Protection Agency's (EPA) Energy Star program guidelines without compromising performance. It also increases system design flexibility and improves battery life in all Intel486 processor-based notebooks. SL Technology allows system designers to differentiate their power management schemes with a variety of energy-efficient or battery-life preserving features.

Intel486 processors provide power management features that are transparent to application and operating system software. Stop Clock, Auto HALT Power Down, and Auto Idle Power Down allow software-transparent control over processor power management.

Equally important is the capability of the processor to manage system power consumption. Intel486 processor System Management Mode (SMM) incorporates a non-maskable System Management Interrupt (SMI#), a corresponding Resume (RSM) instruction and a new memory space for system management code. Intel's SMM ensures seamless power control of the processor core, system logic, main memory, and one or more peripheral devices, in a way that is transparent to any application or operating system.
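The speed-multiplying relationship described above can be illustrated with a short calculation. The sketch below is a minimal illustration only, assuming the core clock is simply the external bus clock times a fixed multiplier. The function name `core_frequency_mhz` and the bus clock and multiplier pairs in the example are hypothetical and are not taken from this data sheet.

```python
def core_frequency_mhz(bus_mhz, multiplier=1):
    """Return the internal core clock (MHz) for a given external bus clock.

    Assumption (not from the data sheet): core clock = bus clock x multiplier.
    A multiplier of 1 models a processor without speed-multiplying.
    """
    if bus_mhz <= 0:
        raise ValueError("bus frequency must be positive")
    if multiplier < 1:
        raise ValueError("multiplier must be at least 1")
    return bus_mhz * multiplier


if __name__ == "__main__":
    # Illustrative (hypothetical) bus clock / multiplier pairs.
    for bus, mult in [(33, 1), (33, 2), (25, 3)]:
        core = core_frequency_mhz(bus, mult)
        print(f"bus {bus} MHz x {mult} -> core {core} MHz")
```

Under this assumption the external bus clock stays fixed while only the core clock scales, which is consistent with the statement that the processor can run faster than the external memory bus.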
Intel486 processors are available in a full range of speeds (25 MHz to 100 MHz), packages (PGA, SQFP, PQFP), and voltages (5V, 3.3V) to meet any system design requirement.

1.1 Processor Features

All of the Intel486 processors consist of a 32-bit integer processing unit, an on-chip cache, and a memory management unit. This ensures full binary compatibility with the 8086, 8088, 80186, 80286, Intel386™ SX, Intel386 DX, and all versions of Intel486 processors. All of the Intel486 processors offer the following features:

• 32-bit RISC Integer Core - The Intel486 processor performs a complete set of arithmetic and logical operations on 8-, 16-, and 32-bit data types using a full-width ALU and eight general purpose registers.
• Single Cycle Execution - Many instructions execute in a single clock cycle.
• Instruction Pipelining - The fetching, decoding, address translation and execution of instructions are overlapped within the Intel486 processor.
• On-Chip Floating Point Unit - Intel486 processors support the 32-, 64-, and 80-bit formats specified in IEEE standard 754. The unit is binary compatible with the 8087, Intel287™, Intel387™ coprocessors, and Intel OverDrive™ processor.
• On-Chip Cache with Cache Consistency Support - An 8-Kbyte (16-Kbyte on the IntelDX4 processor) internal cache is used for both data and instructions. Cache hits provide zero wait-state access times for data within the cache. Bus activity is tracked to detect alterations in the memory represented by the internal cache. The internal cache can be invalidated or flushed so that an external cache controller can maintain cache consistency.
• External Cache Control - Write-back and flush controls for an external cache are provided so the processor can maintain cache consistency.
• On-Chip Memory Management Unit - Address management and memory space protection mechanisms maintain the integrity of memory in a multitasking and virtual memory environment. Both segmentation and paging are supported.
• Burst Cycles - Burst transfers allow a new double word to be read from memory on each bus clock cycle. This capability is especially useful for instruction prefetch and for filling the internal cache.
• Write Buffers - The processor contains four write buffers to enhance the performance of consecutive writes to memory. The processor can continue internal operations after a write to these buffers, without waiting for the write to be completed on the external bus.
• Bus Backoff - If another bus master needs control of the bus during a processor-initiated bus cycle, the Intel486 processor will float its bus signals, then restart the cycle when the bus becomes available again.
• Instruction Restart - Programs can continue execution following an exception generated by an unsuccessful attempt to access memory. This feature is important for supporting demand-paged virtual memory applications.
• Dynamic Bus Sizing - External controllers can dynamically alter the effective width of the data bus. Bus widths of 8, 16, or 32 bits can be used.
• Boundary Scan (JTAG) - Boundary Scan provides in-circuit testing of components on printed circuit boards. The Intel Boundary Scan implementation conforms with the IEEE Standard Test Access Port and Boundary Scan Architecture.

SL Technology provides the following features:

• Intel System Management Mode - A unique Intel architecture operating mode provides a dedicated special purpose interrupt and address space that can be used to implement intelligent power management and other enhanced functions in a manner that is completely transparent to the operating system and applications software.
• I/O Restart - An I/O instruction interrupted by a System Management Interrupt (SMI#) can automatically be restarted following the execution of the RSM instruction.
• Stop Clock - The Intel486 processor has a stop clock control mechanism that provides two low-power states: a "fast wake-up" Stop Grant state (~20 mA to 100 mA) and a "slow wake-up" Stop Clock state with CLK frequency at 0 MHz (100 µA to 1000 µA).
• Auto HALT Power Down - After the execution of a HALT instruction, the Intel486 processor issues a normal Halt bus cycle and the clock input to the Intel486 processor core is automatically stopped, causing the processor to enter the Auto HALT Power Down state (~20 mA to 100 mA).
• Upgrade Power Down Mode - When an Intel486 processor upgrade is installed, the upgrade power down mode detects the presence of the upgrade, powers down the core, and tri-states all outputs of the original processor, so the Intel486 processor enters a very low current mode.
• Auto Idle Power Down - This function allows the processor to reduce the core frequency to the bus frequency when both the core and bus are idle. Auto Idle Power Down is software transparent and does not affect processor performance. Auto Idle Power Down provides an average power savings of 10% and is only applicable to clock multiplied processors.

Enhanced Bus Mode Features (for the Write-Back Enhanced IntelDX2 Processor only):

• Write Back Internal Cache - The Write-Back Enhanced IntelDX2 processor adds write-back support to the 8-Kbyte unified cache. The on-chip cache is configurable to be write-back or write-through on a line by line basis. The internal cache implements a modified MESI protocol, which is most applicable to uniprocessor systems.
• Enhanced Bus Mode - The definitions of some signals have been changed to support the new Enhanced Bus mode (write-back mode).
• Write Bursting - Data written from the processor to memory can be bursted (zero wait state transfer).

1.2 Intel486™ Processor Product Family

Table 1-1 shows the Intel486 processors available by Clock Mode, Supply Voltage, Maximum Frequency, and Package. An individual product will have either a 1X clock or a 2X clock, but not both. Likewise, an individual product will have either a 5V supply voltage or a 3.3V supply voltage, but not both. Please contact Intel for the latest product availability and specifications.

Table 1-1. Product Options

Columns: Intel486™ Processors; VCC; Processor Frequency (MHz): 25, 33, 40, 50, 66, 75, 100; Package: 168-Pin PGA, 208-Lead SQFP, 196-Lead PQFP

1X Clock
  Intel486 SX Processor: 3.3V, 5V
  IntelSX2™ Processor: 5V
  Intel486 DX Processor: 3.3V, 5V(1)
  IntelDX2™ Processor: 3.3V, 5V
  Write-Back Enhanced IntelDX2 Processor: 3.3V, 5V
  IntelDX4™ Processor: 3.3V
2X Clock
  Intel486 SX Processor(2): 3.3V, 5V
  Intel486 DX Processor(2): 3.3V, 5V

NOTES:
1. The 5V 33-MHz Intel486 DX processor is available in 168-pin PGA and 196-lead PQFP packages. The 5V 50-MHz Intel486 DX processor is available in a 168-pin PGA package only.
2. With the addition of SL Technology to the Intel486 processor family, the Low Power Intel486 SX and Low Power Intel486 DX processors have been superseded by the 3.3V Intel486 processors described in this document.

2.0 HOW TO USE THIS DOCUMENT

2.1 Introduction

This data sheet is a compilation of previously published individual data sheets for the Intel486 SX, IntelSX2, Intel486 DX, IntelDX2 and IntelDX4 processors. With the addition of the Write-Back Enhanced IntelDX2 and information previously published for the introduction of the SL Enhanced Intel486 processors, this data sheet encompasses the entire current Intel486 processor family.

This data sheet describes the Intel486 processor architecture, features and technical details. Unless otherwise stated, any description for the Intel486 processor listed in this data sheet applies to all Intel486 processors. Where architectural or other differences do occur (for example, the IntelDX4 processor has a 16-Kbyte on-chip cache, while all other Intel486 processors have an 8-Kbyte on-chip cache), these differences are described in separate sections. Section 2.2 provides a brief section description, highlighting the specific sections that contain processor-unique information.

This data sheet does not detail the Intel486 SL processor, the Low Power Intel486 SX or the Low Power Intel486 DX directly. The Low Power Intel486 processors have been superseded by current versions of the Intel486 processors.

It is important to note that all Intel486 DX, IntelDX2, and IntelDX4 processors have an on-chip floating point unit. The Intel486 SX and IntelSX2 processors do not have on-chip floating point and do not provide FERR# and IGNNE#, the floating point error reporting signals.

The 5V 50-MHz Intel486 DX processor does not implement SL Technology and does not contain the following pins: SMIACT#, SRESET, SMI#, STPCLK#, and UP#.

Boundary Scan (JTAG) testability features, capability and associated test signals (TCK, TMS, TDI, and TDO) are standard on all Intel486 processors except the Intel486 SX processors in the 168-pin PGA package.

2.2 Section Contents and Processor Specific Information

The following is a brief description of the contents of each section:

Section 1: "Introduction." This section is an overview of the current Intel486 processor family, product features and highlights. This section also lists product frequency, voltage and package offerings.

Section 2: "How to Use This Document." This section presents information to aid in the use of this data sheet.

Section 3: "Pin Description." This section contains all of the pin configurations for the various package options (168-Pin PGA, 208-Lead SQFP, and 196-Lead PQFP), package diagrams, pin assignment tables and pin assignment differences for the various processors within a package class. The 168-Pin PGA and 208-Lead SQFP package diagrams shown are for the IntelDX2, Write-Back Enhanced IntelDX2 and IntelDX4 processors, with differences for other members of the Intel486 processor family listed in separate tables. The 196-Lead PQFP package diagram is for the Intel486 DX processor. Differences for the Intel486 SX processor in the 196-Lead package are also listed in a separate table. This section also provides a quick pin reference table that lists pin signals for the Intel486 processor family. The table, whenever necessary, has sections applicable to each current Intel486 processor family member.

Section 4: "Architectural Overview." This section describes the Intel486 processor architecture, including the register and instruction sets, memory organization, data types and formats, and interrupts for all Intel486 processors. The architectural overview describes the 32-bit RISC integer core of the Intel486 processor. The on-chip floating point unit for the Intel486 DX, IntelDX2 and IntelDX4 processors is included in this section. Operational differences for the Intel486 SX and IntelSX2 processors (i.e., processors that do not contain on-chip floating point units) are also described in detail.

Section 5: "Real Mode Architecture." This section describes the Intel486 processor real-mode architecture, including memory addressing, reserved locations, interrupts, and Shutdown and HALT. This section applies to all Intel486 processors.

Section 6: "Protected Mode Architecture." This section describes the Intel486 protected-mode architecture, including addressing mechanism, segmentation, protection, paging and virtual 8086 environment. This section applies to all Intel486 processors.

Section 7: "On-Chip Cache." This section describes the on-chip cache of the Intel486 processors.
Specific information on size, features, modes, and configurations is described. The differences between the IntelDX4 processor on-chip cache (16-Kbyte) and the on-chip cache of other members of the Intel486 processor family (8-Kbyte) are detailed. This section also documents features, modes and operational issues specific to the Write-Back Enhanced IntelDX2 processor. The specifics for the Write-Back Enhanced IntelDX2 are interleaved with sections on the Standard mode (write-through) cache of other Intel486 processors as appropriate.

Section 8: "System Management Mode (SMM) Architectures." This section describes the System Management Mode architecture of the Intel486 processors, including system management mode interrupt processing and programming mode. Information specific to the Write-Back Enhanced IntelDX2 processor is listed in the appropriate sections. This section applies to all Intel486 processors except the 50-MHz Intel486 DX processor, which does not implement SL Technology.

Section 9: "Hardware Interface." This section describes the hardware interface of the current Intel486 processor family, including signal descriptions, interrupt interfaces, write buffers, reset and initialization, and clock control. The IntelDX4 processor speed multiplying options are detailed in this section. The Write-Back Enhanced IntelDX2 processor signals (both new pins and those which have different operational functions) are detailed in this section. Reset and initialization, as it applies to all of the Intel486 processor family, is also documented here. Use and operation of the Stop Clock, Auto HALT Power Down and other power-saving SL Technology features are described. Information specific to the Write-Back Enhanced IntelDX2 processor is also documented whenever appropriate.

Section 10: "Bus Operation." This section describes the Intel486 processor bus operation, including the data transfer mechanism and bus functional description.
When in Standard Bus mode, the Write-Back Enhanced IntelDX2 processor bus operation is the same as other members of the Intel486 processor family. Information specific to the Write-Back Enhanced IntelDX2 processor in Enhanced Bus mode is detailed in a separate section for ease of use.

Section 11: "Testability." This section describes the testability of the Intel486 processors, including the built-in self test (BIST), on-chip cache testing, translation lookaside buffer (TLB) testing, tri-state output test mode, and boundary scan (JTAG). Both the Write-Back Enhanced IntelDX2 and the IntelDX4 processors have unique cache structures that alter testing in comparison to other members of the current Intel486 processor family. These processor-specific differences are documented in this section. A complete listing of Boundary Scan ID Codes and Boundary Scan Register Bit orders is also included.

Section 12: "Debugging Support." This section describes the Intel486 processor debugging support, including the breakpoint instruction, single-step trap and debug registers. This section applies to all Intel486 processors.

Section 13: "Instruction Set Summary." This section provides clock count and instruction encoding summaries for all the Intel486 processors.

Section 14: "Differences between Intel486 Processors and Intel386™ Processors." This section lists the differences between the Intel486 processor family and the Intel386 processor family. Also described and documented are differences between the Intel386 processor with an Intel387™ math coprocessor and the Intel486 processors with on-chip floating point units. This section applies to all Intel486 processors.

Section 15: "Differences between the PGA, SQFP and PQFP Versions of the Intel486 SX and Intel486 DX Processors." The Low Power Intel486 SX and Intel486 DX processors have been superseded by the current Intel486 processors.
This section lists the differences between the current Intel486 SX and Intel486 DX products offered in PGA, SQFP and PQFP packages. The current Intel486 SX and Intel486 DX processors in the PQFP package can operate in 2X clock mode, which is described in detail here. Electrical specifications for the Intel486 SX and Intel486 DX processors in 2X Clock mode are listed in this section.

Section 16: "OverDrive™ Processor Socket." This section describes the OverDrive processor socket requirements for end-user upgradability of the Intel486 processor family. This section applies to all Intel486 processors.

Section 17: "Electrical Data." This section lists the AC and DC specifications for all Intel486 processors. Processor specific information is listed in both common and separate tables and sections as appropriate.

Section 18: "Mechanical Data." This section lists the mechanical and thermal data, including the package specifications (PGA, SQFP and PQFP) for all Intel486 processors. Processor specific information is listed in both common and separate tables and sections as appropriate.

Appendix A: "Advanced Features." This section documents the advanced features of the Intel486 processor family not covered in other sections of this data sheet.

Appendix B: "Features Determination." This section documents the CPUID function to determine the Intel486 processor family identification and processor specific information. This section applies to all Intel486 processors.

Appendix C: "IBIS Models." This section provides a detailed sample listing of the types of I/O buffer modeling information available for the Intel486 processor family. This section applies to all Intel486 processors.

Appendix D: "BSDL Listing." This section provides a sample listing of a BSDL file for the Intel486 processor family. This section applies to all Intel486 processors.

Appendix E: "System Design Notes."
This section provides design notes applicable to the use of System Management Mode and SMM routines with the Intel486 processor. This section applies to all Intel486 processors, except the 50-MHz Intel486 DX processor.

2.3 Documents Replaced by This Data Sheet

This data sheet contains all of the latest information for the Intel486 processor family and replaces the following documentation:

SL Enhanced Intel486™ Microprocessor Data Sheet Addendum, Order No. 241696
Intel486™ SX Microprocessor Data Book, Order No. 240950
IntelSX2™ Microprocessor Data Sheet, Order No. 241966
Intel486™ DX Microprocessor Data Book, Order No. 240440
Intel486™ DX2 Microprocessor Data Book, Order No. 241245
IntelDX4™ Microprocessor Data Book, Order No. 241944
Intel486™ Family of Microprocessors Low Power Version Data Sheet, Order No. 241199

3.0 PIN DESCRIPTION

3.1 Pin Assignments

The following figures show the pin assignments of each package type for the Intel486 processor product family. Tables are provided showing the pin differences between previous Intel486 processor products and the current Intel486 processor products.

168-Pin PGA - Pin Grid Array
• Package Diagram
• Pin Assignment Difference Table
• Pin Cross Reference by Pin Name

208-Lead SQFP - Quad Flat Pack
• Package Diagram
• Pin Assignment Difference Table
• Pin Assignment Table in numerical order

196-Lead PQFP - Plastic Quad Flat Pack
• Package Diagram
• Pin Assignment Difference Table
• Pin Assignment Table in numerical order

Figure 3-1. Package Diagram for 168-Pin PGA Package of the IntelDX2™ Processor (pin side view)

Figure 3-2. Package Diagram for 168-Pin PGA Package of the Write-Back Enhanced IntelDX2™ Processor (pin side view)

Figure 3-3. 168-Pin PGA Pinout Diagram (Pin Side) for the IntelDX4™ Processor

Table 3-1.
Pinout Differences for 168-Pin PGA Package

Columns (left to right): Pin; Previous Intel486™ SX Processor(7); Intel486 SX Processor; IntelSX2™ Processor; Previous Intel486 DX Processor(7); Intel486 DX Processor; Previous IntelDX2™ Processor(7); IntelDX2 Processor; Write-Back Enhanced IntelDX2 Processor; IntelDX4™ Processor

A3   NC(1)  NC  TCK  NC  TCK(4)  TCK  TCK(5)  TCK  TCK
A10  INC(2)  INC  INC  INC  INC  INC  INC  INV(6)  INC
A12  INC  INC  INC  INC  INC  INC  INC  HITM#(6)  INC
A13  NC  INC  INC  NC  INC  NC  INC  INC
A14  NC  NC  TDI  NC  TDI(4)  TDI  TDI(5)  TDI  TDI
A15  NMI  NMI  INC  IGNNE#  IGNNE#  IGNNE#  IGNNE#  IGNNE#
B10  INC  SMI#  SMI#  INC  SMI#  INC  SMI#  SMI#
B12  INC  INC  INC  INC  INC  INC  INC  CACHE#(6)  INC
B13  INC  INC  INC  INC  INC  INC  INC  WB/WT#(6)  INC
B14  NC  NC  TMS  NC  TMS(4)  TMS  TMS(5)  TMS  TMS
B15  INC  INC  NMI  NMI  NMI  NMI  NMI  NMI
B16  NC  NC  TDO(5)  NC  TDO(4)  TDO  TDO(5)  TDO  TDO
C10  INC  SRESET  SRESET  INC  SRESET  INC  SRESET  SRESET
C11  INC  UP#  UP#  INC  UP#  UP#  UP#  UP#
C12  INC  SMIACT#  SMIACT#  INC  SMIACT#  INC  SMIACT#  SMIACT#
C14  INC  INC  INC  FERR#  FERR#  FERR#  FERR#  FERR#
G15  INC  INC  STPCLK#  INC  STPCLK#  STPCLK#  STPCLK#  STPCLK#
J1   VCC(3)  VCC  VCC  VCC  VCC  VCC  VCC  VCC5
R17  INC  INC  INC  INC  INC  INC  INC  CLKMUL
S4   NC  NC  NC  NC  NC  NC  NC  VOLDET

NOTES:
1. NC. Do Not Connect. These pins should always remain unconnected. Connection of NC pins to VCC or VSS or to any other signal can result in component malfunction or incompatibility with future steppings of the Intel486 processors.
2. INC. Internal No Connect. These pins are not connected to any internal pad in Intel486 processors and OverDrive™ processors. However, new signals are defined for the location of the INC pins in the Intel486 processor proliferations. All INC pins defined by Intel have a specific use for jumperless single socket compatibility with current and future processors. A system design could connect any signal to an INC pin without affecting the operation of the processor. However, the purpose of a specific INC pin should be understood before it is used.
If not, the system design will sacrifice the ability to implement a jumperless (single socket) flexible motherboard.
3. This pin location is for the VCC5 pin on the IntelDX4 processor. For compatibility with 3.3V processors that have 5V safe input buffers (i.e., IntelDX4 processors), this pin should be connected to a Vcc trace, not to the Vcc plane. See section 3.2, "Quick Pin Reference," for a description of the VCC5 pin on the IntelDX4 processor.
4. These pins were only available on previous 50-MHz Intel486 DX processors. These pins are now on all speeds of the Intel486 DX processor.
5. These pins were No Connects on previous Intel486 DX and IntelDX2 processors. For compatibility with old designs, they can still be left unconnected.
6. These pins are used on the Write-Back Enhanced IntelDX2 processor only.
7. Previous versions of the Intel486 processor family do not implement SL Technology and are not described in this data sheet.

Table 3-2. Pin Cross Reference for 168-Pin PGA Package of the IntelDX2™ Processor

Address        Data           Control
A2    Q14      D0    P1       A20M#     D15
A3    R15      D1    N2       ADS#      S17
A4    S16      D2    N1       AHOLD     A17
A5    Q12      D3    H2       BE0#      K15
A6    S15      D4    M3       BE1#      J16
A7    Q13      D5    J2       BE2#      J15
A8    R13      D6    L2       BE3#      F17
A9    Q11      D7    L3       BLAST#    R16
A10   S13      D8    F2       BOFF#     D17
A11   R12      D9    D1       BRDY#     H15
A12   S7       D10   E3       BREQ      Q15
A13   Q10      D11   C1       BS8#      D16
A14   S5       D12   G3       BS16#     C17
A15   R7       D13   D2       CLK       C3
A16   Q9       D14   K3       D/C#      M15
A17   Q3       D15   F3       DP0       N3
A18   R5       D16   J3       DP1       F1
A19   Q4       D17   D3       DP2       H3
A20   Q8       D18   C2       DP3       A5
A21   Q5       D19   B1       EADS#     B17
A22   Q7       D20   A1       FERR#     C14
A23   S3       D21   B2       FLUSH#    C15
A24   Q6       D22   A2       HLDA      P15
A25   R2       D23   A4       HOLD      E15
A26   S2       D24   A6       IGNNE#    A15
A27   S1       D25   B6       INTR      A16
A28   R1       D26   C7       KEN#      F15
A29   P2       D27   C6       LOCK#     N15
A30   P3       D28   C8       M/IO#     N16
A31   Q1       D29   A8       NMI       B15
               D30   C9       PCD       J17
               D31   B8       PCHK#     Q17
                              PWT       L15
                              PLOCK#    Q16
                              RDY#      F16
                              RESET     C16
                              SMI#      B10
                              SMIACT#   C12
                              UP#       C11
                              W/R#      N17
                              STPCLK#   G15
                              SRESET    C10
                              TCK       A3(3)
                              TDI       A14(3)
                              TDO       B16(3)
                              TMS       B14(3)

INC(1): A10, A12, A13, B12, B13, R17
Vcc:    B7, B9, B11, C4, C5, E2, E16, G2, G16, H16, J1, K2, K16, L16, M2, M16, P16, R3, R6, R8, R9, R10, R11, R14
Vss:    A7, A9, A11, B3, B4, B5, E1, E17, G1, G17, H1, H17, K1, K17, L1, L17, M1, M17, P17, Q2, R4, S6, S8, S9, S10, S11, S12, S14
NC(2):  C13, S4

NOTES:
1. INC. Internal No Connect. These pins are not connected to any internal pad in Intel486™ processors and OverDrive™ processors. However, new signals are defined for the location of the INC pins in the Intel486 processor proliferations. All INC pins defined by Intel have a specific use for jumperless single socket compatibility with current and future processors. A system design could connect any signal to an INC pin without affecting the operation of the processor. However, the purpose of a specific INC pin should be understood before it is used. If not, the system design will sacrifice the ability to implement a jumperless (single socket) flexible motherboard.
2. NC. Do Not Connect. These pins should always remain unconnected. Connection of NC pins to Vcc or Vss or to any other signal can result in component malfunction or incompatibility with future steppings of the Intel486 processors.
3. Boundary Scan pins are not included on the 168-pin PGA package version of the Intel486 SX processor.
[Figure 3-4. Package Diagram for 208-Lead SQFP of the IntelDX2™ Processor (top view). Pin 3: see Note 1 for Table 3-3.]

[Figure 3-5. Package Diagram for 208-Lead SQFP of the Write-Back Enhanced IntelDX2™ Processor (top view).]

Table 3-3.
Pinout Differences for 208-Lead SQFP Package

Pin #   Intel486™ SX   Intel486 DX   IntelDX2™   Write-Back Enhanced   IntelDX4™
        Processor      Processor     Processor   IntelDX2 Processor    Processor
3       Vcc(1)         Vcc           Vcc         Vcc                   VCC5
11      INC(2)         INC           INC         INC                   CLKMUL
63      INC            INC           INC         HITM#                 INC
64      INC            INC           INC         WB/WT#                INC
66      INC            FERR#         FERR#       FERR#                 FERR#
70      INC            INC           INC         CACHE#                INC
71      INC            INC           INC         INV                   INC
72      INC            IGNNE#        IGNNE#      IGNNE#                IGNNE#

NOTES:
1. This pin location is for the VCC5 pin on the IntelDX4 processor. For compatibility with 3.3V processors that have 5V safe input buffers (i.e., IntelDX4 processors), this pin should be connected to a Vcc trace, not to the Vcc plane. See section 3.2, "Quick Pin Reference," for a description of the VCC5 pin on the IntelDX4 processor.
2. INC. Internal No Connect. These pins are not connected to any internal pad in Intel486 processors and OverDrive™ processors. However, new signals are defined for the location of the INC pins in the Intel486 processor proliferations. All INC pins defined by Intel have a specific use for jumperless single socket compatibility with current and future processors. A system design could connect any signal to an INC pin without affecting the operation of the processor. However, the purpose of a specific INC pin should be understood before it is used. If not, the system design will sacrifice the ability to implement a jumperless (single socket) flexible motherboard.
3. NC. Do Not Connect. These pins should always remain unconnected. Connection of NC pins to Vcc or Vss or to any other signal can result in component malfunction or incompatibility with future steppings of the Intel486 processors.

Table 3-4.
Pin Assignment for 208-Lead SQFP Package of the IntelDX2™ Processor

Pin#  Description    Pin#  Description    Pin#  Description    Pin#  Description
1     Vss            53    Vss            105   Vss            157   Vss
2     Vcc            54    Vcc            106   Vcc            158   A24
3     Vcc(1)         55    Vss            107   Vss            159   A23
4     PCHK#          56    Vcc            108   D16            160   A22
5     BRDY#          57    Vss            109   DP2            161   A21
6     BOFF#          58    SRESET         110   Vss            162   Vcc
7     BS16#          59    SMIACT#        111   Vcc            163   Vcc
8     BS8#           60    Vcc            112   D15            164   A20
9     Vcc            61    Vss            113   D14            165   A19
10    Vss            62    Vcc            114   Vcc            166   A18
11    INC(2)         63    INC            115   Vss            167   TMS
12    RDY#           64    INC            116   D13            168   TDI
13    KEN#           65    SMI#           117   D12            169   Vcc
14    Vcc            66    FERR#          118   D11            170   Vss
15    Vss            67    NC(3)          119   D10            171   A17
16    HOLD           68    TDO            120   Vss            172   Vcc
17    AHOLD          69    Vcc            121   Vcc            173   A16
18    TCK            70    INC            122   Vss            174   A15
19    Vcc            71    INC            123   D9             175   Vss
20    Vcc            72    IGNNE#         124   D8             176   Vcc
21    Vss            73    STPCLK#        125   DP1            177   A14
22    Vcc            74    D31            126   D7             178   A13
23    Vcc            75    D30            127   NC             179   Vcc
24    CLK            76    Vss            128   Vcc            180   A12
25    Vcc            77    Vcc            129   D6             181   Vss
26    HLDA           78    D29            130   D5             182   A11
27    W/R#           79    D28            131   Vcc            183   Vcc
28    Vss            80    Vcc            132   Vss            184   Vss
29    Vcc            81    Vss            133   Vcc            185   Vcc
30    BREQ           82    Vcc            134   Vcc            186   A10
31    BE0#           83    D27            135   Vss            187   A9
32    BE1#           84    D26            136   Vcc            188   Vcc
33    BE2#           85    D25            137   Vcc            189   Vss
34    BE3#           86    Vcc            138   Vss            190   A8
35    Vcc            87    D24            139   Vcc            191   Vcc
36    Vss            88    Vss            140   D4             192   A7
37    M/IO#          89    Vcc            141   D3             193   A6
38    Vcc            90    DP3            142   D2             194   UP#
39    D/C#           91    D23            143   D1             195   A5
40    PWT            92    D22            144   D0             196   A4
41    PCD            93    D21            145   DP0            197   A3
42    Vcc            94    Vss            146   Vss            198   Vcc
43    Vss            95    Vcc            147   A31            199   Vss
44    Vcc            96    NC             148   A30            200   Vcc
45    Vcc            97    Vss            149   A29            201   Vss
46    EADS#          98    Vcc            150   Vcc            202   A2
47    A20M#          99    D20            151   A28            203   ADS#
48    RESET          100   D19            152   A27            204   BLAST#
49    FLUSH#         101   D18            153   A26            205   Vcc
50    INTR           102   Vcc            154   A25            206   PLOCK#
51    NMI            103   D17            155   Vcc            207   LOCK#
52    Vss            104   Vss            156   Vss            208   Vss

NOTES:
1.
This pin location is for the VCC5 pin on the IntelDX4™ processor. For compatibility with 3.3V processors that have 5V safe input buffers (i.e., IntelDX4 processors), this pin should be connected to a Vcc trace, not to the Vcc plane. See section 3.2, "Quick Pin Reference," for a description of the VCC5 pin on the IntelDX4 processor.
2. INC. Internal No Connect. These pins are not connected to any internal pad in Intel486 processors and OverDrive processors. However, new signals are defined for the location of the INC pins in the Intel486 processor proliferations. All INC pins defined by Intel have a specific use for jumperless single socket compatibility with current and future processors. A system design could connect any signal to an INC pin without affecting the operation of the processor. However, the purpose of a specific INC pin should be understood before it is used. If not, the system design will sacrifice the ability to implement a jumperless (single socket) flexible motherboard.
3. NC. Do Not Connect. These pins should always remain unconnected. Connection of NC pins to Vcc or Vss or to any other signal can result in component malfunction or incompatibility with future steppings of the Intel486 processors.

[Figure 3-7. Package Diagram for 196-Lead PQFP Package of the Intel486™ DX Processor (top view).]

Table 3-5.
Pinout Differences for 196-Lead PQFP Package

Column key:
  A = Previous Intel486™ SX Processor(3)    D = Previous Intel486 DX Processor(3)
  B = Low Power Intel486 SX Processor       E = Low Power Intel486 DX Processor
  C = Intel486 SX Processor                 F = Intel486 DX Processor

Pin#   A        B        C         D        E        F
75     INC(1)   INC      STPCLK#   INC      INC      STPCLK#
77     NC(2)    INC      INC       IGNNE#   IGNNE#   IGNNE#
81     NC       INC      INC       FERR#    FERR#    FERR#
85     INC      INC      SMI#      INC      INC      SMI#
92     INC      INC      SMIACT#   INC      INC      SMIACT#
94     INC      INC      SRESET    INC      INC      SRESET
127    INC      CLKSEL   NC        NC       CLKSEL   NC

NOTES:
1. INC. Internal No Connect. These pins are not connected to any internal pad in Intel486™ processors and OverDrive™ processors. However, new signals are defined for the location of the INC pins in the Intel486 processor proliferations. All INC pins defined by Intel have a specific use for jumperless single socket compatibility with current and future processors. A system design could connect any signal to an INC pin without affecting the operation of the processor. However, the purpose of a specific INC pin should be understood before it is used. If not, the system design will sacrifice the ability to implement a jumperless (single socket) flexible motherboard.
2. NC. Do Not Connect. These pins should always remain unconnected. Connection of NC pins to Vcc or Vss or to any other signal can result in component malfunction or incompatibility with future steppings of the Intel486 processors.
3. Previous versions of the Intel486 processor family do not implement SL Technology and are not described in this data sheet.

Table 3-6.
Pin Assignments for Intel486™ DX Processor 196-Lead PQFP Package

Pin#  Description    Pin#  Description    Pin#  Description    Pin#  Description
1     Vss            50    Vss            99    Vss            148   Vss
2     A21            51    D21            100   NMI            149   NC
3     A22            52    NC             101   INTR           150   A3
4     A23            53    D22            102   FLUSH#         151   NC
5     A24            54    Vcc            103   RESET          152   A4
6     Vcc            55    D23            104   A20M#          153   NC
7     A25            56    NC             105   EADS#          154   A5
8     A26            57    DP3            106   PCD            155   NC
9     A27            58    Vss            107   Vcc            156   UP#
10    A28            59    D24            108   PWT            157   NC
11    Vss            60    NC             109   Vss            158   A6
12    A29            61    D25            110   D/C#           159   A7
13    A30            62    Vcc            111   M/IO#          160   NC
14    A31            63    D26            112   Vcc            161   A8
15    NC             64    NC             113   BE3#           162   NC
16    DP0            65    D27            114   Vss            163   A9
17    D0             66    Vss            115   BE2#           164   Vcc
18    D1             67    D28            116   BE1#           165   A10
19    Vcc            68    NC             117   BE0#           166   NC
20    D2             69    D29            118   BREQ           167   Vss
21    Vss            70    Vcc            119   Vcc            168   Vss
22    Vss            71    D30            120   W/R#           169   NC
23    D3             72    NC             121   Vss            170   Vcc
24    Vcc            73    NC             122   HLDA           171   NC
25    D4             74    D31            123   CLK            172   A11
26    D5             75    STPCLK#        124   NC             173   NC
27    D6             76    NC             125   Vcc            174   A12
28    Vcc            77    IGNNE#         126   Vss            175   Vcc
29    D7             78    NC             127   NC             176   A13
30    DP1            79    NC             128   TCK            177   Vss
31    D8             80    TDO            129   AHOLD          178   A14
32    D9             81    FERR#          130   HOLD           179   Vcc
33    Vss            82    NC             131   Vcc            180   A15
34    NC             83    NC             132   KEN#           181   A16
35    D10            84    Vcc            133   RDY#           182   Vss
36    Vcc            85    SMI#           134   NC             183   A17
37    D11            86    Vss            135   BS8#           184   Vcc
38    D12            87    NC             136   BS16#          185   TDI
39    D13            88    NC             137   BOFF#          186   NC
40    Vss            89    NC             138   BRDY#          187   TMS
41    D14            90    NC             139   PCHK#          188   NC
42    D15            91    NC             140   NC             189   A18
43    DP2            92    SMIACT#        141   Vss            190   NC
44    D16            93    Vcc            142   LOCK#          191   A19
45    D17            94    SRESET         143   PLOCK#         192   NC
46    D18            95    Vss            144   BLAST#         193   A20
47    D19            96    Vss            145   ADS#           194   Vss
48    D20            97    NC             146   A2             195   NC
49    Vcc            98    Vcc            147   Vcc            196   Vcc

3.2 Quick Pin Reference

The following is a brief pin description. For detailed signal descriptions refer to section 9.2, "Signal Description."

Table 3-7.
Intel486™ Processor Pin Descriptions

Symbol  Type  Name and Function

CLK  I  Clock provides the fundamental timing and the internal operating frequency for the Intel486 processor. All external timing parameters are specified with respect to the rising edge of CLK.

ADDRESS BUS

A31-A4  I/O
A2-A3   O
The Address Lines, A31-A2, together with the byte enable signals, BE0#-BE3#, define the physical area of memory or input/output space accessed. Address lines A31-A4 are used to drive addresses into the processor to perform cache line invalidations. Input signals must meet setup and hold times t22 and t23. A31-A2 are not driven during bus or address hold.

BE0-3#  O  The Byte Enable signals indicate active bytes during read and write cycles. During the first cycle of a cache fill, the external system should assume that all byte enables are active. BE3# applies to D24-D31, BE2# applies to D16-D23, BE1# applies to D8-D15 and BE0# applies to D0-D7. BE0#-BE3# are active LOW and are not driven during bus hold.

DATA BUS

D31-D0  I/O  The Data Lines, D0-D7, define the least significant byte of the data bus while lines D24-D31 define the most significant byte of the data bus. These signals must meet setup and hold times t22 and t23 for proper operation on reads. These pins are driven during the second and subsequent clocks of write cycles.

DATA PARITY

DP0-DP3  I/O  There is one Data Parity pin for each byte of the data bus. Data parity is generated on all write data cycles with the same timing as the data driven by the Intel486 processor. Even parity information must be driven back into the processor on the data parity pins with the same timing as read information to ensure that the correct parity check status is indicated by the Intel486 processor. The signals read on these pins do not affect program execution. Input signals must meet setup and hold times t22 and t23. DP0-DP3 should be connected to Vcc through a pull-up resistor in systems that do not use parity.
DP0-DP3 are active HIGH and are driven during the second and subsequent clocks of write cycles.

PCHK#  O  Parity Status is driven on the PCHK# pin the clock after ready for read operations. The parity status is for data sampled at the end of the previous clock. A parity error is indicated by PCHK# being LOW. Parity status is only checked for enabled bytes as indicated by the byte enable and bus size signals. PCHK# is valid only in the clock immediately after read data is returned to the processor. At all other times PCHK# is inactive (HIGH). PCHK# is never floated.

BUS CYCLE DEFINITION

M/IO#  O
D/C#   O
W/R#   O
The memory/input-output, data/control and write/read lines are the primary bus definition signals. These signals are driven valid as the ADS# signal is asserted.

M/IO#  D/C#  W/R#  Bus Cycle Initiated
0      0     0     Interrupt Acknowledge
0      0     1     Halt/Special Cycle
0      1     0     I/O Read
0      1     1     I/O Write
1      0     0     Code Read
1      0     1     Reserved
1      1     0     Memory Read
1      1     1     Memory Write

The bus definition signals are not driven during bus hold and follow the timing of the address bus. Refer to section 10.2.11, "Special Bus Cycles," for a description of the special bus cycles.

LOCK#  O  The Bus Lock pin indicates that the current bus cycle is locked. The Intel486 processor will not allow a bus hold when LOCK# is asserted (but address holds are allowed). LOCK# goes active in the first clock of the first locked bus cycle and goes inactive after the last clock of the last locked bus cycle. The last locked cycle ends when ready is returned. LOCK# is active LOW and is not driven during bus hold. Locked read cycles will not be transformed into cache fill cycles if KEN# is returned active.

PLOCK#  O  The Pseudo-Lock pin indicates that the current bus transaction requires more than one bus cycle to complete.
For the Intel486 processor, examples of such operations are segment table descriptor reads (64 bits), in addition to cache line fills (128 bits). For Intel486 processors with on-chip FPU, floating point long reads and writes (64 bits) also require more than one bus cycle to complete. The Intel486 processor will drive PLOCK# active until the addresses for the last bus cycle of the transaction have been driven, regardless of whether RDY# or BRDY# has been returned. Normally PLOCK# and BLAST# are the inverse of each other. However, during the first bus cycle of a 64-bit floating point write (for Intel486 processors with on-chip FPU), both PLOCK# and BLAST# will be asserted. PLOCK# is a function of the BS8#, BS16# and KEN# inputs. PLOCK# should be sampled only in the clock in which ready is returned. PLOCK# is active LOW and is not driven during bus hold.

BUS CONTROL

ADS#  O  The Address Status output indicates that a valid bus cycle definition and address are available on the cycle definition lines and address bus. ADS# is driven active in the same clock as the addresses are driven. ADS# is active LOW and is not driven during bus hold.

RDY#  I  The Non-burst Ready input indicates that the current bus cycle is complete. RDY# indicates that the external system has presented valid data on the data pins in response to a read or that the external system has accepted data from the Intel486 processor in response to a write. RDY# is ignored when the bus is idle and at the end of the first clock of the bus cycle. RDY# is active during address hold. Data can be returned to the processor while AHOLD is active. RDY# is active LOW, and is not provided with an internal pull-up resistor. RDY# must satisfy setup and hold times t16 and t17 for proper chip operation.
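
The bus cycle definition encoding, the byte enable to data lane mapping, and the even parity rule given earlier in this table can be sketched in a few lines of Python. This is an illustrative model only, not part of the data sheet: the function and variable names are hypothetical, signal levels are passed as 0/1 integers, and the parity helper assumes "even parity" means that each data byte together with its DP bit contains an even number of 1s.

```python
# Illustrative model only; all names here are hypothetical, not Intel's.

# (M/IO#, D/C#, W/R#) levels -> bus cycle initiated, as listed in Table 3-7.
BUS_CYCLES = {
    (0, 0, 0): "Interrupt Acknowledge",
    (0, 0, 1): "Halt/Special Cycle",
    (0, 1, 0): "I/O Read",
    (0, 1, 1): "I/O Write",
    (1, 0, 0): "Code Read",
    (1, 0, 1): "Reserved",
    (1, 1, 0): "Memory Read",
    (1, 1, 1): "Memory Write",
}


def decode_bus_cycle(m_io, d_c, w_r):
    """Return the bus cycle type for the three definition pin levels."""
    return BUS_CYCLES[(m_io, d_c, w_r)]


def active_data_lanes(be_n):
    """Map byte enable levels to (low bit, high bit) data lanes.

    be_n is a 4-bit value where bit i holds the level on BEi#.
    The byte enables are active LOW, so a 0 bit marks an active byte:
    BE0# -> D0-D7, BE1# -> D8-D15, BE2# -> D16-D23, BE3# -> D24-D31.
    """
    lanes = []
    for i in range(4):
        if not (be_n >> i) & 1:
            lanes.append((8 * i, 8 * i + 7))
    return lanes


def even_parity_bit(byte):
    """Parity bit that makes the 9-bit group (byte + DP) hold an even
    number of 1s (assumed interpretation of 'even parity')."""
    return bin(byte & 0xFF).count("1") & 1
```

For example, `decode_bus_cycle(1, 1, 0)` looks up a memory read, and `active_data_lanes(0b1110)` reports that only the D0-D7 lane is active.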
BURST CONTROL

BRDY#  I  The Burst Ready input performs the same function during a burst cycle that RDY# performs during a non-burst cycle. BRDY# indicates that the external system has presented valid data in response to a read or that the external system has accepted data in response to a write. BRDY# is ignored when the bus is idle and at the end of the first clock in a bus cycle. BRDY# is sampled in the second and subsequent clocks of a burst cycle. The data presented on the data bus will be strobed into the processor when BRDY# is sampled active. If RDY# is returned simultaneously with BRDY#, BRDY# is ignored and the burst cycle is prematurely aborted. BRDY# is active LOW and is provided with a small pull-up resistor. BRDY# must satisfy the setup and hold times t16 and t17.

BLAST#  O  The Burst Last signal indicates that the next time BRDY# is returned the burst bus cycle is complete. BLAST# is active for both burst and non-burst bus cycles. BLAST# is active LOW and is not driven during bus hold.

INTERRUPTS

RESET  I  The Reset input forces the Intel486 processor to begin execution at a known state. The processor cannot begin execution of instructions until at least 1 ms after Vcc and CLK have reached their proper DC and AC specifications. The RESET pin should remain active during this time to ensure proper processor operation. RESET is active HIGH. RESET is asynchronous but must meet setup and hold times t20 and t21 for recognition in any specific clock.

INTR  I  The Maskable Interrupt indicates that an external interrupt has been generated. If the internal interrupt flag is set in EFLAGS, active interrupt processing will be initiated. The Intel486 processor will generate two locked interrupt acknowledge bus cycles in response to the INTR pin going active.
INTR must remain active until the interrupt acknowledges have been performed to assure that the interrupt is recognized. INTR is active HIGH and is not provided with an internal pull-down resistor. INTR is asynchronous, but must meet setup and hold times t20 and t21 for recognition in any specific clock.

NMI  I  The Non-Maskable Interrupt request signal indicates that an external non-maskable interrupt has been generated. NMI is rising edge sensitive. NMI must be held LOW for at least four CLK periods before this rising edge. NMI is not provided with an internal pull-down resistor. NMI is asynchronous, but must meet setup and hold times t20 and t21 for recognition in any specific clock.

SRESET  I  The Soft Reset pin duplicates all the functionality of the RESET pin with the following two exceptions:
1. The SMBASE register will retain its previous value.
2. If UP# (I) is asserted, SRESET will not have an effect on the host processor.
For soft resets, SRESET should remain active for at least 15 CLK periods. SRESET is active HIGH. SRESET is asynchronous but must meet setup and hold times t20 and t21 for recognition in any specific clock.

SMI#  I  The System Management Interrupt input is used to invoke the System Management Mode (SMM). SMI# is a falling edge triggered signal which forces the processor into SMM at the completion of the current instruction. SMI# is recognized on an instruction boundary and at each iteration for repeat string instructions. SMI# does not break LOCKed bus cycles and cannot interrupt a currently executing SMM. The processor will latch the falling edge of one pending SMI# signal while the processor is executing an existing SMI#. The nested SMI# will not be recognized until after the execution of a Resume (RSM) instruction.

SMIACT#  O  The System Management Interrupt ACTive is an active LOW output, indicating that the processor is operating in SMM.
It is asserted when the processor begins to execute the SMI# state save sequence and will remain active LOW until the processor executes the last state restore cycle out of SMRAM.

STPCLK#  I  The SToP CLocK request input signal indicates a request has been made to turn off the CLK input. When the processor recognizes a STPCLK#, the processor will stop execution on the next instruction boundary, unless superseded by a higher priority interrupt, empty all internal pipelines and the write buffers and generate a Stop Grant acknowledge bus cycle. STPCLK# is active LOW and is provided with an internal pull-up resistor. STPCLK# is an asynchronous signal, but must remain active until the processor issues the Stop Grant bus cycle. STPCLK# may be de-asserted at any time after the processor has issued the Stop Grant bus cycle.

BUS ARBITRATION

BREQ  O  The Bus Request signal indicates that the Intel486 processor has internally generated a bus request. BREQ is generated whether or not the Intel486 processor is driving the bus. BREQ is active HIGH and is never floated.

HOLD  I  The Bus Hold request allows another bus master complete control of the processor bus. In response to HOLD going active the Intel486 processor will float most of its output and input/output pins. HLDA will be asserted after completing the current bus cycle, burst cycle or sequence of locked cycles. The Intel486 processor will remain in this state until HOLD is de-asserted. HOLD is active HIGH and is not provided with an internal pull-down resistor. HOLD must satisfy setup and hold times t18 and t19 for proper operation.

HLDA  O  Hold Acknowledge goes active in response to a hold request presented on the HOLD pin. HLDA indicates that the Intel486 processor has given the bus to another local bus master. HLDA is driven active in the same clock that the Intel486 processor floats its bus.
HLDA is driven inactive when leaving bus hold. HLDA is active HIGH and remains driven during bus hold.

BOFF#  I  The Backoff input forces the Intel486 processor to float its bus in the next clock. The processor will float all pins normally floated during bus hold but HLDA will not be asserted in response to BOFF#. BOFF# has higher priority than RDY# or BRDY#; if both are returned in the same clock, BOFF# takes effect. The processor remains in bus hold until BOFF# is negated. If a bus cycle was in progress when BOFF# was asserted the cycle will be restarted. BOFF# is active LOW and must meet setup and hold times t18 and t19 for proper operation.

CACHE INVALIDATION

AHOLD  I  The Address Hold request allows another bus master access to the processor's address bus for a cache invalidation cycle. The Intel486 processor will stop driving its address bus in the clock following AHOLD going active. Only the address bus will be floated during address hold; the remainder of the bus will remain active. AHOLD is active HIGH and is provided with a small internal pull-down resistor. For proper operation AHOLD must meet setup and hold times t18 and t19.

EADS#  I  This signal indicates that a valid External Address has been driven onto the Intel486 processor address pins. This address will be used to perform an internal cache invalidation cycle. EADS# is active LOW and is provided with an internal pull-up resistor. EADS# must satisfy setup and hold times t12 and t13 for proper operation.

CACHE CONTROL

KEN#  I  The Cache Enable pin is used to determine whether the current cycle is cacheable. When the Intel486 processor generates a cycle that can be cached and KEN# is active one clock before RDY# or BRDY# during the first transfer of the cycle, the cycle will become a cache line fill cycle. Returning KEN# active one clock before RDY# during the last read in the cache line fill will cause the line to be placed in the on-chip cache.
KEN# is active LOW and is provided with a small internal pull-up resistor. KEN# must satisfy setup and hold times t14 and t15 for proper operation.

FLUSH#  I  The Cache Flush input forces the Intel486 processor to flush its entire internal cache. FLUSH# is active LOW and need only be asserted for one clock. FLUSH# is asynchronous but setup and hold times t20 and t21 must be met for recognition in any specific clock.

PAGE CACHEABILITY

PWT  O
PCD  O
The Page Write-Through and Page Cache Disable pins reflect the state of the page attribute bits, PWT and PCD, in the page table entry, page directory entry or control register 3 (CR3) when paging is enabled. If paging is disabled, the processor ignores the PCD and PWT bits and assumes they are zero for the purpose of caching and driving the PCD and PWT pins. PWT and PCD have the same timing as the cycle definition pins (M/IO#, D/C#, and W/R#). PWT and PCD are active HIGH and are not driven during bus hold. PCD is masked by the cache disable bit (CD) in Control Register 0.

BUS SIZE CONTROL

BS16#  I
BS8#   I
The Bus Size 16 and Bus Size 8 pins (bus sizing pins) cause the Intel486 processor to run multiple bus cycles to complete a request from devices that cannot provide or accept 32 bits of data in a single cycle. The bus sizing pins are sampled every clock. The state of these pins in the clock before ready is used by the Intel486 processor to determine the bus size. These signals are active LOW and are provided with internal pull-up resistors. These inputs must satisfy setup and hold times t14 and t15 for proper operation.

ADDRESS MASK

A20M#  I  When the Address Bit 20 Mask pin is asserted, the Intel486 processor masks physical address bit 20 (A20) before performing a lookup to the internal cache or driving a memory cycle on the bus.
A20M# emulates the address wraparound at one Mbyte, which occurs on the 8086 processor. A20M# is active LOW and should be asserted only when the processor is in real mode. This pin is asynchronous but should meet setup and hold times t20 and t21 for recognition in any specific clock. For proper operation, A20M# should be sampled HIGH at the falling edge of RESET.

TEST ACCESS PORT

TCK  I  Test ClocK is an input to the Intel486 processor and provides the clocking function required by the JTAG Boundary scan feature. TCK is used to clock state information and data into the component on the rising edge of TCK on TMS and TDI, respectively. Data is clocked out of the part on the falling edge of TCK on TDO. TCK is provided with an internal pull-up resistor.

TDI  I  Test Data Input is the serial input used to shift JTAG instructions and data into the component. TDI is sampled on the rising edge of TCK, during the SHIFT-IR and SHIFT-DR TAP controller states. During all other TAP controller states, TDI is a "don't care." TDI is provided with an internal pull-up resistor.

TDO  O  Test Data Output is the serial output used to shift JTAG instructions and data out of the component. TDO is driven on the falling edge of TCK during the SHIFT-IR and SHIFT-DR TAP controller states. At all other times TDO is driven to the high impedance state.

TMS  I  Test Mode Select is decoded by the JTAG TAP (Test Access Port) to select the operation of the test logic. TMS is sampled on the rising edge of TCK. To guarantee deterministic behavior of the TAP controller, TMS is provided with an internal pull-up resistor.

PERFORMANCE UPGRADE SUPPORT

UP#  I  The Upgrade Present input detects the presence of the upgrade processor, then powers down the core, and tri-states all outputs of the original processor, so that the original processor consumes very low current.
UP# is active LOW and sampled at all times, including after power-up and during reset.

NUMERIC ERROR REPORTING FOR INTEL486 DX, INTELDX2™, AND INTELDX4™ PROCESSORS

FERR#  O  The Floating point ERRor pin is driven active when a floating point error occurs. FERR# is similar to the ERROR# pin on the Intel387™ Math CoProcessor. FERR# is included for compatibility with systems using DOS type floating point error reporting. FERR# will not go active if FP errors are masked in the FPU register. FERR# is active LOW, and is not floated during bus hold.

IGNNE#  I  When the IGNore Numeric Error pin is asserted the processor will ignore a numeric error and continue executing non-control floating point instructions, but FERR# will still be activated by the processor. When IGNNE# is de-asserted the processor will freeze on a non-control floating point instruction, if a previous floating point instruction caused an error. IGNNE# has no effect when the NE bit in control register 0 is set. IGNNE# is active LOW and is provided with a small internal pull-up resistor. IGNNE# is asynchronous but setup and hold times t20 and t21 must be met to ensure recognition on any specific clock.

WRITE-BACK ENHANCED INTELDX2 PROCESSOR SIGNAL PINS

CACHE#  O  The CACHE# output indicates internal cacheability on read cycles and burst write-back on write cycles. CACHE# is asserted for cacheable reads, cacheable code fetches and write-backs. It is driven inactive for non-cacheable reads, I/O cycles, special cycles, and write-through cycles.

FLUSH#  I  Cache FLUSH# is an existing pin that operates differently if the processor is configured in Enhanced Bus mode (write-back). FLUSH# will cause the processor to write back all modified lines and flush (invalidate) the cache. FLUSH# is asynchronous, but must meet setup and hold times t20 and t21 for recognition in any specific clock.

HITM#  O  The Hit/Miss to a Modified Line pin is a cache coherency protocol pin that is driven only in Enhanced Bus mode.
When a snoop cycle is run, HITM# indicates that the processor contains the snooped line and that the line has been modified. Assertion of HITM# implies that the line will be written back in its entirety, unless the processor is already in the process of doing a replacement write-back of the same line.

INV  I  The Invalidation Request pin is a cache coherency protocol pin that is used only in the Enhanced Bus mode. It is sampled by the processor on EADS#-driven snoop cycles. It is necessary to assert this pin to get the effect of the processor invalidate cycle on write-through-only lines. INV also invalidates the write-back lines. However, if the snooped line is modified, the line will be written back and then invalidated. INV must satisfy setup and hold times t12 and t13 for proper operation.

WRITE-BACK ENHANCED INTELDX2 PROCESSOR SIGNAL PINS (Continued)

PLOCK#  O  In the Enhanced Bus mode, Pseudo-Lock Output is always driven inactive. In this mode, a 64-bit data read (caused by an FP operand access or a segment descriptor read) is treated as a multiple cycle read request, which may be a burst or a non-burst access based on whether BRDY# or RDY# is returned by the system. Because only write-back cycles (caused by snoop write-back or replacement write-back) are write burstable, a 64-bit write will be driven out as two non-burst bus cycles. BLAST# is asserted during both writes. Refer to the Bus Functional Description, section 10.3, for details on Pseudo-Locked bus cycles.

SRESET  I  For the Write-Back Enhanced IntelDX2 processor, Soft RESET operates similarly to other Intel486 processors. On SRESET, the internal SMRAM base register retains its previous value, and the processor does not flush, write back, or disable the internal cache. Because SRESET is treated as an interrupt, it is possible to have a bus cycle while SRESET is asserted.
SRESET is serviced only on an instruction boundary. SRESET is asynchronous but must meet setup and hold times t20 and t21 for recognition in any specific clock.

WB/WT#  I  The Write-Back/Write-Through pin enables Enhanced Bus mode (write-back cache). It also defines a cached line as write-through or write-back. For cache configuration, WB/WT# must be valid during RESET and be active for at least two clocks before and two clocks after RESET is de-asserted. To define write-back or write-through configuration of a line, WB/WT# is sampled in the same clock as the first RDY# or BRDY# is returned during a line fill (allocation) cycle.

INTELDX4 PROCESSOR CLKMUL, VCC5, AND VOLDET

CLKMUL  I  The CLocK MULtiplier input, defined during device RESET, defines the ratio of internal core frequency to external bus frequency. If sampled low, the core frequency operates at twice the external bus frequency (speed doubled mode). If driven high or left floating, speed tripled mode is selected. CLKMUL has an internal pull-up to VCC and may be left floating in designs that select speed tripled clock mode.

VCC5  I  The 5V reference voltage input is the reference voltage for the 5V-tolerant I/O buffers. This signal should be connected to +5V ± 5% for use with 5V logic. If all inputs are from 3V logic, this pin should be connected to 3.3V.

VOLDET  O  A VOltage DETect signal allows external system logic to distinguish between a 5V Intel486 processor and the 3.3V IntelDX4 processor. This signal is active low for a 3.3V IntelDX4 processor. This pin is available only on the PGA version of the IntelDX4 processor.

Table 3-8. Output Pins Name Active Level Name Active Level BREQ HIGH D31-D0 HIGH/LOW Bus Hold HLDA HIGH DP3-DP0 HIGH Bus Hold BE3#-BE0# LOW Bus Hold PWT, PCD HIGH/LOW Bus Hold W/R#, M/IO#, D/C# HIGH/LOW Bus Hold NOTE: 1. All input/output signals are floated when UP# is asserted. LOCK# LOW Bus Hold Table 3-10.
Test Pins PLOCK# LOW Bus Hold ADS# LOW Bus Hold Bus Hold BLAST # LOW PCHK# LOW When Floated Table 3-9. Input/Output Pins(l) FERR#(I) LOW A3-A2 N/A S~IACT#(2) LOW CACHE#(3) LOW Bus, Address Hold HITM#(3) LOW Bus, Address Hold VOLDET(4) LOW Bus, Address Hold A31-A4 Name HIGH/LOW Input or Output When Floated Bus, Address Hold Sampled/Driven On TCK Input N/A TOI Input Rising Edge of TCK TOO Output Falling Edge of TCK TMS Input Rising Edge of TCK NOTE: 1. The test pins are not present on the Intel486 SXTM processor in the PGA package. NOTES: 1. Present on the Intel486TM OX, InteIOX2™, and IntelOX4TM processors only. 2. Not present in the 50-MHz Intel486 OX processor. 3. Present on the Write-Sack Enhanced IntelOX2 processor only. 4. Present on the IntelOX4 processor only. I 2-41 intel® Intel486TM PROCESSOR FAMILY Table 3-11. Input Pins Name Active Level Synchronousl Asynchronous Internal Pull-Upl Pull-Down CLK, CLK2(1) , RESET HIGH Asynchronous SRESET HIGH Asynchronous HOLD HIGH Synchronous AHOLD HIGH Synchronous Pull-Down EADS# LOW Synchronous Pull-Up Pull-Down BOFF# LOW Synchronous Pull-Up FLUSH # LOW Asynchronous Pull-Up A20M# LOW Asynchronous Pull-Up BS16#, BS8# LOW Synchronous Pull-Up Pull-Up KEN# LOW Synchronous RDY# LOW Synchronous BRDY# LOW SynchronOus INTR HIGH Asynchronous NMI HIGH Asynchronous IGNNE#(2) LOW Asynchronous Pull-Up SMI#(3) LOW Asynchronous Pull-Up STPCLK(2)# LOW Asynchronous Pull-Up Pull-Up UP# LOW Pull-Up TCK(4) HIGH Pull-Up TDI(4) HIGH . Pull-Up TMS(4) HIGH Pull-Up INV(5) HIGH Synchronous Pull-Up WB/WT#(5) HIGHI LOW Synchronous Pull-Down CLKMUL#(6) N/A Pull-Up NOTES: 1. CLK2 is present on 2X clock mode Intel486™ SX and Intel486 DX processors. 2. Present on the lo.tel486 DX, IntelDX2™, and IntelDX4TM processors only. 3, Not present in the 50-MHz Intel486 DX processor. 4. The test pins are not present on the Intel486 SX processor in the PGA package. 5. Present on the Write-Sack Enhanced IntelDX2 processor only. 
6. Present on the IntelDX4 processor only.

4.0 ARCHITECTURAL OVERVIEW

4.1 Introduction

The Intel486 processor family is a 32-bit architecture with on-chip memory management, floating point, and cache memory units. Figure 4-1 is a block diagram of the Intel486 processor family.

The Intel486 processor contains all the features of the Intel386™ processor with enhancements to increase performance. The Intel486 processor instruction set includes the complete Intel386 processor instruction set along with extensions to serve new applications and increase performance. The on-chip memory management unit (MMU) is completely compatible with the Intel386 processor MMU. Software written for previous members of the Intel architecture family will run on the Intel486 processor without any modifications.

On-chip cache memory allows frequently used data and code to be stored on-chip, reducing accesses to the external bus. RISC design techniques reduce instruction cycle times. A burst bus feature enables fast cache fills.

The memory management unit (MMU) consists of a segmentation unit and a paging unit. Segmentation allows management of the logical address space by providing easy data and code relocatability and efficient sharing of global resources. The paging mechanism operates beneath segmentation and is transparent to the segmentation process. Paging is optional and can be disabled by system software. Each segment can be divided into one or more 4-Kbyte pages. To implement a virtual memory system, full restartability for all page and segment faults is supported.

Memory is organized into one or more variable-length segments, each up to four Gbytes (2^32 bytes) in size. A segment can have attributes associated with it, which include its location, size, type (i.e., stack, code or data), and protection characteristics. Each task on an Intel486 processor can have a maximum of 16,381 segments, each up to four Gbytes in size.
Thus, each task has a maximum of 64 terabytes (trillion bytes) of virtual memory.

The segmentation unit provides four levels of protection for isolating and protecting applications and the operating system from each other. The hardware-enforced protection allows the design of systems with a high degree of software integrity.

The Intel486 processor has two modes of operation: Real Address Mode (Real Mode) and Protected Virtual Address Mode (Protected Mode). In Real Mode the Intel486 processor operates as a very fast 8086. Real Mode is required primarily to set up the Intel486 processor for Protected Mode operation. Protected Mode provides access to the sophisticated memory management, paging, and privilege capabilities of the processor.

Within Protected Mode, software can perform a task switch to enter into tasks designated as Virtual 8086 Mode tasks. Each Virtual 8086 task behaves with 8086 semantics, allowing 8086 processor software (an application program or an entire operating system) to execute.

System Management Mode (SMM) provides the system designer with a means of adding new software-controlled features to their computer products that always operate transparently to the Operating System (OS) and software applications. SMM is intended for use only by system firmware, not by applications software or general purpose systems software.

The on-chip cache is 16 Kbytes in size for the IntelDX4 processor and 8 Kbytes in size for all other members of the Intel486 processor family. It is 4-way set associative and follows a write-through policy. The on-chip cache includes features to provide flexibility in external memory system design. Individual pages can be designated as cacheable or non-cacheable by software or hardware. The cache can also be enabled and disabled by software or hardware. The Write-Back Enhanced IntelDX2 processor can be set to use an on-chip write-back cache policy.
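As a rough illustration of the cache organization described above, the sketch below derives the number of sets in a 4-way set-associative cache of 8 or 16 Kbytes and splits a physical address into tag, set index, and byte offset. The 16-byte line size is an assumption that is not stated in this section, and the function names are hypothetical.

```python
LINE_BYTES = 16  # assumed cache line size; not stated in this section
WAYS = 4         # 4-way set associative, per the text


def cache_sets(cache_bytes, ways=WAYS, line_bytes=LINE_BYTES):
    """Number of sets = capacity / (ways * line size)."""
    return cache_bytes // (ways * line_bytes)


def split_address(addr, cache_bytes):
    """Split a 32-bit physical address into (tag, set_index, byte_offset)."""
    sets = cache_sets(cache_bytes)
    offset_bits = LINE_BYTES.bit_length() - 1   # log2 of the line size
    index_bits = sets.bit_length() - 1          # log2 of the set count
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> offset_bits) & (sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


for name, size in (("8-Kbyte cache", 8 * 1024),
                   ("16-Kbyte cache (IntelDX4)", 16 * 1024)):
    print(name, "has", cache_sets(size), "sets")

print(split_address(0x00012345, 8 * 1024))
```

Under the assumed line size, doubling the capacity at fixed associativity doubles the set count, so one more address bit moves from the tag into the set index.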
The Intel486 processor also has features that facilitate high-performance hardware designs. The 1X bus clock input eases high-frequency board-level designs. The clock multiplier on IntelSX2, IntelDX2, and IntelDX4 processors improves execution performance without increasing board design complexity. The clock multiplier enhances all operations operating out of the cache and/or not blocked by external bus accesses. The burst bus feature enables fast cache fills.

[Figure 4-1. Intel486™ Processor Block Diagram (242202-8). Legible block labels: 32-bit data buses, CLK, translation lookaside buffer, ALU, displacement bus, micro-instruction, control ROM, 2 x 16 bytes.]

NOTES:
*  Available on Intel486™ DX, IntelDX2™ and IntelDX4™ processors.
** Available on all Intel486 processors except 168-pin PGA Intel486 SX processors.
†  Available only on the IntelDX4 processor.
†† Available only on the Write-Back Enhanced IntelDX2 processor.

4.1.1 INTEL486 DX, INTELDX2™, AND INTELDX4™ PROCESSOR ON-CHIP FLOATING POINT UNIT

The Intel486 DX, IntelDX2, and IntelDX4 processors incorporate the basic Intel486 processor 32-bit architecture with on-chip memory management and cache memory units. They also have an on-chip floating point unit (FPU) that operates in parallel with the arithmetic and logic unit. The FPU provides arithmetic instructions for a variety of numeric data types and executes numerous built-in transcendental functions (e.g., tangent, sine, cosine, and log functions). The floating point unit fully conforms to the ANSI/IEEE standard 754-1985 for floating point arithmetic.

All software written for the Intel386 processor, Intel387 math coprocessor and previous members of the 86/87 architectural family will run on these processors without any modifications.
4.1.2 UPGRADE POWER DOWN MODE

Upgrade Power Down Mode on the Intel486 processor is initiated by the Intel OverDrive processor using the UP# (upgrade present) pin. Upon sensing the presence of the Intel OverDrive processor, the Intel486 processor tri-states its outputs and enters the "Upgrade Power Down Mode," lowering its power consumption. The UP# pin of the Intel486 processor is driven active (low) by the UP# pin of the Intel OverDrive processor.

4.2 Register Set

The Intel486 processor register set can be split into the following categories:

• Base Architecture Registers
  - General Purpose Registers
  - Instruction Pointer
  - Flags Register
  - Segment Registers
• Systems Level Registers
  - Control Registers
  - System Address Registers
• Debug and Test Registers

The base architecture and floating point registers (see below) are accessible by the applications program. The system level registers can only be accessed at privilege level 0 and used by system level programs. The debug and test registers also can only be accessed at privilege level 0.

4.2.1 FLOATING POINT REGISTERS

In addition to the registers listed above, the Intel486 DX, IntelDX2, and IntelDX4 processors also have the following:

• Floating Point Registers
  - Data Registers
  - Tag Word
  - Status Word
  - Instruction and Data Pointers
  - Control Word

4.2.2 BASE ARCHITECTURE REGISTERS

Figure 4-2 shows the Intel486 processor base architecture registers. The contents of these registers are task-specific and are automatically loaded with a new context upon a task switch operation.

The base architecture includes six directly accessible descriptors, each specifying a segment up to 4 Gbytes in size.
The descriptors are indicated by the selector values placed in the Intel486 processor segment registers. Various selector values can be loaded as a program executes.

The selectors are also task-specific, so the segment registers are automatically loaded with new context upon a task switch operation.

NOTE: In register descriptions, "set" means "set to 1," and "reset" means "reset to 0."

[Figure 4-2. Base Architecture Registers (242202-9). Shows the general purpose registers with their 16-bit and 8-bit names, the segment registers (code, stack, and data segments), the instruction pointer, and the flags register (FLAGS within EFLAGS).]

4.2.2.1 General Purpose Registers

The eight 32-bit general purpose registers are shown in Figure 4-2. These registers hold data or address quantities. The general purpose registers can support data operands of 1, 8, 16 and 32 bits, and bit fields of 1 to 32 bits. Address operands of 16 and 32 bits are supported. The 32-bit registers are named EAX, EBX, ECX, EDX, ESI, EDI, EBP and ESP.

The least significant 16 bits of the general purpose registers can be accessed separately by using the 16-bit names of the registers AX, BX, CX, DX, SI, DI, BP and SP. The upper 16 bits of the register are not changed when the lower 16 bits are accessed separately.

Finally, 8-bit operations can individually access the lower byte (bits 0-7) and the higher byte (bits 8-15) of the general purpose registers AX, BX, CX and DX. The lower bytes are named AL, BL, CL and DL respectively. The higher bytes are named AH, BH, CH and DH respectively. The individual byte accessibility offers additional flexibility for data operations, but is not used for effective address calculation.

4.2.2.2 Instruction Pointer

The instruction pointer shown in Figure 4-2 is a 32-bit register named EIP. EIP holds the offset of the next instruction to be executed. The offset is always relative to the base of the code segment (CS). The lower 16 bits (bits 0-15) of the EIP contain the 16-bit instruction pointer named IP, which is used for 16-bit addressing.
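The register naming described in section 4.2.2.1 can be modeled in a few lines. The following is a minimal sketch rather than processor code; the `Reg32` class and its property names are hypothetical. It shows that writing the 16-bit or 8-bit name of a register leaves the remaining bits of the 32-bit register unchanged.

```python
class Reg32:
    """Toy model of a 32-bit general purpose register such as EAX."""

    def __init__(self, value=0):
        self.e = value & 0xFFFFFFFF  # full 32-bit value (e.g. EAX)

    @property
    def x(self):            # low 16 bits (e.g. AX)
        return self.e & 0xFFFF

    @x.setter
    def x(self, v):         # the upper 16 bits are preserved
        self.e = (self.e & 0xFFFF0000) | (v & 0xFFFF)

    @property
    def l(self):            # bits 0-7 (e.g. AL)
        return self.e & 0xFF

    @l.setter
    def l(self, v):
        self.e = (self.e & 0xFFFFFF00) | (v & 0xFF)

    @property
    def h(self):            # bits 8-15 (e.g. AH)
        return (self.e >> 8) & 0xFF

    @h.setter
    def h(self, v):
        self.e = (self.e & 0xFFFF00FF) | ((v & 0xFF) << 8)


eax = Reg32(0x12345678)
eax.x = 0xABCD      # write AX: EAX becomes 0x1234ABCD
eax.h = 0x00        # write AH: EAX becomes 0x123400CD
print(hex(eax.e))
```

The same masking pattern applies to IP within EIP, which occupies bits 0-15 of the 32-bit instruction pointer.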
4.2.2.3 Flags Register

The flags register is a 32-bit register named EFLAGS. The defined bits and bit fields within EFLAGS control certain operations and indicate status of the Intel486 processor. The lower 16 bits (bits 0-15) of EFLAGS contain the 16-bit register named FLAGS, which is most useful when executing 8086 and 80286 processor code. EFLAGS is shown in Figure 4-3.

EFLAGS bits 1, 3, 5, 15 and 22-31 are defined as "Intel Reserved." When these bits are stored during interrupt processing or with a PUSHF instruction (push flags onto stack), a one is stored in bit 1 and zeros in bits 3, 5, 15 and 22-31.

[Figure 4-3. Flag Registers (242202-10). Bit layout of EFLAGS, with FLAGS as its lower 16 bits. Labeled fields: Identification Flag, Virtual Interrupt Pending, Virtual Interrupt Flag, Alignment Check, Virtual Mode, Resume Flag, Nested Task Flag, I/O Privilege Level, Overflow, Direction Flag, Interrupt Enable, Trap Flag, Sign Flag, Zero Flag, Auxiliary Flag, Parity Flag, Carry Flag. Shaded bits are Intel Reserved; do not define.]

NOTE: See section 4.2.7 "Compatibility."

ID (Identification Flag, bit 21) The ability of a program to set and clear the ID flag indicates that the processor supports the CPUID instruction. (Refer to section 13, "Instruction Set Summary," and Appendix B, "Feature Determination: CPUID Instruction.")

VIP (Virtual Interrupt Pending Flag, bit 20) The VIP flag together with the VIF enables each applications program in a multitasking environment to have virtualized versions of the system's IF flag. For more on the use of this flag in virtual-8086 mode and in protected mode, refer to Appendix A, "Advanced Features."

VIF (Virtual Interrupt Flag, bit 19) The VIF is a virtual image of IF (the interrupt flag) used with VIP.
For more on the use of this flag in virtual-8086 mode and in protected mode, refer to Appendix A, "Advanced Features."

AC (Alignment Check, bit 18) The AC bit is defined in the upper 16 bits of the register. It enables the generation of faults if a memory reference is to a misaligned address. Alignment faults are enabled when AC is set to 1. A misaligned address is a word access to an odd address, a dword access to an address that is not on a dword boundary, or an 8-byte reference to an address that is not on a 64-bit word boundary. (See section 10.1.5, "Operand Alignment.")

Alignment faults are only generated by programs running at privilege level 3. The AC bit setting is ignored at privilege levels 0, 1 and 2. Note that references to the descriptor tables (for selector loads), or the task state segment (TSS), are implicitly level 0 references even if the instructions causing the references are executed at level 3. Alignment faults are reported through interrupt 17, with an error code of 0. Table 4-1 gives the alignment required for the Intel486 processor data types.

Table 4-1. Data Type Alignment Requirements

Memory Access                   Alignment (Byte Boundary)
Word                            2
Dword                           4
Single Precision Real           4
Double Precision Real           8
Extended Precision Real         8
Selector                        2
48-Bit Segmented Pointer        4
32-Bit Flat Pointer             4
32-Bit Segmented Pointer        2
48-Bit "Pseudo-Descriptor"      4
FSTENV/FLDENV Save Area         4/2 (On Operand Size)
FSAVE/FRSTOR Save Area          4/2 (On Operand Size)
Bit String                      4

IMPLEMENTATION NOTE: Several instructions on the Intel486 processor generate misaligned references, even if their memory address is aligned. For example, on the Intel486 processor, the SGDT/SIDT (store global/interrupt descriptor table) instruction reads/writes two bytes, and then reads/writes four bytes from a "pseudo-descriptor" at the given address. The Intel486 processor will generate misaligned references unless the address is on a 2 mod 4 boundary.
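A short sketch can make the "2 mod 4" remark concrete. It assumes a simplified model in which an access is misaligned when its address is not a multiple of its size, and it models the pseudo-descriptor reference as a 2-byte access at the given address followed by a 4-byte access two bytes higher, as the note describes. All names are hypothetical, and only a few rows of Table 4-1 are included.

```python
# Alignment boundaries (in bytes) for a few entries of Table 4-1.
ALIGNMENT = {
    "word": 2,
    "dword": 4,
    "double_real": 8,
    "pseudo_descriptor": 4,
}


def is_aligned(addr, boundary):
    """True if addr falls on the given byte boundary."""
    return addr % boundary == 0


def pseudo_descriptor_accesses(addr):
    """Model SGDT/SIDT as a 2-byte access at addr, then a 4-byte access at addr + 2."""
    return [(addr, 2), (addr + 2, 4)]


def misaligned_accesses(addr):
    """Return the (address, size) pairs of the modeled accesses that are misaligned."""
    return [(a, size) for a, size in pseudo_descriptor_accesses(addr)
            if not is_aligned(a, size)]


for base in (0x1000, 0x1002):
    print(hex(base), "mod 4 =", base % 4, "misaligned:", misaligned_accesses(base))
```

In this model a pseudo-descriptor placed on a dword boundary makes the 4-byte access land on an address that is 2 mod 4, while a pseudo-descriptor at a 2 mod 4 address leaves both accesses aligned.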
The FSAVE and FRSTOR instructions (floating point save and restore state) will generate misaligned references for one-half of the register save/restore cycles. The Intel486 processor will not cause any AC faults if the effective address given in the instruction has the proper alignment.

VM (Virtual 8086 Mode, bit 17) The VM bit provides Virtual 8086 Mode within Protected Mode. If set while the Intel486 processor is in Protected Mode, the Intel486 processor will switch to Virtual 8086 operation, handling segment loads as the 8086 processor does, but generating exception 13 faults on privileged opcodes. The VM bit can be set only in Protected Mode, by the IRET instruction (if current privilege level = 0) and by task switches at any privilege level. The VM bit is unaffected by POPF. PUSHF always pushes a 0 in this bit, even if executing in Virtual 8086 Mode. The EFLAGS image pushed during interrupt processing or saved during task switches will contain a 1 in this bit if the interrupted code was executing as a Virtual 8086 Task.

RF (Resume Flag, bit 16) The RF flag is used in conjunction with the debug register breakpoints. It is checked at instruction boundaries before breakpoint processing. When RF is set, it causes any debug fault to be ignored on the next instruction. RF is then automatically reset at the successful completion of every instruction (no faults are signaled) except the IRET instruction, the POPF instruction, and JMP, CALL, and INT instructions causing a task switch. These instructions set RF to the value specified by the memory image. For example, at the end of the breakpoint service routine, the IRET instruction can pop an EFLAGS image having the RF bit set and resume the program's execution at the breakpoint address without generating another breakpoint fault on the same location.

NT (Nested Task, bit 14) This flag applies to Protected Mode.
NT is set to indicate that the execution of this task is nested within another task. If set, it indicates that the current nested task's Task State Segment (TSS) has a valid back link to the previous task's TSS. This bit is set or reset by control transfers to other tasks. The value of NT in EFLAGS is tested by the IRET instruction to determine whether to do an inter-task return or an intra-task return. A POPF or an IRET instruction will affect the setting of this bit according to the image popped, at any privilege level.

IOPL (Input/Output Privilege Level, bits 12-13) This two-bit field applies to Protected Mode. IOPL indicates the numerically maximum CPL (current privilege level) value permitted to execute I/O instructions without generating an exception 13 fault or consulting the I/O Permission Bitmap. It also indicates the maximum CPL value allowing alteration of the IF (INTR Enable Flag) bit when new values are popped into the EFLAGS register. POPF and IRET instructions can alter the IOPL field when executed at CPL = 0. Task switches can always alter the IOPL field, when the new flag image is loaded from the incoming task's TSS.

OF (Overflow Flag, bit 11) OF is set if the operation resulted in a signed overflow. Signed overflow occurs when the operation resulted in carry/borrow into the sign bit (high-order bit) of the result but did not result in a carry/borrow out of the high-order bit, or vice-versa. For 8-, 16-, 32-bit operations, OF is set according to overflow at bit 7, 15, 31, respectively.

DF (Direction Flag, bit 10) DF defines whether ESI and/or EDI registers post decrement or post increment during the string instructions. Post increment occurs if DF is reset. Post decrement occurs if DF is set.

IF (INTR Enable Flag, bit 9) The IF flag, when set, allows recognition of external interrupts signaled on the INTR pin. When IF is reset, external interrupts signaled on the INTR pin are not recognized.
IOPL indicates the maximum CPL value allowing alteration of the IF bit when new values are popped into EFLAGS or FLAGS.

TF (Trap Enable Flag, bit 8) TF controls the generation of the exception 1 trap when single-stepping through code. When TF is set, the Intel486 processor generates an exception 1 trap after the next instruction is executed. When TF is reset, exception 1 traps occur only as a function of the breakpoint addresses loaded into debug registers DR0-DR3.

SF (Sign Flag, bit 7) SF is set if the high-order bit of the result is set; it is reset otherwise. For 8-, 16-, 32-bit operations, SF reflects the state of bit 7, 15, 31 respectively.

ZF (Zero Flag, bit 6) ZF is set if all bits of the result are 0. Otherwise, it is reset.

AF (Auxiliary Carry Flag, bit 4) The Auxiliary Flag is used to simplify the addition and subtraction of packed BCD quantities. AF is set if the operation resulted in a carry out of bit 3 (addition) or a borrow into bit 3 (subtraction). Otherwise, AF is reset. AF is affected by carry out of, or borrow into, bit 3 only, regardless of overall operand length: 8, 16 or 32 bits.

PF (Parity Flag, bit 2) PF is set if the low-order eight bits of the operation contain an even number of "1's" (even parity). PF is reset if the low-order eight bits have odd parity. PF is a function of only the low-order eight bits, regardless of operand size.

CF (Carry Flag, bit 0) CF is set if the operation resulted in a carry out of (addition), or a borrow into (subtraction) the high-order bit. Otherwise, CF is reset. For 8-, 16- or 32-bit operations, CF is set according to carry/borrow at bit 7, 15 or 31, respectively.

4.2.2.4 Segment Registers

The six addressable segments are defined by the segment registers CS, SS, DS, ES, FS and GS.
The selector in CS indicates the current code segment; the selector in SS indicates the current stack segment; the selectors in DS, ES, FS and GS indicate the current data segments.

Six 16-bit segment registers hold segment selector values identifying the currently addressable memory segments. In protected mode, each segment may range in size from one byte up to the entire linear and physical address space of the machine, 4 Gbytes (2^32 bytes). In real address mode, the maximum segment size is fixed at 64 Kbytes (2^16 bytes).

4.2.2.5 Segment Descriptor Cache Registers

The segment descriptor cache registers are not programmer visible, yet it is very useful to understand their content. A programmer-invisible descriptor cache register is associated with each programmer-visible segment register, as shown by Figure 4-4. Each descriptor cache register holds a 32-bit base address, a 32-bit segment limit, and the other necessary segment attributes.

[Figure 4-4. Intel486™ Processor Segment Registers and Associated Descriptor Cache Registers (242202-11). Each segment register (CS, SS, DS, ES, FS, GS) holds a 16-bit selector and is paired with a descriptor register, loaded automatically, that holds the physical base address, the segment limit, and other segment attributes from the descriptor.]

When a selector value is loaded into a segment register, the associated descriptor cache register is automatically updated with the correct information. In Real Mode, only the base address is updated directly (by shifting the selector value four bits to the left), because the segment maximum limit and attributes are fixed in Real Mode.
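The Real Mode rule just described (base address equals the selector shifted left four bits) can be illustrated with a small calculation. The sketch below is a simplified model, not processor logic: the function names are hypothetical, offsets are limited to 16 bits to reflect the fixed 64-Kbyte segment size, and the optional `a20_masked` flag is an assumption that imitates the 1-Mbyte address wraparound attributed to the 8086 in the A20M# pin description.

```python
def real_mode_base(selector):
    """Descriptor-cache base in Real Mode: selector shifted left by four bits."""
    return (selector & 0xFFFF) << 4


def real_mode_linear(selector, offset, a20_masked=False):
    """Linear address = base + 16-bit offset; optionally force address bit 20 to 0."""
    linear = real_mode_base(selector) + (offset & 0xFFFF)
    if a20_masked:
        linear &= ~(1 << 20)   # clear bit 20 to emulate wraparound at 1 Mbyte
    return linear


print(hex(real_mode_base(0x1234)))                    # 0x12340
print(hex(real_mode_linear(0xFFFF, 0x0010)))          # 0x100000
print(hex(real_mode_linear(0xFFFF, 0x0010, True)))    # 0x0
```

The second and third lines show the same selector and offset with and without masking: without it the sum reaches the first byte above 1 Mbyte, and with bit 20 cleared it wraps to address 0.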
In Protected Mode, the base address, the limit, and the attributes are all updated per the contents of the segment descriptor indexed by the selector.

Whenever a memory reference occurs, the segment descriptor cache register associated with the segment being used is automatically involved with the memory reference. The 32-bit segment base address becomes a component of the linear address calculation, the 32-bit limit is used for the limit-check operation, and the attributes are checked against the type of memory reference requested.

4.2.3 SYSTEM LEVEL REGISTERS

Figure 4-5 illustrates the system level registers, which control operation of the on-chip cache, the on-chip floating point unit (on the Intel486 DX, IntelDX2, and IntelDX4 processors) and the segmentation and paging mechanisms. These registers are only accessible to programs running at privilege level 0, the highest privilege level.

The system level registers include three control registers and four segmentation base registers. The three control registers are CR0, CR2 and CR3. CR1 is reserved for future Intel processors. The four segmentation base registers are the Global Descriptor Table Register (GDTR), the Interrupt Descriptor Table Register (IDTR), the Local Descriptor Table Register (LDTR) and the Task State Segment Register (TR).

[Figure 4-5. System Level Registers (242202-12). Shows control registers CR0, CR2 (page fault linear address register), CR3 (page directory base register) and CR4; the system address registers, each with a 32-bit linear base address and a 16-bit limit; and the system segment registers TR and LDTR, each a 16-bit selector with an automatically loaded descriptor register holding a 32-bit linear base address, a 20-bit segment limit, and attributes.]

[Figure 4-6. Control Register 0 (242202-13). Shaded bits are Intel Reserved; do not define.]

NOTE: See Section 4.2.7 "Compatibility."

4.2.3.1 Control Registers

Control Register 0 (CR0)

CR0, shown in Figure 4-6, contains 10 bits for control and status purposes. The function of the bits in CR0 can be categorized as follows:

• Intel486 Processor Operating Modes: PG, PE (Table 4-2)
• On-Chip Cache Control Modes: CD, NW (Table 4-3)
• On-Chip Floating Point Unit: NE, TS, EM, MP (Tables 4-4, 4-5, and 4-6). (Also applies for Intel486 SX and IntelSX2 processors.)
• Alignment Check Control: AM
• Supervisor Write Protect: WP

Table 4-2. Intel486™ Processor Operating Modes

PG  PE  Mode
0   0   REAL Mode. Exact 8086 processor semantics, with 32-bit extensions available with prefixes.
0   1   Protected Mode. Exact 80286 processor semantics, plus 32-bit extensions through both prefixes and "default" prefix setting associated with code segment descriptors. Also, a sub-mode is defined to support a virtual 8086 processor within the context of the extended 80286 processor protection model.
1   0   UNDEFINED. Loading CR0 with this combination of PG and PE bits will raise a GP fault with error code 0.
1   1   Paged Protected Mode. All the facilities of Protected Mode, with paging enabled underneath segmentation.

Table 4-3. On-Chip Cache Control Modes

CD  NW  Operating Mode
1   1   Cache fills disabled, write-through and invalidates disabled.
1   0   Cache fills disabled, write-through and invalidates enabled.
0   1   INVALID. If CR0 is loaded with this configuration of bits, a GP fault with error code 0 is raised.
0   0   Cache fills enabled, write-through and invalidates enabled.

The low-order 16 bits of CR0 are also known as the Machine Status Word (MSW), for compatibility with the 80286 processor protected mode. LMSW and SMSW (load and store MSW) instructions are taken as special aliases of the load and store CR0 operations, where only the low-order 16 bits of CR0 are involved.
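Tables 4-2 and 4-3 amount to two small lookup tables keyed on pairs of CR0 bits. The following sketch decodes a CR0 value accordingly. The positions of PG, CD, and NW (bits 31, 30, and 29) are those given in the bit descriptions of this section; PE at bit 0 is an assumption here, and the function names and abbreviated mode strings are hypothetical.

```python
PG, CD, NW, PE = 31, 30, 29, 0   # CR0 bit positions (PE at bit 0 is assumed here)

OPERATING_MODES = {              # keyed on (PG, PE), after Table 4-2
    (0, 0): "Real Mode",
    (0, 1): "Protected Mode",
    (1, 0): "Undefined (GP fault)",
    (1, 1): "Paged Protected Mode",
}

CACHE_MODES = {                  # keyed on (CD, NW), after Table 4-3
    (1, 1): "fills disabled, write-through and invalidates disabled",
    (1, 0): "fills disabled, write-through and invalidates enabled",
    (0, 1): "Invalid (GP fault)",
    (0, 0): "fills enabled, write-through and invalidates enabled",
}


def bit(value, position):
    """Extract a single bit from an integer."""
    return (value >> position) & 1


def decode_cr0(cr0):
    """Return (operating mode, cache mode, machine status word) for a CR0 value."""
    mode = OPERATING_MODES[(bit(cr0, PG), bit(cr0, PE))]
    cache = CACHE_MODES[(bit(cr0, CD), bit(cr0, NW))]
    msw = cr0 & 0xFFFF           # low-order 16 bits, the 80286-compatible MSW
    return mode, cache, msw


print(decode_cr0(0x60000010))    # CD = NW = 1, PG = PE = 0
print(decode_cr0(0x80000001))    # PG = PE = 1, CD = NW = 0
```

The third element of the result is the part of CR0 that the LMSW and SMSW instructions operate on.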
The LMSW and SMSW instructions in the Intel486 processor work in an identical fashion to the LMSW and SMSW instructions in the 80286 processor (i.e., they only operate on the low-order 16 bits of CR0 and ignore the new bits). New Intel486 processor operating systems should use the MOV CR0, Reg instruction.

NOTE: All Intel386 and Intel486 processor CR0 bits, except for ET and NE, are upwardly compatible with the 80286 processor, because they are in register bits not defined in the 80286 processor. For strict compatibility with the 80286 processor, the load machine status word (LMSW) instruction is defined to not change the ET or NE bits.

The defined CR0 bits are described below.

PG (Paging Enable, bit 31) The PG bit is used to indicate whether paging is enabled (PG = 1) or disabled (PG = 0). (See Table 4-2.)

CD (Cache Disable, bit 30) The CD bit is used to enable the on-chip cache. When CD = 1, the cache will not be filled on cache misses. When CD = 0, cache fills may be performed on misses. (See Table 4-3.)

The state of the CD bit, the cache enable input pin (KEN#), and the relevant page cache disable (PCD) bit determine if a line read in response to a cache miss will be installed in the cache. A line is installed in the cache only if CD = 0 and KEN# and PCD are both zero. The relevant PCD bit comes from either the page table entry, page directory entry or control register 3. (Refer to section 7.6, "Page Cacheability.") CD is set to one after RESET.

NW (Not Write-Through, bit 29) The NW bit enables on-chip cache write-throughs and write-invalidate cycles (NW = 0). When NW = 0, all writes, including cache hits, are sent out to the pins. Invalidate cycles are enabled when NW = 0. During an invalidate cycle a line will be removed from the cache if the invalidate address hits in the cache. (See Table 4-3.)

When NW = 1, write-throughs and write-invalidate cycles are disabled. A write will not be sent to the pins if the write hits in the cache.
With NW = 1 the only write cycles that reach the external bus are cache misses. Write hits with NW = 1 will never update main memory. Invalidate cycles are ignored when NW = 1.

AM (Alignment Mask, bit 18)
The AM bit controls whether the alignment check (AC) bit in the flag register (EFLAGS) can allow an alignment fault. AM = 0 disables the AC bit. AM = 1 enables the AC bit. AM = 0 is the Intel386 processor compatible mode.

Intel386 processor software may load incorrect data into the AC bit in the EFLAGS register. Setting AM = 0 will prevent AC faults from occurring before the Intel486 processor has created the AC interrupt service routine.

WP (Write Protect, bit 16)
WP protects read-only pages from supervisor write access. The Intel386 processor allows a read-only page to be written from privilege levels 0-2. The Intel486 processor is compatible with the Intel386 processor when WP = 0. WP = 1 forces a fault on a write to a read-only page from any privilege level. Operating systems with Copy-on-Write features can be supported with the WP bit. (Refer to section 6.4.3, "Page Level Protection (R/W, U/S Bits).")

NOTE:
Refer to Tables 4-4, 4-5, and 4-6 for values and interpretation of NE, EM, TS, and MP bits, in addition to the sections below.

NE (Numerics Exception, bit 5)

Intel486 SX and IntelSX2 Processor NE Bit
For Intel486 SX and IntelSX2 processors, interrupt 7 will be generated upon encountering any floating point instruction regardless of the value of the NE bit. It is recommended that NE = 1 for normal operation of the Intel486 processor.

Intel486 DX, IntelDX2, and IntelDX4 Processor NE Bit
For Intel486 DX, IntelDX2, and IntelDX4 processors, the NE bit controls whether unmasked floating point exceptions (UFPE) are handled through interrupt vector 16 (NE = 1) or through an external interrupt (NE = 0). NE = 0 (default at reset) supports the DOS operating system error reporting scheme from the 8087, Intel287 and Intel387 math coprocessors.
In DOS systems, math coprocessor errors are reported via external interrupt vector 13. DOS uses interrupt vector 16 for an operating system call. (Refer to sections 9.2.15, "Numeric Error Reporting (FERR#, IGNNE#)," and 10.2.14, "Floating Point Error Handling.")

For any UFPE, the floating point error output pin (FERR#) will be driven active.

For NE = 0, the Intel486 DX, IntelDX2, and IntelDX4 processors work in conjunction with the ignore numeric error input (IGNNE#) and the FERR# output pins. When a UFPE occurs and the IGNNE# input is inactive, the Intel486 DX, IntelDX2, and IntelDX4 processors freeze immediately before executing the next floating point instruction. An external interrupt controller will supply an interrupt vector when FERR# is driven active. The UFPE is ignored if IGNNE# is active and floating point execution continues.

NOTE:
The freeze does not take place if the next instruction is one of the control instructions FNCLEX, FNINIT, FNSAVE, FNSTENV, FNSTCW, FNSTSW, FNSTSW AX, FNENI, FNDISI and FNSETPM. The freeze does occur if the next instruction is WAIT.

For NE = 1, any UFPE will result in a software interrupt 16, immediately before executing the next non-control floating point or WAIT instruction. The ignore numeric error input (IGNNE#) signal will be ignored.

TS (Task Switch, bit 3)

Intel486 SX and IntelSX2 Processor TS Bit
For Intel486 SX and IntelSX2 processors, the TS bit is set whenever a task switch operation is performed. Execution of floating point instructions with TS = 1 will cause a Device Not Available (DNA) fault (trap vector 7). With MP = 0, the value of the TS bit is a don't care for the WAIT instructions, i.e., these instructions will not generate trap 7.

Intel486 DX, IntelDX2, and IntelDX4 Processor TS Bit
For Intel486 DX, IntelDX2, and IntelDX4 processors, the TS bit is set whenever a task switch operation is performed.
Execution of floating point instructions with TS = 1 will cause a Device Not Available (DNA) fault (trap vector 7). If TS = 1 and MP = 1 (monitor coprocessor in CR0), a WAIT instruction will cause a DNA fault.

EM (Emulate Coprocessor, bit 2)

Intel486 SX and IntelSX2 Processor EM Bit
For Intel486 SX and IntelSX2 processors, the EM bit should be set to one. This will cause the Intel486 SX and IntelSX2 processors to trap via interrupt vector 7 (Device Not Available) to a software exception handler whenever they encounter a floating point instruction. If the EM bit is 0 for the Intel486 SX and IntelSX2 processors, the system will hang. (See Tables 4-4 and 4-5.)

Intel486 DX, IntelDX2, and IntelDX4 Processor EM Bit
For the Intel486 DX, IntelDX2, and IntelDX4 processors, the EM bit determines whether floating point instructions are trapped (EM = 1) or executed. If EM = 1, all floating point instructions will cause fault 7. If EM = 0, the on-chip floating point unit will be used.

NOTE:
WAIT instructions are not affected by the state of EM. (See Tables 4-4 and 4-6.)

MP (Monitor Coprocessor, bit 1)

Intel486 SX and IntelSX2 Processor MP Bit
For Intel486 SX and IntelSX2 processors, the MP bit must be set to zero (MP = 0). The MP bit is used in conjunction with the TS bit to determine if WAIT instructions should trap. For MP = 0, the value of TS is a don't care for these types of instructions. (See Tables 4-4 and 4-5.)

Intel486 DX, IntelDX2, and IntelDX4 Processor MP Bit
For the Intel486 DX, IntelDX2, and IntelDX4 processors, the MP bit is used in conjunction with the TS bit to determine if WAIT instructions cause fault 7. (See Table 4-6.) The TS bit is set to 1 on task switches by the Intel486 DX, IntelDX2, and IntelDX4 processors. Floating point instructions are not affected by the state of the MP bit. It is recommended that the MP bit be set to one for normal processor operation.
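The NE, TS, EM and MP rules described above for the Intel486 DX, IntelDX2, and IntelDX4 processors can be condensed into a few predicates. The C sketch below is a minimal restatement of those rules, using the bit positions given in this section; all names are hypothetical, and the sketch does not model the control instructions that are exempt from the freeze.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions from the CR0 bit descriptions; names are illustrative. */
#define CR0_MP (1u << 1)
#define CR0_EM (1u << 2)
#define CR0_TS (1u << 3)
#define CR0_NE (1u << 5)

/* A floating point instruction raises exception 7 when EM = 1 or TS = 1. */
bool fp_insn_traps(uint32_t cr0)
{
    return (cr0 & (CR0_EM | CR0_TS)) != 0;
}

/* WAIT raises exception 7 only when TS = 1 and MP = 1; EM has no effect. */
bool wait_insn_traps(uint32_t cr0)
{
    return (cr0 & CR0_TS) && (cr0 & CR0_MP);
}

/* Hypothetical labels for how an unmasked floating point exception is reported. */
enum ufpe_report { UFPE_INTERRUPT_16, UFPE_EXTERNAL_INTERRUPT, UFPE_IGNORED };

/* NE = 1: interrupt 16, IGNNE# is ignored.
 * NE = 0: external interrupt via FERR#, unless IGNNE# is active. */
enum ufpe_report ufpe_reporting(uint32_t cr0, bool ignne_active)
{
    if (cr0 & CR0_NE)
        return UFPE_INTERRUPT_16;
    return ignne_active ? UFPE_IGNORED : UFPE_EXTERNAL_INTERRUPT;
}
```

The two trap predicates reproduce all eight EM/TS/MP combinations tabulated in Table 4-6.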
PE (Protection Enable, bit 0)
The PE bit enables the segment-based protection mechanism. If PE = 1, protection is enabled. When PE = 0, the Intel486 processor operates in REAL mode, with segment-based protection disabled, and addresses formed as in an 8086 processor. (Refer to Table 4-2.)

Table 4-4. Recommended Values of NE, EM, TS, and MP Bits in CR0 Register for Intel486™ SX and IntelSX2™ Processors

    CR0 Bit          Instruction Type
NE  EM  TS  MP       FP       WAIT
1   1   0   0        Trap 7   Execute
1   1   1   0        Trap 7   Execute

Table 4-5. Recommended Values of the Floating Point Related Bits for All Intel486™ Processors

CR0 Bit   Intel486 SX and IntelSX2™ Processors   Intel486 DX, IntelDX2™, and IntelDX4™ Processors
EM        1                                      0
MP        0                                      1
NE        1                                      0, for DOS Systems; 1, for User-Defined Exception Handler

Table 4-6. Interpretation of Different Combinations of the EM, TS and MP Bits for All Intel486™ Processors

  CR0 Bit        Instruction Type
EM  TS  MP       Floating Point   Wait
0   0   0        Execute          Execute
0   0   1        Execute          Execute
0   1   0        Exception 7      Execute
0   1   1        Exception 7      Exception 7
1   0   0        Exception 7      Execute
1   0   1        Exception 7      Execute
1   1   0        Exception 7      Execute
1   1   1        Exception 7      Exception 7

NOTE:
For Intel486 DX, IntelDX2™ and IntelDX4™ processors, if MP = 1 and TS = 1, the processor will generate a trap 7 so that the system software can save the floating point status of the old task.

Figure 4-7. Control Registers 2, 3 and 4
(Figure not reproduced. It shows CR2, the Page Fault Linear Address Register, bits 31-0; CR3, the Page Directory Base Register, bits 31-12; and CR4. Shaded fields indicate Intel Reserved; do not define. NOTE: See section 4.2.7, "Compatibility.")

Control Register 1 (CR1)
CR1 is reserved for use in future Intel processors.

Control Register 2 (CR2)
CR2, shown in Figure 4-7, holds the 32-bit linear address that caused the last page fault detected. The error code pushed onto the page fault handler's stack when it is invoked provides additional status information on this page fault.
Control Register 3 (CR3)
CR3, shown in Figure 4-7, contains the physical base address of the page directory table. The page directory is always page aligned (4 Kbyte-aligned). This alignment is enforced by only storing bits 12-31 in CR3.

In the Intel486 processor, CR3 contains two bits, page write-through (PWT) (bit 3) and page cache disable (PCD) (bit 4). The page table entry (PTE) and page directory entry (PDE) also contain PWT and PCD bits. PWT and PCD control page cacheability. When a page is accessed in external memory, the state of PWT and PCD are driven out on the PWT and PCD pins. The source of PWT and PCD can be CR3, the PTE or the PDE. PWT and PCD are sourced from CR3 when the PDE is being updated. When paging is disabled (PG = 0 in CR0), PCD and PWT are assumed to be 0, regardless of their state in CR3.

A task switch through a task state segment (TSS) which changes the values in CR3, or an explicit load into CR3 with any value, will invalidate all cached page table entries in the translation lookaside buffer (TLB).

The page directory base address in CR3 is a physical address. The page directory can be paged out while its associated task is suspended, but the operating system must ensure that the page directory is resident in physical memory before the task is dispatched. The entry in the TSS for CR3 has a physical address, with no provision for a present bit. This means that the page directory for a task must be resident in physical memory. The CR3 image in a TSS must point to this area, before the task can be dispatched through its TSS.

Control Register 4 (CR4)
CR4, shown in Figure 4-7, contains bits that enable virtual mode extensions and protected mode virtual interrupts.

VME (Virtual-8086 Mode Extensions, bit 0 of CR4)
Setting this bit to 1 enables support for a virtual interrupt flag in virtual-8086 mode. This feature can improve the performance of virtual-8086 applications by eliminating the overhead of faulting to a virtual-8086 monitor for emulation of certain operations. (Refer to Appendix A, "Advanced Features.")

PVI (Protected-Mode Virtual Interrupts, bit 1 of CR4)
Setting this bit to 1 enables support for a virtual interrupt flag in protected mode. This feature can enable some programs designed for execution at privilege level 0 to execute at privilege level 3. (Refer to Appendix A, "Advanced Features.")

PSE (Page Size Extensions, bit 4 of CR4)
Setting this bit to 1 enables 4-Mbyte pages. (Refer to Appendix A, "Advanced Features.")

NOTE:
The CR4 features (VME, PVI, and PSE) should be qualified using the CPUID instruction and the CPUID Feature Flag. The CPUID instruction and CPUID Feature Flag are specific to particular models in the Intel486 processor family. (Refer to Appendix B, "Feature Determination.")

4.2.3.2 System Address Registers
Four special registers are defined to reference the tables or segments supported by the 80286, Intel386, and Intel486 processors' protection model. These tables or segments are: GDT (Global Descriptor Table), IDT (Interrupt Descriptor Table), LDT (Local Descriptor Table), TSS (Task State Segment).

The addresses of these tables and segments are stored in special registers, the System Address and System Segment Registers, illustrated in Figure 4-5. These registers are named GDTR, IDTR, LDTR and TR respectively. Section 6, "Protected Mode Architecture," describes how to use these registers.

System Address Registers: GDTR and IDTR
The GDTR and IDTR hold the 32-bit linear-base address and 16-bit limit of the GDT and IDT, respectively.

Because the GDT and IDT segments are global to all tasks in the system, the GDT and IDT are defined by 32-bit linear addresses (subject to page translation if paging is enabled) and 16-bit limit values.

System Segment Registers: LDTR and TR
The LDTR and TR hold the 16-bit selector for the LDT descriptor and the TSS descriptor, respectively.

Because the LDT and TSS segments are task specific segments, the LDT and TSS are defined by selector values stored in the system segment registers.

NOTE:
A programmer-invisible segment descriptor register is associated with each system segment register.

4.2.4 FLOATING POINT REGISTERS
Figure 4-8 shows the floating point register set. The on-chip FPU contains eight data registers, a tag word, a control register, a status register, an instruction pointer and a data pointer.

The operation of the Intel486 DX, IntelDX2, and IntelDX4 processor on-chip floating point unit is exactly the same as the Intel387 math coprocessor. Software written for the Intel387 math coprocessor will run on the on-chip floating point unit (FPU) without any modifications.

4.2.4.1 Floating Point Data Registers
Floating point computations use the Intel486 DX, IntelDX2, and IntelDX4 processor FPU data registers. These eight 80-bit registers provide the equivalent capacity of twenty 32-bit registers. Each of the eight data registers is divided into "fields" corresponding to the FPU's extended-precision data type.

Figure 4-8. Floating Point Registers
(Figure not reproduced. It shows the eight 80-bit data registers R0-R7, each with a sign bit (bit 79), an exponent field (bits 78-64) and a significand field (bits 63-0), with a two-bit tag field for each register; the 16-bit control register, status register and tag word; and the 48-bit instruction pointer and data pointer.)

The FPU's register set can be accessed either as a stack, with instructions operating on the top one or two stack elements, or as a fixed register set, with instructions operating on explicitly designated registers. The TOP field in the status word identifies the current top-of-stack register. A "push" operation decrements TOP by one and loads a value into the new top register. A "pop" operation stores the value from the current top register and then increments TOP by one. Like other Intel486 DX, IntelDX2, and IntelDX4 processor stacks in memory, the FPU register stack grows "down" toward lower-addressed registers.

Instructions may address the data registers either implicitly or explicitly. Many instructions operate on the register at the TOP of the stack. These instructions implicitly address the register at which TOP points. Other instructions allow the programmer to explicitly specify which register to use. This explicit register addressing is also relative to TOP.

4.2.4.2 Floating Point Tag Word
The tag word marks the content of each numeric data register, as shown in Figure 4-9. Each two-bit tag represents one of the eight data registers. The principal function of the tag word is to optimize the FPU's performance and stack handling by making it possible to distinguish between empty and nonempty register locations. It also enables exception handlers to check the contents of a stack location without the need to perform complex decoding of the actual data.

Figure 4-9. Floating Point Tag Word
(Figure not reproduced. It shows the eight two-bit tag fields, tag (0) through tag (7), in the 16-bit tag word.)

NOTE:
The index i of tag (i) is not top-relative. A program typically uses the "top" field of Status Word to determine which tag (i) field refers to logical top of stack.

TAG VALUES:
00 = Valid
01 = Zero
10 = QNaN, SNaN, Infinity, Denormal and Unsupported Formats
11 = Empty

4.2.4.3 Floating Point Status Word
The 16-bit status word reflects the overall state of the FPU. The status word is shown in Figure 4-10 and is located in the status register.
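To make the relationship between TOP, stack-relative register addressing and the tag word concrete, here is a small C sketch. It assumes the TOP field occupies bits 13-11 of the status word, that tag (n) occupies bits 2n+1..2n of the tag word, and that TOP wraps modulo 8 on push and pop; these layout details and all names below are stated as assumptions for illustration only.

```c
#include <stdint.h>

/* Tag encodings listed under TAG VALUES; enumerator names are illustrative. */
enum fpu_tag { TAG_VALID = 0, TAG_ZERO = 1, TAG_SPECIAL = 2, TAG_EMPTY = 3 };

/* Extract TOP (assumed to be bits 13-11 of the status word). */
unsigned fpu_top(uint16_t status_word)
{
    return (status_word >> 11) & 7u;
}

/* Physical register number Rn selected by a stack-relative index,
 * since explicit register addressing is relative to TOP. */
unsigned fpu_physical_reg(uint16_t status_word, unsigned st_index)
{
    return (fpu_top(status_word) + st_index) & 7u;
}

/* Tag of physical register n (assumed layout: tag (n) in bits 2n+1..2n).
 * The index is not top-relative. */
enum fpu_tag fpu_tag_of(uint16_t tag_word, unsigned phys_reg)
{
    return (enum fpu_tag)((tag_word >> (2u * (phys_reg & 7u))) & 3u);
}

/* A push decrements TOP; a pop increments it (both assumed to wrap modulo 8). */
unsigned fpu_top_after_push(unsigned top) { return (top - 1u) & 7u; }
unsigned fpu_top_after_pop(unsigned top)  { return (top + 1u) & 7u; }
```

A handler that wants the tag of the logical top of stack would, under these assumptions, call fpu_tag_of(tag_word, fpu_top(status_word)).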
Figure 4-10. Floating Point Status Word
(Figure not reproduced. Fields, from bit 15 down to bit 0: busy (B); condition code C3; top of stack pointer (TOP); condition codes C2, C1, C0; error summary status (ES); stack flag (SF); and the exception flags: precision, underflow, overflow, zero divide, denormalized operand, invalid operation.)

ES is set if any unmasked exception bit is set; cleared otherwise.
See Table 4-7 for interpretation of condition code.
TOP values:
000 = Register 0 is Top of Stack
001 = Register 1 is Top of Stack
•
•
•
111 = Register 7 is Top of Stack
For definitions of exceptions, refer to the section entitled "Exception Handling".

The B bit (Busy, bit 15) is included for 8087 compatibility. The B bit reflects the contents of the ES bit (bit 7 of the status word).

Bits 13-11 (TOP) point to the FPU register that is the current top-of-stack.

The four numeric condition code bits, C0-C3, are similar to the flags in EFLAGS. Instructions that perform arithmetic operations update C0-C3 to reflect the outcome. The effects of these instructions on the condition codes are summarized in Table 4-7 through Table 4-10.

Table 4-7. Floating Point Condition Code Interpretation

Columns: Instruction; C0 (S); C3 (Z); C1 (A); C2 (C)

FPREM, FPREM1
    C0, C3, C1: Three least significant bits of quotient (see Table 4-8): C0 = Q2, C3 = Q1, C1 = Q0 or O/U#
    C2: Reduction; 0 = complete, 1 = incomplete

FCOM, FCOMP, FCOMPP, FTST, FUCOM, FUCOMP, FUCOMPP, FICOM, FICOMP
    C0, C3: Result of comparison (see Table 4-9)
    C1: Zero or O/U#
    C2: Operand is not comparable

FXAM
    C0, C3: Operand class (see Table 4-10)
    C1: Sign or O/U#
    C2: Operand class

FCHS, FABS, FXCH, FINCTOP, FDECTOP, Constant loads, FXTRACT, FLD, FILD, FBLD, FSTP (ext real)
    C0, C3: UNDEFINED
    C1: Zero or O/U#
    C2: UNDEFINED

FIST, FBSTP, FRNDINT, FST, FSTP, FADD, FMUL, FDIV, FDIVR, FSUB, FSUBR, FSCALE, FSQRT, FPATAN, F2XM1, FYL2X, FYL2XP1
    C0, C3: UNDEFINED
    C1: Roundup or O/U#
    C2: UNDEFINED

FPTAN, FSIN, FCOS, FSINCOS
    C0, C3: UNDEFINED
    C1: Roundup or O/U#, UNDEFINED if C2 = 1
    C2: Reduction; 0 = complete, 1 = incomplete

FLDENV, FRSTOR
    C0, C3, C1, C2: Each bit loaded from memory

FINIT
    C0, C3, C1, C2: Clears these bits

FLDCW, FSTENV, FSTCW, FSTSW, FCLEX, FSAVE
    C0, C3, C1, C2: UNDEFINED

NOTES:
O/U#: When both IE and SF bits of status word are set,
indicating a stack exception, this bit distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0).
Reduction: If FPREM or FPREM1 produces a remainder that is less than the modulus, reduction is complete. When reduction is incomplete the value at the top of the stack is a partial remainder, which can be used as input to further reduction. For FPTAN, FSIN, FCOS, and FSINCOS, the reduction bit is set if the operand at the top of the stack is too large. In this case the original operand remains at the top of the stack.
Roundup: When the PE bit of the status word is set, this bit indicates whether the last rounding in the instruction was upward.
UNDEFINED: Do not rely on finding any specific value in these bits. (See section 4.2.7, "Compatibility.")

Table 4-8. Condition Code Interpretation after FPREM and FPREM1 Instructions

 Condition Code      Interpretation after FPREM and FPREM1
C2  C3  C1  C0
1   X   X   X        Incomplete Reduction: further iteration required for complete reduction
0   Q1  Q0  Q2       Complete Reduction: C0, C3, and C1 contain the three least-significant bits of the quotient

C2  C3 (Q1)  C1 (Q0)  C0 (Q2)   Q MOD 8
0   0        0        0         0
0   0        1        0         1
0   1        0        0         2
0   1        1        0         3
0   0        0        1         4
0   0        1        1         5
0   1        0        1         6
0   1        1        1         7

Table 4-9. Condition Code Resulting from Comparison

Order            C3  C2  C0
TOP > Operand    0   0   0
TOP < Operand    0   0   1
TOP = Operand    1   0   0
Unordered        1   1   1

Table 4-10. Condition Code Defining Operand Class

C3  C2  C1  C0   Value at TOP
0   0   0   0    + Unsupported
0   0   0   1    + NaN
0   0   1   0    - Unsupported
0   0   1   1    - NaN
0   1   0   0    + Normal
0   1   0   1    + Infinity
0   1   1   0    - Normal
0   1   1   1    - Infinity
1   0   0   0    + 0
1   0   0   1    + Empty
1   0   1   0    - 0
1   0   1   1    - Empty
1   1   0   0    + Denormal
1   1   1   0    - Denormal

Bit 7 is the error summary (ES) status bit. The ES bit is set if any unmasked exception bit (bits 0-5 in the status word) is set; ES is clear otherwise. The FERR# (floating point error) signal is asserted when ES is set.

Bit 6 is the stack flag (SF).
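As an illustration of Table 4-9 and of the ES rule, the C sketch below decodes a comparison result from a stored status word and recomputes ES from the exception flags and the control word masks (bits 0-5 of each). The condition code positions used here (C0 = bit 8, C2 = bit 10, C3 = bit 14) are an assumption of this sketch, and the function and enumerator names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed condition code positions in the status word. */
#define FPU_SW_C0 (1u << 8)
#define FPU_SW_C2 (1u << 10)
#define FPU_SW_C3 (1u << 14)

/* Hypothetical labels for the rows of Table 4-9. */
enum fpu_order { FPU_GREATER, FPU_LESS, FPU_EQUAL, FPU_UNORDERED, FPU_ORDER_UNKNOWN };

/* Decode C3, C2, C0 after a compare instruction (Table 4-9). */
enum fpu_order fpu_compare_result(uint16_t sw)
{
    switch (sw & (FPU_SW_C3 | FPU_SW_C2 | FPU_SW_C0)) {
    case 0:                                  return FPU_GREATER;   /* TOP > Operand */
    case FPU_SW_C0:                          return FPU_LESS;      /* TOP < Operand */
    case FPU_SW_C3:                          return FPU_EQUAL;     /* TOP = Operand */
    case FPU_SW_C3 | FPU_SW_C2 | FPU_SW_C0:  return FPU_UNORDERED;
    default:                                 return FPU_ORDER_UNKNOWN; /* not in Table 4-9 */
    }
}

/* ES is set if any exception flag (status word bits 0-5) is set while its
 * mask bit (control word bits 0-5) is clear. */
bool fpu_error_summary(uint16_t sw, uint16_t cw)
{
    return (sw & ~cw & 0x3Fu) != 0;
}
```

The switch deliberately ignores C1, TOP and the exception flags, so it can be applied directly to a status word stored to memory.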
The SF bit is used to distinguish invalid operations due to stack overflow or underflow. When SF is set, bit 9 (C1) distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0).

Table 4-11 shows the six exception flags in bits 0-5 of the status word. Bits 0-5 are set to indicate that the FPU has detected an exception while executing an instruction.

The six exception flags in the status word can be individually masked by mask bits in the FPU control word. Table 4-11 lists the exception conditions, and their causes in order of precedence. Table 4-11 also shows the action taken by the FPU if the corresponding exception flag is masked.

An exception that is not masked by the control word will cause three things to happen: the corresponding exception flag in the status word will be set, the ES bit in the status word will be set and the FERR# output signal will be asserted. When the Intel486 DX, IntelDX2, or IntelDX4 processor attempts to execute another floating point or WAIT instruction, exception 16 occurs if NE = 1 in control register 0, or an external interrupt happens if NE = 0. The exception condition must be resolved via an interrupt service routine. The FPU saves the address of the floating point instruction that caused the exception and the address of any memory operand required by that instruction in the instruction and data pointers. (See section 4.2.4.4, "Instruction and Data Pointers.")

Note that when a new value is loaded into the status word by the FLDENV (load environment) or FRSTOR (restore state) instruction, the value of ES (bit 7) and its reflection in the B bit (bit 15) are not derived from the values loaded from memory. The values of ES and B are dependent upon the values of the exception flags in the status word and their corresponding masks in the control word. If ES is set in such a case, the FERR# output of the Intel486 DX, IntelDX2, or IntelDX4 processor is activated immediately.

Table 4-11. FPU Exceptions

Invalid Operation
    Cause: Operation on a signaling NaN, unsupported format, indeterminate form (0 · ∞, 0/0, (+∞) + (-∞), etc.), or stack overflow/underflow (SF is also set).
    Default action (if exception is masked): Result is a quiet NaN, integer indefinite, or BCD indefinite.

Denormalized Operand
    Cause: At least one of the operands is denormalized, i.e., it has the smallest exponent but a non-zero significand.
    Default action (if exception is masked): Normal processing continues.

Zero Divisor
    Cause: The divisor is zero while the dividend is a non-infinite, non-zero number.
    Default action (if exception is masked): Result is ∞.

Overflow
    Cause: The result is too large in magnitude to fit in the specified format.
    Default action (if exception is masked): Result is largest finite value or ∞.

Underflow
    Cause: The true result is non-zero but too small to be represented in the specified format, and, if underflow exception is masked, denormalization causes loss of accuracy.
    Default action (if exception is masked): Result is denormalized or zero.

Inexact Result (Precision)
    Cause: The true result is not exactly representable in the specified format (e.g., 1/3); the result is rounded according to the rounding mode.
    Default action (if exception is masked): Normal processing continues.

4.2.4.4 Instruction and Data Pointers
Because the FPU operates in parallel with the ALU (in the Intel486 DX, IntelDX2, and IntelDX4 processors the arithmetic and logic unit (ALU) consists of the base architecture registers), any errors detected by the FPU may be reported after the ALU has executed the floating point instruction that caused them. To allow identification of the failing numeric instruction, the Intel486 DX, IntelDX2, and IntelDX4 processors contain two pointer registers that supply the address of the failing numeric instruction and the address of its numeric memory operand (if appropriate).

The instruction and data pointers are provided for user-written error handlers. These registers are accessed by the FLDENV (load environment), FSTENV (store environment), FSAVE (save state) and FRSTOR (restore state) instructions.
Whenever the Intel486 DX, IntelDX2, and IntelDX4 processors decode a new floating point instruction, they save the instruction address (including any prefixes that may be present), the address of the operand (if present) and the opcode.

The instruction and data pointers appear in one of four formats depending on the operating mode of the Intel486 DX, IntelDX2, and IntelDX4 processors (protected mode or real-address mode) and depending on the operand-size attribute in effect (32-bit operand or 16-bit operand). When the Intel486 DX, IntelDX2, or IntelDX4 processor is in the virtual-86 mode, the real address mode formats are used. The four formats are shown in Figure 4-11 through Figure 4-14. The floating point instructions FLDENV, FSTENV, FSAVE and FRSTOR are used to transfer these values to and from memory. Note that the value of the data pointer is undefined if the prior floating point instruction did not have a memory operand.

NOTE:
The operand size attribute is the D bit in a segment descriptor.

Figure 4-11. Protected Mode FPU Instruction and Data Pointer Image in Memory, 32-Bit Format
(32-Bit Protected Mode Format; byte offsets in hexadecimal, fields listed from high-order to low-order bits)
0    Intel Reserved | Control Word
4    Intel Reserved | Status Word
8    Intel Reserved | Tag Word
C    IP Offset
10   0000 | OPCODE 10..0 | CS Selector
14   Data Operand Offset
18   Intel Reserved | Operand Selector

Figure 4-12. Real Mode FPU Instruction and Data Pointer Image in Memory, 32-Bit Format
(32-Bit Real Address Mode Format)
0    Intel Reserved | Control Word
4    Intel Reserved | Status Word
8    Intel Reserved | Tag Word
C    Intel Reserved | Instruction Pointer 15..0
10   0000 | Instruction Pointer 31..16 | 0 | OPCODE 10..0
14   Intel Reserved | Operand Pointer 15..0
18   0000 | Operand Pointer 31..16 | 0000 00000000

Figure 4-13. Protected Mode FPU Instruction and Data Pointer Image in Memory, 16-Bit Format
(16-Bit Protected Mode Format)
0    Control Word
2    Status Word
4    Tag Word
6    IP Offset
8    CS Selector
A    Operand Offset
C    Operand Selector

Figure 4-14. Real Mode FPU Instruction and Data Pointer Image in Memory, 16-Bit Format
(16-Bit Real Address Mode and Virtual-8086 Mode Format)
0    Control Word
2    Status Word
4    Tag Word
6    Instruction Pointer 15..0
8    IP 19..16 | 0 | OPCODE 10..0
A    Operand Pointer 15..0
C    DP 19..16 | 0000 00000000

Figure 4-15. FPU Control Word
(Figure not reproduced. Fields: reserved bits in the high-order positions; rounding control; precision control; reserved bits; and the exception masks: precision, underflow, overflow, zero divide, denormalized operand, invalid operation. One reserved bit is marked: "0" after RESET or FINIT; changeable upon loading the control word (CW). Programs must ignore this bit.)

Precision Control
00 - 24 bits (single precision)
01 - (reserved)
10 - 53 bits (double precision)
11 - 64 bits (extended precision)

Rounding Control
00 - Round to nearest or even
01 - Round down (toward -∞)
10 - Round up (toward +∞)
11 - Chop (truncate toward zero)

NOTE:
See section 4.2.7, "Compatibility," for RESERVED bits.

4.2.4.5 FPU Control Word
The FPU provides several processing options that are selected by loading a control word from memory into the control register. Figure 4-15 shows the format and encoding of fields in the control word.

The low-order byte of the FPU control word configures the FPU error and exception masking. Bits 0-5 of the control word contain individual masks for each of the six exceptions that the FPU recognizes.

The high-order byte of the control word configures the FPU operating mode, including precision and rounding.

RC (Rounding Control, bits 10-11)
RC bits provide for directed rounding and true chop, as well as the unbiased round to nearest even mode specified in the IEEE standard.
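The control word fields can be pulled apart with a few shifts and masks. The following C sketch uses the field positions stated in this section (exception masks in bits 0-5, PC in bits 8-9, RC in bits 10-11) and the encodings listed with Figure 4-15; the function and enumerator names are hypothetical, and the value 0x037F in the note below is just a sample control word image.

```c
#include <stdint.h>

/* Rounding control encodings from Figure 4-15; names are illustrative. */
enum fpu_rounding { RC_NEAREST = 0, RC_DOWN = 1, RC_UP = 2, RC_CHOP = 3 };

/* Significand precision in bits selected by PC (bits 8-9);
 * returns 0 for the reserved encoding 01. */
unsigned fpu_precision_bits(uint16_t cw)
{
    static const unsigned bits[4] = { 24, 0, 53, 64 };
    return bits[(cw >> 8) & 3u];
}

/* Rounding mode selected by RC (bits 10-11). */
enum fpu_rounding fpu_rounding_mode(uint16_t cw)
{
    return (enum fpu_rounding)((cw >> 10) & 3u);
}

/* The six exception mask bits (bits 0-5). */
uint16_t fpu_exception_masks(uint16_t cw)
{
    return (uint16_t)(cw & 0x3Fu);
}

/* Return a copy of cw with only the RC field replaced. */
uint16_t fpu_set_rounding(uint16_t cw, enum fpu_rounding rc)
{
    return (uint16_t)((cw & ~0x0C00u) | ((unsigned)rc << 10));
}
```

For the sample value 0x037F, these helpers report 64-bit precision, round to nearest, and all six exceptions masked; fpu_set_rounding(0x037F, RC_CHOP) gives 0x0F7F.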
Rounding control affects only those instructions that perform rounding at the end of the operation (and thus can generate a precision exception); namely, FST, FSTP, FIST, all arithmetic instructions (except FPREM, FPREM1, FXTRACT, FABS and FCHS), and all transcendental instructions.

PC (Precision Control, bits 8-9)
PC bits can be used to set the FPU internal operating precision of the significand at less than the default of 64 bits (extended precision). This can be useful in providing compatibility with early generation arithmetic processors of smaller precision. PC affects only the instructions ADD, SUB, DIV, MUL, and SQRT. For all other instructions, either the precision is determined by the opcode or extended precision is used.

4.2.5 DEBUG AND TEST REGISTERS

4.2.5.1 Debug Registers
The six programmer accessible debug registers (Figure 4-16) provide on-chip support for debugging. Debug registers DR0-3 specify the four linear breakpoints. The Debug Control Register, DR7, is used to set the breakpoints and the Debug Status Register, DR6, displays the current state of the breakpoints. The use of the Debug registers is described in section 12, "Debugging Support."

Figure 4-16. Debug and Test Registers

Debug Registers
DR0   Linear Breakpoint Address 0
DR1   Linear Breakpoint Address 1
DR2   Linear Breakpoint Address 2
DR3   Linear Breakpoint Address 3
DR4   Intel Reserved, Do Not Define
DR5   Intel Reserved, Do Not Define
DR6   Breakpoint Status
DR7   Breakpoint Control

Test Registers
TR3   Cache Test Data
TR4   Cache Test Status
TR5   Cache Test Control
TR6   TLB Test Control
TR7   TLB Test Status

TLB = Translation Lookaside Buffer

4.2.5.2 Test Registers
The Intel486 processor contains five test registers. The test registers are shown in Figure 4-16. TR6 and TR7 are used to control the testing of the translation lookaside buffer. TR3, TR4 and TR5 are used for testing the on-chip cache.
The use of the test registers is discussed in section 11, "Testability."

4.2.6 REGISTER ACCESSIBILITY
There are a few differences regarding the accessibility of the registers in Real and Protected Mode. Table 4-12 summarizes these differences. (See section 6, "Protected Mode Architecture.")

4.2.6.1 FPU Register Usage
In addition to the differences listed in Table 4-12, Table 4-13 summarizes the differences for the on-chip FPU.

Table 4-12. Register Usage

                     Use in Real Mode   Use in Protected Mode   Use in Virtual 8086 Mode
Register             Load    Store      Load       Store        Load       Store
General Registers    Yes     Yes        Yes        Yes          Yes        Yes
Segment Register     Yes     Yes        Yes        Yes          Yes        Yes
Flag Register        Yes     Yes        Yes        Yes          IOPL(1)    IOPL(1)
Control Registers    Yes     Yes        PL = 0(2)  PL = 0       No         Yes
GDTR                 Yes     Yes        PL = 0     Yes          No         Yes
IDTR                 Yes     Yes        PL = 0     Yes          No         Yes
LDTR                 No      No         PL = 0     Yes          No         No
TR                   No      No         PL = 0     Yes          No         No
Debug Registers      Yes     Yes        PL = 0     PL = 0       No         No
Test Registers       Yes     Yes        PL = 0     PL = 0       No         No

NOTES:
1. IOPL: The PUSHF and POPF instructions are made I/O Privilege Level sensitive in Virtual 8086 Mode.
2. PL = 0: The registers can be accessed only when the current privilege level is zero.

Table 4-13. FPU Register Usage Differences

                          Use in Real Mode   Use in Protected Mode   Use in Virtual 8086 Mode
Register                  Load    Store      Load    Store           Load    Store
FPU Data Registers        Yes     Yes        Yes     Yes             Yes     Yes
FPU Control Registers     Yes     Yes        Yes     Yes             Yes     Yes
FPU Status Registers      Yes     Yes        Yes     Yes             Yes     Yes
FPU Instruction Pointer   Yes     Yes        Yes     Yes             Yes     Yes
FPU Data Pointer          Yes     Yes        Yes     Yes             Yes     Yes

4.2.7 COMPATIBILITY

VERY IMPORTANT NOTE: COMPATIBILITY WITH FUTURE PROCESSORS

In the preceding register descriptions, note certain Intel486 processor register bits are Intel reserved. When reserved bits are called out, treat them as fully undefined. This is essential for your software compatibility with future processors! Follow the guidelines below:

1. Do not depend on the states of any undefined bits when testing the values of defined register bits. Mask them out when testing.
2. Do not depend on the states of any undefined bits when storing them to memory or another register.
3. Do not depend on the ability to retain information written into any undefined bits.
4. When loading registers, always load the undefined bits as zeros.
5. However, registers that have been previously stored may be reloaded without masking.

Depending upon the values of undefined register bits will make your software dependent upon the unspecified Intel486 processor handling of these bits. Depending on undefined values risks making your software incompatible with future processors that define usages for the Intel486 processor-undefined bits. AVOID ANY SOFTWARE DEPENDENCE UPON THE STATE OF UNDEFINED INTEL486 PROCESSOR REGISTER BITS.

4.3 Instruction Set
The Intel486 processor instruction set can be divided into the following categories of operations:

• Data Transfer
• Arithmetic
• Shift/Rotate
• String Manipulation
• Bit Manipulation
• Control Transfer
• High Level Language Support
• Operating System Support
• Processor Control

The Intel486 processor instructions are listed in section 13, "Instruction Set Summary."

All Intel486 processor instructions operate on either 0, 1, 2 or 3 operands, where an operand resides in a register, in the instruction itself or in memory. Most zero operand instructions (e.g., CLI, STI) take only one byte. One operand instructions generally are two bytes long. The average instruction is 3.2 bytes long. Because the Intel486 processor has a 32-byte instruction queue, an average of 10 instructions will be prefetched. The use of two operands permits the following types of common instructions:

• Register to Register
• Memory to Register
• Memory to Memory
• Immediate to Register
• Register to Memory
• Immediate to Memory

The operands can be either 8, 16, or 32 bits long.
As a general rule, when executing 32-bit code, operands are 8 or 32 bits; when executing existing 80286 or 8086 processor code (16-bit code), operands are 8 or 16 bits. Prefixes that override the default length of the operands can be added to all instructions (i.e., use 32-bit operands for 16-bit code, or 16-bit operands for 32-bit code).

4.3.1 FLOATING POINT INSTRUCTIONS

In addition to the instructions listed above, the Intel486 DX, IntelDX2, and IntelDX4 processors have the following floating point instructions. Note that all floating point unit instruction mnemonics begin with an F.

• Floating Point
• Floating Point Control

4.4 Memory Organization

Memory on the Intel486 processor is divided up into 8-bit quantities (bytes), 16-bit quantities (words), and 32-bit quantities (dwords). Words are stored in two consecutive bytes in memory with the low-order byte at the lowest address and the high-order byte at the high address. Dwords are stored in four consecutive bytes in memory with the low-order byte at the lowest address and the high-order byte at the highest address. The address of a word or dword is the byte address of the low-order byte.

In addition to these basic data types, the Intel486 processor supports two larger units of memory: pages and segments. Memory can be divided up into one or more variable-length segments, which can be swapped to disk or shared between programs. Memory can also be organized into one or more 4-Kbyte pages. Segmentation and paging can be combined, gaining the advantages of both systems. The Intel486 processor supports both pages and segments in order to provide maximum flexibility to the system designer. Segmentation and paging are complementary. Segmentation is useful for organizing memory in logical modules, and as such is a tool for the application programmer, while pages are useful for the system programmer for managing the physical memory of a system.
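The byte ordering of words and dwords described in section 4.4 can be illustrated with a short C model. This is only a sketch: the function names and the byte array standing in for memory are assumptions made for illustration and are not part of any Intel software.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit dword in four consecutive bytes, with the low-order
   byte at the lowest address (hypothetical helper). */
void store_dword_le(uint8_t *mem, size_t addr, uint32_t value)
{
    assert(mem != NULL);
    for (size_t i = 0; i < 4; i++) {
        mem[addr + i] = (uint8_t)(value >> (8 * i));
    }
}

/* Read a 16-bit word; its address is the address of its low-order byte. */
uint16_t load_word_le(const uint8_t *mem, size_t addr)
{
    assert(mem != NULL);
    return (uint16_t)(mem[addr] | (mem[addr + 1] << 8));
}

/* Read a 32-bit dword that was stored low-order byte first. */
uint32_t load_dword_le(const uint8_t *mem, size_t addr)
{
    assert(mem != NULL);
    uint32_t value = 0;
    for (size_t i = 0; i < 4; i++) {
        value |= (uint32_t)mem[addr + i] << (8 * i);
    }
    return value;
}
```

In this model, storing 12345678H at address 2 places 78H at address 2 and 12H at address 5, and reading a word at address 2 yields 5678H.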
4.4.1 ADDRESS SPACES

The Intel486 processor has three distinct address spaces: logical, linear, and physical. A logical address (also known as a virtual address) consists of a selector and an offset. A selector is the contents of a segment register. An offset is formed by summing all of the addressing components (BASE, INDEX, DISPLACEMENT) discussed in section 4.6.3, "32-Bit Memory Addressing Modes," into an effective address. Because each task on the Intel486 processor has a maximum of 16K (2^14 - 1) selectors, and offsets can be 4 Gbytes (2^32 bytes), this gives a total of 2^46 bytes, or 64 terabytes, of logical address space per task. The programmer sees this virtual address space.

The segmentation unit translates the logical address space into a 32-bit linear address space. If the paging unit is not enabled then the 32-bit linear address corresponds to the physical address. The paging unit translates the linear address space into the physical address space. The physical address is what appears on the address pins.

The primary difference between Real Mode and Protected Mode is how the segmentation unit performs the translation of the logical address into the linear address. In Real Mode, the segmentation unit shifts the selector left four bits and adds the result to the offset to form the linear address. In Protected Mode, every selector has a linear base address associated with it. The linear base address is stored in one of two operating system tables (i.e., the Local Descriptor Table or Global Descriptor Table). The selector's linear base address is added to the offset to form the final linear address.

Figure 4-17 shows the relationship between the various address spaces.

[Figure 4-17. Address Translation]

4.4.2 SEGMENT REGISTER USAGE

The main data structure used to organize memory is the segment. On the Intel486 processor, segments are variable sized blocks of linear addresses which have certain attributes associated with them. There are two main types of segments: code and data. The segments are of variable size and can be as small as 1 byte or as large as 4 Gbytes (2^32 bytes).

In order to provide compact instruction encoding, and increase Intel486 processor performance, instructions do not need to explicitly specify which segment register is used. A default segment register is automatically chosen according to the rules of Table 4-14. In general, data references use the selector contained in the DS register, stack references use the SS register and instruction fetches use the CS register. The contents of the Instruction Pointer provide the offset. Special segment override prefixes allow the explicit use of a given segment register, and override the implicit rules listed in Table 4-14. The override prefixes also allow the use of the ES, FS and GS segment registers.

There are no restrictions regarding the overlapping of the base addresses of any segments. Thus, all 6 segments could have the base address set to zero and create a system with a 4-Gbyte linear address space. This creates a system where the virtual address space is the same as the linear address space. Further details of segmentation are discussed in section 6.0, "Protected Mode Architecture."

4.5 I/O Space

The Intel486 processor has two distinct physical address spaces: Memory and I/O. Generally, peripherals are placed in I/O space although the Intel486 processor also supports memory-mapped peripherals. The I/O space consists of 64 Kbytes. It can be divided into 64K 8-bit ports, 32K 16-bit ports, or 16K 32-bit ports, or any combination of ports which add up to no more than 64 Kbytes.
The 64K I/O address space refers to physical memory rather than linear address, because I/O instructions do not go through the segmentation or paging hardware. The M/IO# pin acts as an additional address line, thus allowing the system designer to easily determine which address space the processor is accessing.

Table 4-14. Segment Register Selection Rules

Type of Memory Reference | Implied (Default) Segment Use | Segment Override Prefixes Possible
Code Fetch | CS | None
Destination of PUSH, PUSHF, INT, CALL, PUSHA Instructions | SS | None
Source of POP, POPA, POPF, IRET, RET Instructions | SS | None
Destination of STOS, MOVS, REP STOS, REP MOVS Instructions (DI is Base Register) | ES | None
Other Data References, with Effective Address using Base Register of: | | All
  [EAX] | DS |
  [EBX] | DS |
  [ECX] | DS |
  [EDX] | DS |
  [ESI] | DS |
  [EDI] | DS |
  [EBP] | SS |
  [ESP] | SS |

The I/O ports are accessed via the IN and OUT I/O instructions, with the port address supplied as an immediate 8-bit constant in the instruction or in the DX register. All 8- and 16-bit port addresses are zero extended on the upper address lines. The I/O instructions cause the M/IO# pin to be driven low.

I/O port addresses 00F8H through 00FFH are reserved for use by Intel.

I/O instruction code is cacheable. I/O data is not cacheable. I/O transfers (data or code) can be bursted.

4.6 Addressing Modes

4.6.1 ADDRESSING MODES OVERVIEW

The Intel486 processor provides a total of 11 addressing modes for instructions to specify operands. The addressing modes are optimized to allow the efficient execution of high-level languages such as C and FORTRAN, and they cover the vast majority of data references needed by high-level languages.

4.6.2 REGISTER AND IMMEDIATE MODES

The following two addressing modes provide for instructions that operate on register or immediate operands:

• Register Operand Mode: The operand is located in one of the 8-, 16- or 32-bit general registers.
• Immediate Operand Mode: The operand is included in the instruction as part of the opcode.

4.6.3 32-BIT MEMORY ADDRESSING MODES

The remaining modes provide a mechanism for specifying the effective address of an operand. The linear address consists of two components: the segment base address and an effective address. The effective address is calculated by using combinations of the following four address elements:

• DISPLACEMENT: An 8- or 32-bit immediate value, following the instruction.
• BASE: The contents of any general purpose register. The base registers are generally used by compilers to point to the start of the local variable area.
• INDEX: The contents of any general purpose register except for ESP. The index registers are used to access the elements of an array, or a string of characters.
• SCALE: The index register's value can be multiplied by a scale factor, either 1, 2, 4 or 8. Scaled index mode is especially useful for accessing arrays or structures.

Combinations of these 4 components make up the 9 additional addressing modes. There is no performance penalty for using any of these addressing combinations, because the effective address calculation is pipelined with the execution of other instructions. The one exception is the simultaneous use of Base and Index components, which requires one additional clock.

As shown in Figure 4-18, the effective address (EA) of an operand is calculated according to the following formula:

EA = Base Reg + (Index Reg * Scaling) + Displacement

Direct Mode: The operand's offset is contained as part of the instruction as an 8-, 16- or 32-bit displacement.
Example: INC Word PTR [500]

Register Indirect Mode: A BASE register contains the address of the operand.
Example: MOV [ECX], EDX

Based Mode: A BASE register's contents is added to a DISPLACEMENT to form the operand's offset.
Example: MOV ECX, [EAX+24]

Index Mode: An INDEX register's contents is added to a DISPLACEMENT to form the operand's offset.
Example: ADD EAX, TABLE[ESI]

Scaled Index Mode: An INDEX register's contents is multiplied by a scaling factor, and the result is added to a DISPLACEMENT to form the operand's offset.
Example: IMUL EBX, TABLE[ESI*4], 7

Based Index Mode: The contents of a BASE register is added to the contents of an INDEX register to form the effective address of an operand.
Example: MOV EAX, [ESI][EBX]

Based Scaled Index Mode: The contents of an INDEX register is multiplied by a SCALING factor and the result is added to the contents of a BASE register to obtain the operand's offset.
Example: MOV ECX, [EDX*8][EAX]

Based Index Mode with Displacement: The contents of an INDEX register, the contents of a BASE register and a DISPLACEMENT are all summed together to form the operand offset.
Example: ADD EDX, [ESI][EBP+00FFFFF0H]

Based Scaled Index Mode with Displacement: The contents of an INDEX register are multiplied by a SCALING factor, and the result is added to the contents of a BASE register and a DISPLACEMENT to form the operand's offset.
Example: MOV EAX, LOCALTABLE[EDI*4][EBP+80]

[Figure 4-18. Addressing Mode Calculations]

4.6.4 DIFFERENCES BETWEEN 16- AND 32-BIT ADDRESSES

In order to provide software compatibility with 80286 and 8086 processors, the Intel486 processor can execute 16-bit instructions in Real and Protected Modes. The processor determines the size of the instructions it is executing by examining the D bit in the CS segment Descriptor. If the D bit is 0 then all operand lengths and effective addresses are assumed to be 16 bits long. If the D bit is 1 then the default length for operands and addresses is 32 bits.
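For the 32-bit case (D bit = 1), the effective address formula of section 4.6.3 and the addition of the segment base can be sketched in C as follows. This is a minimal model, not Intel code: the function names are hypothetical, and the wrap-around to 32 bits simply reflects the use of 32-bit unsigned arithmetic in the model.

```c
#include <assert.h>
#include <stdint.h>

/* EA = Base Reg + (Index Reg * Scaling) + Displacement, as a 32-bit value.
   Pass 0 for a component that the addressing mode does not use.
   The scale factor must be 1, 2, 4 or 8 (hypothetical helper). */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, uint32_t displacement)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + (index * scale) + displacement;
}

/* Linear address = segment base address + effective address. */
uint32_t linear_address(uint32_t segment_base, uint32_t ea)
{
    return segment_base + ea;
}
```

For instance, the Based Scaled Index example MOV ECX, [EDX*8][EAX] with EAX = 1000H and EDX = 3 corresponds to effective_address(0x1000, 3, 8, 0), which gives 1018H in this model.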
In Real Mode the default size for operands and addresses is 16 bits.

Regardless of the default precision of the operands or addresses, the Intel486 processor is able to execute either 16- or 32-bit instructions. This is specified via the use of override prefixes. Two prefixes, the Operand Size Prefix and the Address Length Prefix, override the value of the D bit on an individual instruction basis. These prefixes are automatically added by Intel assemblers.

Example: The Intel486 processor is executing in Real Mode and the programmer needs to access the EAX registers. The assembler code for this might be MOV EAX, 32-bit MEMORYOP. The ASM486 Macro Assembler automatically determines that an Operand Size Prefix is needed and generates it.

Example: The D bit is 0, and the programmer wishes to use Scaled Index addressing mode to access an array. The Address Length Prefix allows the use of MOV DX, TABLE[ESI*2]. The assembler uses an Address Length Prefix because, with D = 0, the default addressing mode is 16 bits.

Example: The D bit is 1, and the program wants to store a 16-bit quantity. The Operand Length Prefix is used to specify only a 16-bit value: MOV MEM16, DX.

The Operand Length and Address Length Prefixes can be applied separately or in combination to any instruction. The Address Length Prefix does not allow addresses over 64 Kbytes to be accessed in Real Mode. A memory address which exceeds FFFFH will result in a General Protection Fault. An Address Length Prefix only allows the use of the additional Intel486 processor addressing modes.

When executing 32-bit code, the Intel486 processor uses either 8- or 32-bit displacements, and any register can be used as a base or index register. When executing 16-bit code, the displacements are either 8 or 16 bits, and the base and index registers conform to the 80286 processor model. Table 4-15 illustrates the differences.

Table 4-15. BASE and INDEX Registers for 16- and 32-Bit Addresses

 | 16-Bit Addressing | 32-Bit Addressing
BASE REGISTER | BX, BP | Any 32-bit GP Register
INDEX REGISTER | SI, DI | Any 32-bit GP Register Except ESP
SCALE FACTOR | none | 1, 2, 4, 8
DISPLACEMENT | 0, 8, 16 bits | 0, 8, 32 bits

4.7 Data Formats

4.7.1 DATA TYPES

The Intel486 processor can support a wide variety of data types. In the following descriptions, the processor consists of the base architecture registers.

4.7.1.1 Unsigned Data Types

Byte: Unsigned 8-bit quantity
Word: Unsigned 16-bit quantity
Dword: Unsigned 32-bit quantity

The least significant bit (LSB) in a byte is bit 0, and the most significant bit is bit 7.

4.7.1.2 Signed Data Types

All signed data types assume 2's complement notation. The signed data types contain two fields, a sign bit and a magnitude. The sign bit is the most significant bit (MSB). The number is negative if the sign bit is 1. If the sign bit is 0, the number is positive. The magnitude field consists of the remaining bits in the number. (Refer to Figure 4-19.)

8-bit Integer: Signed 8-bit quantity
16-bit Integer: Signed 16-bit quantity
32-bit Integer: Signed 32-bit quantity
64-bit Integer: Signed 64-bit quantity

The integer core of the Intel486 processors only supports 8-, 16- and 32-bit integers. (See section 4.7.1.4, "Floating Point Data Types.")

4.7.1.3 BCD Data Types

The Intel486 processor supports packed and unpacked binary coded decimal (BCD) data types. A packed BCD data type contains two digits per byte: the lower digit is in bits 0-3 and the upper digit in bits 4-7. An unpacked BCD data type contains 1 digit per byte, stored in bits 0-3.

The Intel486 processor supports 8-bit packed and unpacked BCD data types. (Refer to Figure 4-19.)

4.7.1.4 Floating Point Data Types

In addition to the base registers, the Intel486 DX, IntelDX2, and IntelDX4 processors' on-chip floating point unit consists of the floating point registers.
The floating point unit data types contain three fields: sign, significand and exponent. The sign field is one bit and is the MSB of the floating point number. The number is negative if the sign bit is 1. If the sign bit is 0, the number is positive. The significand gives the significant bits of the number. The exponent field contains the power of 2 needed to scale the significand. (Refer to Figure 4-19.)

Only the FPU supports floating point data types.

Single Precision Real: 23-bit significand and 8-bit exponent. 32 bits total.
Double Precision Real: 52-bit significand and 11-bit exponent. 64 bits total.
Extended Precision Real: 64-bit significand and 15-bit exponent. 80 bits total.

Floating Point Signed Data Types

The on-chip FPU only supports 16-, 32- and 64-bit integers.

Floating Point Unsigned Data Types

The on-chip FPU does not support unsigned data types. (Refer to Figure 4-19.)

Floating Point BCD Data Types

The on-chip FPU only supports 80-bit packed BCD data types.

4.7.1.5 String Data Types

A string data type is a contiguous sequence of bits, bytes, words or dwords. A string may contain between 1 byte and 4 Gbytes. (Refer to Figure 4-20.)

String data types are only supported by the CPU section of the Intel486 processor.

Byte String: Contiguous sequence of bytes.
Word String: Contiguous sequence of words.
Dword String: Contiguous sequence of dwords.
Bit String: A set of contiguous bits. In the Intel486 processor bit strings can be up to 4 gigabits long.

4.7.1.6 ASCII Data Types

The Intel486 processor supports ASCII (American Standard Code for Information Interchange) strings and can perform arithmetic operations (such as addition and division) on ASCII data. The Intel486 processor can only operate on ASCII data. (Refer to Figure 4-20.)

[Figure 4-19. Intel486 Processor Data Types. For each data format, the figure indicates whether it is supported by the base registers or by the FPU, and gives its range, precision and bit layout.]

[Figure 4-20. String and ASCII Data Types]

[Figure 4-21. Pointer Data Types]

4.7.1.7 Pointer Data Types

A pointer data type contains a value that gives the address of a piece of data. Intel486 processors support the following two types of pointers (see Figure 4-21):

• 48-bit Pointer: 16-bit selector and 32-bit offset
• 32-bit Pointer: 32-bit offset

4.7.2 LITTLE ENDIAN vs. BIG ENDIAN DATA FORMATS

The Intel486 processors, as well as all other members of the Intel architecture, use the "little-endian" method for storing data types that are larger than one byte. Words are stored in two consecutive bytes in memory with the low-order byte at the lowest address and the high-order byte at the high address. Dwords are stored in four consecutive bytes in memory with the low-order byte at the lowest address and the high-order byte at the highest address. The address of a word or dword data item is the byte address of the low-order byte.

Figure 4-22 illustrates the differences between the big-endian and little-endian formats for dwords. The 32 bits of data are shown with the low-order bit numbered bit 0 and the high-order bit numbered 31. Big-endian data is stored with the high-order bits at the lowest addressed byte. Little-endian data is stored with the high-order bits in the highest addressed byte.

The Intel486 processor has the following two instructions that can convert 16- or 32-bit data between the two byte orderings:

• BSWAP (byte swap) handles 4-byte values
• XCHG (exchange) handles 2-byte values

[Figure 4-22. Big vs. Little Endian Memory Format. The figure shows a dword at addresses m through m+3 in little-endian and in big-endian memory format.]

4.8 Interrupts

4.8.1 INTERRUPTS AND EXCEPTIONS

Interrupts and exceptions alter the normal program flow in order to handle external events, or to report errors or exceptional conditions. The difference between interrupts and exceptions is that interrupts are used to handle asynchronous external events while exceptions handle instruction faults. Although a program can generate a software interrupt via an INT N instruction, the Intel486 processors treat software interrupts as exceptions.

Hardware interrupts occur as the result of an external event and are classified into two types: maskable or non-maskable. Interrupts are serviced after the execution of the current instruction. After the interrupt handler is finished servicing the interrupt, execution proceeds with the instruction immediately after the interrupted instruction. Sections 4.8.3, "Maskable Interrupt," and 4.8.4, "Non-Maskable Interrupt," discuss the differences between Maskable and Non-Maskable interrupts.

Exceptions are classified as faults, traps, or aborts, depending on the way they are reported, and whether or not restart of the instruction causing the exception is supported. Faults are exceptions that are detected and serviced before the execution of the faulting instruction. A fault would occur in a virtual memory system when the processor referenced a page or a segment that was not present. The operating system would fetch the page or segment from disk, and then the Intel486 processor would restart the instruction. Traps are exceptions that are reported immediately after the execution of the instruction that caused the problem. User defined interrupts are examples of traps. Aborts are exceptions that do not permit the precise location of the instruction causing the exception to be determined. Aborts are used to report severe errors, such as a hardware error or illegal values in system tables.

Thus, when an interrupt service routine has been completed, execution proceeds from the instruction immediately following the interrupted instruction. On the other hand, the return address from an exception fault routine will always point at the instruction causing the exception and include any leading instruction prefixes. Tables 4-16 and 4-17 summarize the possible interrupts for Intel486 processors and show where the return address points.

Intel486 processors can handle up to 256 different interrupts and/or exceptions. In order to service the interrupts, a table with up to 256 interrupt vectors must be defined.
The interrupt vectors are simply pointers to the appropriate interrupt service routine. In Real Mode (see section 5.0, "Real Mode Architecture"), the vectors are 4-byte quantities, a Code Segment plus a 16-bit offset; in Protected Mode, the interrupt vectors are 8-byte quantities, which are put in an Interrupt Descriptor Table. (See section 6.2.3.4, "Interrupt Descriptor Table.") Of the 256 possible interrupts, 32 are reserved for use by Intel; the remaining 224 are free to be used by the system designer.

4.8.2 INTERRUPT PROCESSING

When an interrupt occurs, the following actions happen. First, the current program address and the Flags are saved on the stack to allow resumption of the interrupted program. Next, an 8-bit vector is supplied to the Intel486 processor which identifies the appropriate entry in the interrupt table. The table contains the starting address of the interrupt service routine. Then, the user supplied interrupt service routine is executed. Finally, when an IRET instruction is executed the old Intel486 processor state is restored and program execution resumes at the appropriate instruction.

The 8-bit interrupt vector is supplied to the Intel486 processor in several different ways: exceptions supply the interrupt vector internally; software INT instructions contain or imply the vector; maskable hardware interrupts supply the 8-bit vector via the interrupt acknowledge bus sequence. Non-Maskable hardware interrupts are assigned to interrupt vector 2.

4.8.3 MASKABLE INTERRUPT

Maskable interrupts are the most common way used by the Intel486 processor to respond to asynchronous external hardware events. A hardware interrupt occurs when the INTR is pulled high and the Interrupt Flag bit (IF) is enabled. The Intel486 processor only responds to interrupts between instructions. (REPeat String instructions have an "interrupt window" between memory moves, which allows interrupts during long string moves.) When an interrupt occurs, the Intel486 processor reads an 8-bit vector supplied by the hardware which identifies the source of the interrupt (one of 224 user defined interrupts). The exact nature of the interrupt sequence is discussed in section 10.2.10, "Interrupt Acknowledge."

The IF bit in the EFLAGS register is reset when an interrupt is being serviced. This effectively disables servicing additional interrupts during an interrupt service routine. However, the IF may be set explicitly by the interrupt handler, to allow the nesting of interrupts. When an IRET instruction is executed, the original state of the IF is restored.

Table 4-16. Interrupt Vector Assignments

Function | Interrupt Number | Instruction that Can Cause Exception | Return Address Points to Faulting Instruction | Type
Divide Error | 0 | DIV, IDIV | YES | FAULT
Debug Exception | 1 | Any instruction | YES | TRAP*
NMI Interrupt | 2 | INT 2 or NMI | NO | NMI
One Byte Interrupt | 3 | INT | NO | TRAP
Interrupt on Overflow | 4 | INTO | NO | TRAP
Array Bounds Check | 5 | BOUND | YES | FAULT
Invalid OP-Code | 6 | Any illegal instruction | YES | FAULT
Device Not Available | 7 | ESC, WAIT | YES | FAULT
Double Fault | 8 | Any instruction that can generate an exception | | ABORT
Intel Reserved | 9 | | |
Invalid TSS | 10 | JMP, CALL, IRET, INT | YES | FAULT
Segment Not Present | 11 | Segment Register Instructions | YES | FAULT
Stack Fault | 12 | Stack References | YES | FAULT
General Protection Fault | 13 | Any Memory Reference | YES | FAULT
Page Fault | 14 | Any Memory Access or Code Fetch | YES | FAULT
Intel Reserved | 15 | | |
Alignment Check Interrupt | 17 | Unaligned Memory Access | YES | FAULT
Intel Reserved | 18-31 | | |
Two Byte Interrupt | 0-255 | INT n | NO | TRAP

* Some debug exceptions may report both traps on the previous instruction, and faults on the next instruction.

Table 4-17. FPU Interrupt Vector Assignments

Function | Interrupt Number | Instruction Which Can Cause Exception | Return Address Points to Faulting Instruction | Type
Floating Point Error | 16 | Floating Point, WAIT | YES | FAULT

4.8.4 NON-MASKABLE INTERRUPT

Non-maskable interrupts provide a method of servicing very high priority interrupts. A common example of the use of a non-maskable interrupt (NMI) would be to activate a power failure routine, or of SMI# to activate a power saving mode. When the NMI input is pulled high, it causes an interrupt with an internally supplied vector value of 2. Unlike a normal hardware interrupt, no interrupt acknowledgment sequence is performed for an NMI.

While executing the NMI servicing procedure, the Intel486 processor will not service further NMI requests until an interrupt return (IRET) instruction is executed or the processor is reset (RSM in the case of SMI#). If NMI occurs while currently servicing an NMI, its presence will be saved for servicing after executing the first IRET instruction. The IF bit is cleared at the beginning of an NMI interrupt to inhibit further INTR interrupts.

4.8.5 SOFTWARE INTERRUPTS

A third type of interrupt/exception for the Intel486 processor is the software interrupt. An INT n instruction causes the processor to execute the interrupt service routine pointed to by the nth vector in the interrupt table.

A special case of the two byte software interrupt INT n is the one byte INT 3, or breakpoint interrupt. By inserting this one byte instruction in a program, the user can set breakpoints in the program as a debugging tool.

A final type of software interrupt is the single step interrupt. It is discussed in section 12.2, "Single-Step Trap."

4.8.6 INTERRUPT AND EXCEPTION PRIORITIES

Interrupts are externally-generated events. Maskable Interrupts (on the INTR input) and Non-Maskable Interrupts (on the NMI input or SMI# input) are recognized at instruction boundaries.
When more than one interrupt or external event is recognized at the same instruction boundary, the Intel486 processor invokes the highest priority routine first. (See list below.) If, after the NMI service routine has been invoked, maskable interrupts are still enabled, then the Intel486 processor will invoke the appropriate interrupt service routine.

Priority for Servicing External Events for All Intel486 Processors Except the Write-Back Enhanced IntelDX2 Processor

1. RESET/SRESET
2. FLUSH#
3. SMI#
4. NMI
5. INTR
6. STPCLK#

NOTE: STPCLK# will be recognized while in an interrupt service routine or an SMM handler.

For the Write-Back Enhanced IntelDX2 processor, the priority of servicing external events is modified from the standard Intel486 processor. The list below shows the priority for write-back enhanced mode.

Priority for Servicing External Events for the Write-Back Enhanced IntelDX2 Processor

1. RESET
2. FLUSH#
3. SRESET
4. SMI#
5. NMI
6. INTR
7. STPCLK#

Exceptions are internally-generated events. Exceptions are detected by the Intel486 processor if, in the course of executing an instruction, the Intel486 processor detects a problematic condition. The Intel486 processor then immediately invokes the appropriate exception service routine. The state of the Intel486 processor is such that the instruction causing the exception can be restarted. If the exception service routine has taken care of the problematic condition, the instruction will execute without causing the same exception.

It is possible for a single instruction to generate several exceptions (for example, transferring a single operand could generate two page faults if the operand location spans two "not present" pages). However, only one exception is generated upon each attempt to execute the instruction. Each exception service routine should correct its corresponding exception, and restart the instruction. In this manner, exceptions are serviced until the instruction executes successfully.
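The restart behavior described above, with one exception per attempt and a retry after each service routine, can be modeled with a short C sketch. The page-presence array, the toy address-space size and the function names are assumptions made for illustration; the "service routine" here simply marks the missing page present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* 4-Kbyte pages */
#define NUM_PAGES 16u    /* size of the toy address space (assumption) */

/* Attempt one operand transfer. Returns -1 on success, or the number of
   the first "not present" page touched; only one fault is reported per
   attempt (hypothetical helper). */
int attempt_access(const bool present[NUM_PAGES], uint32_t addr, uint32_t len)
{
    assert(len > 0 && addr < NUM_PAGES * PAGE_SIZE
           && len <= NUM_PAGES * PAGE_SIZE - addr);
    uint32_t first = addr / PAGE_SIZE;
    uint32_t last = (addr + len - 1) / PAGE_SIZE;
    for (uint32_t page = first; page <= last; page++) {
        if (!present[page]) {
            return (int)page;
        }
    }
    return -1;
}

/* Restart the access until it succeeds; returns the number of page
   faults that were serviced along the way. */
unsigned run_with_restart(bool present[NUM_PAGES], uint32_t addr, uint32_t len)
{
    unsigned faults = 0;
    int page;
    while ((page = attempt_access(present, addr, len)) >= 0) {
        present[page] = true;  /* the service routine corrects the condition */
        faults++;
    }
    return faults;
}
```

In this model a dword at address 4094 spans pages 0 and 1, so with both pages initially "not present" the access is attempted three times and two page faults are serviced before it completes.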
2-79 Intel486TM PROCESSOR FAMILY As the Intel486 processor executes instructions, it follows a consistent cycle in checking for exceptions. Consider the case of the Intel486 processor having just completed an instruction. It then performs the checks listed in Table 4-18 before reaching the point where the next instruction is complet- ed. This cycle is repeated as each instruction is executed, and occurs in parallel with instruction decoding and execution. Checking for EM, TS, or FPU error status only occurs for processors with on-chip floating point units. Table 4-18. Sequence of Exception Checking Sequence Description 1 Check for Exception 1 Traps from the instruction just completed (single-step via Trap Flag, or Data Breakpoints set in the Debug Registers). 2 Check for Exception 1 Faults in the next instruction (Instruction Execution Breakpoint set in the Debug Registers for the next instruction). 3 Check for external NMI and INTR. 4 Check for Segmentation Faults that prevented fetching the entire next instruction (exceptions 11 or 13). 5 Check for Page Faults that prevented fetching the entire next instruction (exception 14). 6 Check for Faults decoding the next instruction (exception 6 if illegal opcode; exception 6 if in Real Mode or in Virtual 8086 Mode and attempting to execute an instruction for Protected Mode only (see section 6.5.4, "Protection and 1/0 Permission Bitmap"); or exception 13 if instruction is longer than 15 bytes, or privilege violation in Protected Mode (Le., not at 10PL or at CPL=O). 7 If WAIT opcode, check if TS= 1 and MP= 1 (exception 7 if both are 1). 8 If opcode for Floating Point Unit, check if EM = 1 or TS = 1 (exception 7 if either are 1). 9 If opcode for Floating Point Unit (FPU), check FPU error status (exception 16 if error status is asserted). 10 Check in the following order for each memory reference required by the instruction: a. 
Check for Segmentation Faults that prevent transferring the entire memory quantity (exceptions 11, 12, 13).
    b. Check for Page Faults that prevent transferring the entire memory quantity (exception 14).

NOTE: The order stated supports the concept of the paging mechanism being "underneath" the segmentation mechanism. Therefore, for any given code or data reference in memory, segmentation exceptions are generated before paging exceptions are generated.

4.8.7 INSTRUCTION RESTART

The Intel486 processor fully supports restarting all instructions after faults. If an exception is detected in the instruction to be executed (exception categories 4 through 10 in Table 4-18), the Intel486 processor invokes the appropriate exception service routine. The Intel486 processor is in a state that permits restart of the instruction, for all cases except the following: an instruction causes a task switch to a task whose Task State Segment is partially "not present." (An entirely "not present" TSS is restartable.) Partially present TSSs can be avoided either by keeping the TSSs of such tasks present in memory, or by aligning TSS segments to reside entirely within a single 4K page (for TSS segments of 4 Kbytes or less).

NOTE: Such cases are easily avoided by proper design of the operating system.

4.8.8 DOUBLE FAULT

A Double Fault (exception 8) results when the Intel486 processor attempts to invoke an exception service routine for the segment exceptions (10, 11, 12 or 13), but in the process of doing so, detects an exception other than a Page Fault (exception 14).

A Double Fault (exception 8) will also be generated when the Intel486 processor attempts to invoke the Page Fault (exception 14) service routine, and detects an exception other than a second Page Fault. In any functional system, the entire Page Fault service routine must remain "present" in memory.
When a Double Fault occurs, the Intel486 processor invokes the exception service routine for exception 8.

4.8.9 FLOATING POINT INTERRUPT VECTORS

Several interrupt vectors of the Intel486 DX, IntelDX2, and IntelDX4 processors are used to report exceptional conditions while executing numeric programs in either real or protected mode. Table 4-19 shows these interrupts and their causes.

Table 4-19. Interrupt Vectors Used by FPU

Interrupt Number  Cause of Interrupt
7   A Floating Point instruction was encountered when EM or TS of the Intel486 DX, IntelDX2, and IntelDX4 processor control register zero (CR0) was set. EM = 1 indicates that software emulation of the instruction is required. When TS is set, either a Floating Point or WAIT instruction causes interrupt 7. This indicates that the current FPU context may not belong to the current task.
13  The first word or doubleword of a numeric operand is not entirely within the limit of its segment. The return address pushed onto the stack of the exception handler points at the Floating Point instruction that caused the exception, including any prefixes. The FPU has not executed this instruction; the instruction pointer and data pointer register refer to a previous, correctly executed instruction.
16  The previous numerics instruction caused an unmasked exception. The address of the faulty instruction and the address of its operand are stored in the instruction pointer and data pointer registers. Only Floating Point and WAIT instructions can cause this interrupt. The return address that the Intel486 DX, IntelDX2, and IntelDX4 processors push onto the stack of the exception handler points to a WAIT or Floating Point instruction (including prefixes). This instruction can be restarted after clearing the exception condition in the FPU. The FNINIT, FNCLEX, FNSTSW, FNSTENV, and FNSAVE instructions cannot cause this interrupt.
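The interrupt 7 conditions can be restated as a small predicate. The sketch below follows the wording of steps 7 and 8 of Table 4-18 (WAIT requires both TS = 1 and MP = 1; a Floating Point opcode requires EM = 1 or TS = 1). The function name and the "wait" / "fpu" / "other" labels are hypothetical, chosen only for this illustration.

```python
def raises_exception_7(opcode_kind, em, ts, mp):
    """Return True if the CR0 bits would cause exception 7 (sketch).

    opcode_kind is a hypothetical label: "wait", "fpu", or "other".
    em, ts, mp are the CR0 bit values (0 or 1).
    """
    if opcode_kind == "wait":
        # Table 4-18, step 7: exception 7 only if TS and MP are both 1.
        return ts == 1 and mp == 1
    if opcode_kind == "fpu":
        # Table 4-18, step 8: exception 7 if EM or TS is 1.
        return em == 1 or ts == 1
    # Other opcodes are not checked against EM, TS, or MP.
    return False


if __name__ == "__main__":
    print(raises_exception_7("wait", em=0, ts=1, mp=1))
    print(raises_exception_7("fpu", em=0, ts=0, mp=1))
```

This models only the decision, not the handler: what the exception 7 routine does (emulate the instruction, or switch FPU context) is up to the operating system.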
5.0 REAL MODE ARCHITECTURE

5.1 Introduction

When the Intel486 processor is reset or powered up, it is initialized in Real Mode. Real Mode has the same base architecture as the 8086 processor, except that it allows access to the 32-bit register set of the Intel486 processor. The Intel486 processor addressing mechanism, memory size and interrupt handling are identical to those of Real Mode on the 80286 processor.

All of the Intel486 processor instructions are available in Real Mode (except those instructions listed in section 6.5.4, "Protection and I/O Permission Bitmap"). The default operand size in Real Mode is 16 bits, as in the 8086 processor. In order to use the 32-bit registers and addressing modes, override prefixes must be used. Also, the segment size on the Intel486 processor in Real Mode is 64 Kbytes, forcing 32-bit effective addresses to have a value no greater than 0000FFFFH. The primary purpose of Real Mode is to enable Protected Mode operation.

The LOCK prefix on the Intel486 processor, even in Real Mode, is more restrictive than on the 80286 processor. This is due to the addition of paging on the Intel486 processor in Protected Mode and Virtual 8086 Mode. Paging makes it impossible to guarantee that repeated string instructions can be LOCKed. The Intel486 processor cannot require that all pages holding the string be physically present in memory. Hence, a Page Fault (exception 14) might have to be taken during the repeated string instruction. Therefore, the LOCK prefix cannot be supported during repeated string instructions.

The LOCK prefix can be used at any privilege level, but only on the instruction forms listed in Table 5-1.

Table 5-1.
Instruction Forms Where LOCK Prefix Is Legal

Opcode                               Operands (Dest, Source)
BIT Test and SET/RESET/COMPLEMENT    Mem, Reg/immed
XCHG                                 Reg, Mem
XCHG                                 Mem, Reg
ADD, OR, ADC, SBB, AND, SUB, XOR     Mem, Reg/immed
NOT, NEG, INC, DEC                   Mem
CMPXCHG, XADD                        Mem, Reg

Table 5-1 lists the only instruction forms where the LOCK prefix is legal on the Intel486 processor.

An exception 6 will be generated if a LOCK prefix is placed before any instruction form or opcode not listed above. The LOCK prefix allows indivisible read/modify/write operations on memory operands using the instructions above. For example, even the ADD Reg, Mem is not LOCKable, because the Mem operand is not the destination (and therefore no memory read/modify/write operation is being performed).

Because, on the Intel486 processor, repeated string instructions are not LOCKable, it is not possible to LOCK the bus for a long period of time. Therefore, the LOCK prefix is not IOPL-sensitive on the Intel486 processor.

5.2 Memory Addressing

In Real Mode the maximum memory size is limited to 1 megabyte. (See Figure 5-1.) Thus, only address lines A2-A19 are active. (Exception: after RESET, address lines A20-A31 are high during CS-relative memory cycles until an intersegment jump or call is executed. See section 9.5, "Reset and Initialization".)

Figure 5-1. Real Address Mode Addressing (the segment base and offset locate the memory operand within the selected segment; the maximum limit is fixed at 64K in Real Mode)

Because paging is not allowed in Real Mode, the linear addresses are the same as the physical addresses. Physical addresses are formed in Real Mode by adding the contents of the appropriate segment register, shifted left by four bits, to an effective address. This addition results in a physical address from 00000000H to 0010FFEFH. This is compatible with 80286 Real Mode. Because segment registers are shifted left by 4 bits, Real Mode segments always start on 16-byte boundaries.
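The Real Mode address calculation described above (segment register shifted left by four bits, plus a 16-bit effective address) can be checked with a few lines of Python. This is a minimal sketch; the function name and the range checks are illustrative assumptions rather than anything defined by the processor.

```python
def real_mode_physical_address(segment, offset):
    """Form a Real Mode physical address as (segment << 4) + offset.

    Both arguments must fit in 16 bits; the helper name is hypothetical.
    """
    if not 0 <= segment <= 0xFFFF:
        raise ValueError("segment must be a 16-bit value")
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("offset must be a 16-bit value")
    return (segment << 4) + offset


if __name__ == "__main__":
    # Largest possible result: segment FFFFH, offset FFFFH.
    print(hex(real_mode_physical_address(0xFFFF, 0xFFFF)))
    # Segments start on 16-byte boundaries: the low four bits are zero.
    print(hex(real_mode_physical_address(0x1234, 0x0000)))
```

The first call reproduces the upper bound 0010FFEFH quoted in the text, and the second shows why every segment base is a multiple of 16.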
All segments in Real Mode are exactly 64 Kbytes long, and may be read, written, or executed. The Intel486 processor will generate an exception 13 if a data operand or instruction fetch occurs past the end of a segment (i.e., if an operand has an offset greater than FFFFH, for example, a word with a low byte at FFFFH and the high byte at 0000H).

Segments may be overlapped in Real Mode. Thus, if a particular segment does not use all 64 Kbytes, another segment can be overlaid on top of the unused portion of the previous segment. This allows the programmer to minimize the amount of physical memory needed for a program.

5.3 Reserved Locations

There are two fixed areas in memory which are reserved in Real Address Mode: the system initialization area and the interrupt table area. Locations 00000H through 003FFH are reserved for interrupt vectors. Each one of the 256 possible interrupts has a 4-byte jump vector reserved for it. Locations FFFFFFF0H through FFFFFFFFH are reserved for system initialization.

5.4 Interrupts

Many of the exceptions shown in Table 4-16 and discussed in section 4.8.3, "Maskable Interrupt," are not applicable to Real Mode operation, in particular exceptions 10, 11, 14, and 17, which do not happen in Real Mode. Other exceptions have slightly different meanings in Real Mode; Table 5-2 identifies these exceptions.

5.5 Shutdown and Halt

The HLT instruction stops program execution and prevents the Intel486 processor from using the local bus until restarted. Either NMI, INTR with interrupts enabled (IF = 1), or RESET will force the Intel486 processor out of halt. If interrupted, the saved CS:IP will point to the next instruction after the HLT.

As in the case of Protected Mode, the shutdown will occur when a severe error is detected that prevents further processing.
In Real Mode, shutdown can occur under two conditions, as follows:

• An interrupt or an exception occurs (exceptions 8 or 13) and the interrupt vector is larger than the Interrupt Descriptor Table (i.e., there is not an interrupt handler for the interrupt).
• A CALL, INT or PUSH instruction attempts to wrap around the stack segment when SP is not even (i.e., pushing a value on the stack when SP = 0001H resulting in a stack segment greater than FFFFH).

An NMI input can bring the processor out of shutdown if the Interrupt Descriptor Table limit is large enough to contain the NMI interrupt vector (at least 0017H) and the stack has enough room to contain the vector and flag information (i.e., SP is greater than 0005H). If these conditions are not met, the Intel486 processor is unable to execute the NMI and executes another shutdown cycle. In this case, the Intel486 processor remains in the shutdown and can only exit via the RESET input.

Table 5-2. Exceptions with Different Meanings in Real Mode (see Table 4-17)

Function: Interrupt table limit too small
    Interrupt Number: 8
    Related Instructions: INT vector is not within table limit
    Return Address Location: Before Instruction

Function: CS, DS, ES, FS, GS Segment overrun exception
    Interrupt Number: 13
    Related Instructions: Word memory reference beyond offset = FFFFH. An attempt to execute past the end of CS segment.
    Return Address Location: Before Instruction

Function: SS Segment overrun exception
    Interrupt Number: 12
    Related Instructions: Stack Reference beyond offset = FFFFH
    Return Address Location: Before Instruction

6.0 PROTECTED MODE ARCHITECTURE

The complete capabilities of the Intel486 processor are unlocked when the Intel486 processor operates in Protected Virtual Address Mode (Protected Mode). Protected Mode vastly increases the linear address space to four Gbytes (2^32 bytes) and allows the running of virtual memory programs of almost unlimited size (64 terabytes or 2^46 bytes).
In addition, Protected Mode allows the Intel486 processor to run all of the existing 8086, 80286 and Intel386 processor software, while providing sophisticated memory management and a hardware-assisted protection mechanism. Protected Mode allows the use of additional instructions especially optimized for supporting multitasking operating systems. The base architecture of the Intel486 processor remains the same; the registers, instructions, and addressing modes described in the previous sections are retained. The main difference between Protected Mode and Real Mode from a programmer's view is the increased address space and a different addressing mechanism.

6.1 Addressing Mechanism

Like Real Mode, Protected Mode uses two components to form the logical address: a 16-bit selector is used to determine the linear base address of a segment, and the base address is added to a 32-bit effective address to form a 32-bit linear address. The linear address is then either used as the 32-bit physical address, or, if paging is enabled, the paging mechanism maps the 32-bit linear address into a 32-bit physical address.

The difference between the two modes lies in calculating the base address. In Protected Mode the selector is used to specify an index into an operating system defined table. (See Figure 6-1.) The table contains the 32-bit base address of a given segment. The physical address is formed by adding the base address obtained from the table to the offset.

Paging provides an additional memory management mechanism which operates only in Protected Mode. Paging provides a means of managing the very large segments of the Intel486 processor. As such, paging operates beneath segmentation. The paging mechanism translates the protected linear address which comes from the segmentation unit into a physical address. Figure 6-2 shows the complete Intel486 processor addressing mechanism with paging enabled.
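The Protected Mode calculation described above (selector picks a table entry, the entry supplies a 32-bit base, base plus offset gives the linear address) can be sketched as follows. The selector layout used here follows the document's own description: the upper 13 bits are the index, the two least significant bits are the RPL, and the remaining bit is the table indicator. The dictionary standing in for a descriptor table, and both function names, are hypothetical simplifications; limit checks, access rights, and paging are deliberately left out.

```python
def split_selector(selector):
    """Split a 16-bit selector into (index, table_indicator, rpl)."""
    index = (selector >> 3) & 0x1FFF   # upper 13 bits: descriptor index
    table_indicator = (selector >> 2) & 1
    rpl = selector & 0b11              # two least significant bits
    return index, table_indicator, rpl


def linear_address(selector, offset, table):
    """Add the segment base found via the selector to a 32-bit offset.

    `table` is a hypothetical stand-in for a descriptor table: a dict
    mapping descriptor index -> 32-bit segment base address.
    """
    index, _table_indicator, _rpl = split_selector(selector)
    base = table[index]
    # Keep the result within 32 bits, the width of a linear address.
    return (base + offset) & 0xFFFFFFFF


if __name__ == "__main__":
    demo_table = {5: 0x00400000}       # illustrative base for entry 5
    print(split_selector(0x002B))      # index 5, TI 0, RPL 3
    print(hex(linear_address(0x002B, 0x1000, demo_table)))
```

With paging disabled the value returned is also the physical address; with paging enabled it would be the input to the page translation step.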
Figure 6-1. Protected Mode Addressing (a 48/32-bit pointer supplies the selector and offset; the segment descriptor supplies access rights, limit and base address; base plus offset locates the memory operand within the selected segment)

Figure 6-2. Paging and Segmentation (the 48-bit pointer's segment and offset produce a 32-bit linear address through the segment descriptor; the Intel486 processor paging mechanism translates it to the physical address of a memory operand within a 4-Kbyte physical page)

6.2 Segmentation

6.2.1 SEGMENTATION INTRODUCTION

Segmentation is one method of memory management. Segmentation provides the basis for protection. Segments are used to encapsulate regions of memory which have common attributes. For example, all of the code of a given program could be contained in a segment, or an operating system table may reside in a segment. All information about a segment is stored in an 8-byte data structure called a descriptor. All of the descriptors in a system are contained in tables recognized by hardware.

6.2.2 TERMINOLOGY

The following terms are used throughout the discussion of descriptors, privilege levels and protection:

PL: Privilege Level - One of the four hierarchical privilege levels. Level 0 is the most privileged level and level 3 is the least privileged. More privileged levels are numerically smaller than less privileged levels.

RPL: Requester Privilege Level - The privilege level of the original supplier of the selector. RPL is determined by the two least significant bits of a selector.

DPL: Descriptor Privilege Level - This is the least privileged level at which a task may access that descriptor (and the segment associated with that descriptor). Descriptor Privilege Level is determined by bits 6:5 in the Access Right Byte of a descriptor.

CPL: Current Privilege Level - The privilege level
at which a task is currently executing, which equals the privilege level of the code segment being executed. CPL can also be determined by examining the lowest 2 bits of the CS register, except for conforming code segments.

EPL: Effective Privilege Level - The effective privilege level is the least privileged of the RPL and DPL. Because smaller privilege level values indicate greater privilege, EPL is the numerical maximum of RPL and DPL.

Task: One instance of the execution of a program. Tasks are also referred to as processes.

6.2.3 DESCRIPTOR TABLES

6.2.3.1 Descriptor Tables Introduction

The descriptor tables define all of the segments which are used in an Intel486 processor system. (See Figure 6-3.) There are three types of tables on the Intel486 processor which hold descriptors: the Global Descriptor Table, Local Descriptor Table, and the Interrupt Descriptor Table. All of the tables are variable length memory arrays. They can range in size between 8 bytes and 64 Kbytes. Each table can hold up to 8192 8-byte descriptors. The upper 13 bits of a selector are used as an index into the descriptor table. The tables have registers associated with them which hold the 32-bit linear base address and the 16-bit limit of each table.

Each of the tables has a register associated with it: the GDTR, LDTR, and the IDTR (see Figure 6-3). The LGDT, LLDT, and LIDT instructions load the base and limit of the Global, Local, and Interrupt Descriptor Tables, respectively, into the appropriate register. The SGDT, SLDT, and SIDT instructions store the base and limit values. These tables are manipulated by the operating system. Therefore, the load descriptor table instructions are privileged instructions.

6.2.3.2 Global Descriptor Table

The first slot of the Global Descriptor Table corresponds to the null selector and is not used.
The null selector defines a null pointer value. The Global Descriptor Table (GDT) contains descriptors which are possibly available to all of the tasks in a system. The GDT can contain any type of segment descriptor except for descriptors which are used for servicing interrupts (i.e., interrupt and trap descriptors). Every Intel486 processor system contains a GDT. Generally the GDT contains code and data segments used by the operating systems and task state segments, and descriptors for the LDTs in a system.

Figure 6-3. Descriptor Table Registers

6.2.3.3 Local Descriptor Table

LDTs contain descriptors which are associated with a given task. Generally, operating systems are designed so that each task has a separate LDT. The LDT may contain only code, data, stack, task gate, and call gate descriptors. LDTs provide a mechanism for isolating a given task's code and data segments from the rest of the operating system, while the GDT contains descriptors for segments which are common to all tasks. A segment cannot be accessed by a task if its segment descriptor does not exist in either the current LDT or the GDT. This provides both isolation and protection for a task's segments, while still allowing global data to be shared among tasks.

Unlike the 6-byte GDT or IDT registers which contain a base address and limit, the visible portion of the LDT register contains only a 16-bit selector. This selector refers to a Local Descriptor Table descriptor in the GDT.

6.2.3.4 Interrupt Descriptor Table

The third table needed for Intel486 processor systems is the Interrupt Descriptor Table. (See Figure 6-4.) The IDT contains the descriptors which point to the location of up to 256 interrupt service routines. The IDT may contain only task gates, interrupt gates, and trap gates. The IDT should be at least 256 bytes in size in order to hold the descriptors for the 32 Intel Reserved Interrupts. Every interrupt used by a system must have an entry in the IDT.
The IDT entries are referenced via INT instructions, external interrupt vectors, and exceptions. (See section 4.8, "Interrupts.")

Figure 6-4. Interrupt Descriptor Table Register Use (the IDT base held in the CPU locates the Interrupt Descriptor Table in memory, which holds one gate per interrupt starting with the gate for interrupt #0)

6.2.4 DESCRIPTORS

6.2.4.1 Descriptor Attribute Bits

The object to which the segment selector points is called a descriptor. Descriptors are eight-byte quantities that contain attributes about a given region of linear address space (i.e., a segment). These attributes include the 32-bit base linear address of the segment, the 20-bit length and granularity of the segment, the protection level, read, write or execute privileges, the default size of the operands (16-bit or 32-bit), and the type of segment. All of the attribute information about a segment is contained in 12 bits in the segment descriptor. All segments on the Intel486 processor have three attribute fields in common: the P bit, the DPL field, and the S bit. The Present P bit is 1 if the segment is loaded in physical memory. If P = 0, any attempt to access this segment will cause a not present exception (exception 11). The Descriptor Privilege Level DPL is a two-bit field that specifies the protection level 0-3 associated with a segment.

The Intel486 processor has two main categories of segments: system segments and non-system segments (for code and data). The segment S bit in the segment descriptor determines if a given segment is a system segment or a code or data segment. If the S bit is 1, the segment is either a code or data segment. If it is 0, the segment is a system segment.

6.2.4.2 Intel486 Processor Code, Data Descriptors (S = 1)

Figure 6-5 shows the general format of a code and data descriptor and Table 6-1 illustrates how the bits in the Access Rights Byte are interpreted.
The Access Rights Byte is bits 24-31 associated with the segment limit.

Code and data segments have several descriptor fields in common. The accessed A bit is set whenever the processor accesses a descriptor. The A bit is used by operating systems to keep usage statistics on a given segment. The G bit, or granularity bit, specifies if a segment length is byte-granular or page-granular. Intel486 processor segments can be up to one megabyte long with byte granularity (G = 0) or four gigabytes with page granularity (G = 1) (i.e., 2^20 pages, each page being 4 Kbytes in length). The granularity is totally unrelated to paging. An Intel486 processor system can consist of segments with byte granularity and page granularity, whether or not paging is enabled.

The executable E bit tells if a segment is a code or data segment. A code segment (E = 1, S = 1) may be execute-only or execute/read as determined by the Read R bit. Code segments are execute-only if R = 0, and execute/read if R = 1. Code segments may never be written into.

NOTE: Code segments may be modified via aliases. Aliases are writeable data segments which occupy the same range of linear address space as the code segment.

The D bit indicates the default length for operands and effective addresses. If D = 1 then 32-bit operands and 32-bit addressing modes are assumed. If D = 0 then 16-bit operands and 16-bit addressing modes are assumed. Therefore all existing 80286 code segments will execute on the Intel486 processor assuming the D bit is set to 0.

Another attribute of code segments is determined by the conforming C bit. Conforming segments, C = 1, can be executed and shared by programs at different privilege levels. (See section 6.3, "Protection.")

Descriptor layout (bits 31 to 0):
Byte address 0:   Segment Base 15...0 | Segment Limit 15...0
Next doubleword:  Base 31...24 | G | D | 0 | AVL | Limit 19...
16 | P | DPL | S | Type | A | Base 23...16   (byte address +4)

BASE    Base Address of the segment
LIMIT   The length of the segment
P       Present Bit: 1 = Present, 0 = Not Present
DPL     Descriptor Privilege Level 0-3
S       Segment Descriptor: 0 = System Descriptor, 1 = Code or Data Segment Descriptor
TYPE    Type of Segment
A       Accessed Bit
G       Granularity Bit: 1 = Segment length is page granular, 0 = Segment length is byte granular
D       Default Operation Size (recognized in code segment descriptors only): 1 = 32-bit segment, 0 = 16-bit segment
0       Bit must be zero (0) for compatibility with future processors
AVL     Available field for user or OS

NOTE: In a maximum-size segment (i.e., a segment with G = 1 and segment limit 19...0 = FFFFFH), the lowest 12 bits of the segment base should be zero (i.e., segment base 11...0 = 000H).

Figure 6-5. Segment Descriptors

Table 6-1. Access Rights Byte Definition for Code and Data Descriptions

Bit Position  Name                              Function
7             Present (P)                       P = 1: Segment is mapped into physical memory.
                                                P = 0: No mapping to physical memory exists, base and limit are not used.
6-5           Descriptor Privilege Level (DPL)  Segment privilege attribute used in privilege tests.
4             Segment Descriptor (S)            S = 1: Code or Data (includes stacks) segment descriptor.
                                                S = 0: System Segment Descriptor or Gate Descriptor.

If Data Segment (S = 1, E = 0)
3             Executable (E)                    E = 0: Descriptor type is data segment.
2             Expansion Direction (ED)          ED = 0: Expand up segment, offsets must be ≤ limit.
                                                ED = 1: Expand down segment, offsets must be > limit.
1             Writeable (W)                     W = 0: Data segment may not be written into.
                                                W = 1: Data segment may be written into.

If Code Segment (S = 1, E = 1)
3             Executable (E)                    E = 1: Descriptor type is code segment.
2             Conforming (C)                    C = 1: Code segment may only be executed when CPL ≥ DPL and CPL remains unchanged.
1             Readable (R)                      R = 0: Code segment may not be read.
                                                R = 1: Code segment may be read.

0             Accessed (A)                      A = 0: Segment has not been accessed.
                                                A = 1:
Segment selector has been loaded into segment register or used by selector test instructions.

Segments identified as data segments (E = 0, S = 1) are used for two types of Intel486 processor segments: stack and data segments. The expansion direction (ED) bit specifies if a segment expands downward (stack) or upward (data). If a segment is a stack segment all offsets must be greater than the segment limit. On a data segment all offsets must be less than or equal to the limit. In other words, stack segments start at the base linear address plus the maximum segment limit and grow down to the base linear address plus the limit. On the other hand, data segments start at the base linear address and expand to the base linear address plus limit.

The write W bit controls the ability to write into a segment. Data segments are read-only if W = 0. The stack segment must have W = 1.

6.2.4.3 System Descriptor Formats

System segments describe information about operating system tables, tasks, and gates. Figure 6-6 shows the general format of system segment descriptors, and the various types of system segments. Intel486 processor system descriptors contain a 32-bit base linear address and a 20-bit segment limit. 80286 system descriptors have a 24-bit base address and a 16-bit segment limit. 80286 system descriptors are identified by the upper 16 bits being all zero.

6.2.4.4 LDT Descriptors (S = 0, TYPE = 2)

LDT descriptors (S = 0, TYPE = 2) contain information about Local Descriptor Tables. LDTs contain a table of segment descriptors, unique to a particular task. Because the instruction to load the LDTR is only available at privilege level 0, the DPL field is ignored. LDT descriptors are only allowed in the Global Descriptor Table (GDT).

The B bit controls the size of the stack pointer register. If B = 1, then PUSHes, POPs, and CALLs all use the 32-bit ESP register for stack references and assume an upper limit of FFFFFFFFH.
If B = 0, stack instructions all use the 16-bit SP register and assume an upper limit of FFFFH.

System segment descriptor layout (bits 31 to 0):
Byte address 0:   Segment Base 15...0 | Segment Limit 15...0
Byte address +4:  Base 31...24 | G | 0 | 0 | 0 | Limit 19...16 | P | DPL | 0 | Type | Base 23...16

Type  Defines
0     Invalid
1     Available 80286 TSS
2     LDT
3     Busy 80286 TSS
4     80286 call gate
5     Task Gate (for 80286, Intel486 processor task)
6     80286 interrupt gate
7     80286 trap gate
8     Invalid
9     Available Intel486 processor TSS
A     Undefined (Intel Reserved)
B     Busy Intel486 processor TSS
C     Intel486 processor call gate
D     Undefined (Intel Reserved)
E     Intel486 processor interrupt gate
F     Intel486 processor trap gate

Figure 6-6. System Segment Descriptors

6.2.4.5 TSS Descriptors (S = 0, TYPE = 1, 3, 9, B)

A Task State Segment (TSS) descriptor contains information about the location, size, and privilege level of a Task State Segment (TSS). A TSS in turn is a special fixed format segment which contains all the state information for a task and a linkage field to permit nesting tasks. The TYPE field is used to indicate whether the task is currently BUSY (i.e., on a chain of active tasks) or the TSS is available. The TYPE field also indicates if the segment contains an 80286 processor TSS or an Intel486 processor TSS. The Task Register (TR) contains the selector which points to the current Task State Segment.

The indirection that gates place between the source and destination of a control transfer allows the processor to automatically perform protection checks. It also allows system designers to control entry points to the operating system. Call gates are used to change privilege levels (see section 6.3, "Protection"), task gates are used to perform a task switch, and interrupt and trap gates are used to specify interrupt service routines.

Figure 6-7 shows the format of the four types of gate descriptors. Call gates are primarily used to transfer program control to a more privileged level.
The call gate descriptor consists of three fields: the access byte, a long pointer (selector and offset) which points to the start of a routine, and a word count which specifies how many parameters are to be copied from the caller's stack to the stack of the called routine. The word count field is only used by call gates when there is a change in the privilege level; other types of gates ignore the word count field.

6.2.4.6 Gate Descriptors (S = 0, TYPE = 4-7, C, F)

Gates are used to control access to entry points within the target code segment. The various types of gate descriptors are call gates, task gates, interrupt gates, and trap gates. Gates provide a level of indirection between the source and destination of the control transfer.

Gate descriptor layout (bits 31 to 0):
Byte address 0:   Selector | Offset 15...0
Byte address +4:  Offset 31...16 | P | DPL | 0 | Type | 0 0 0 | Word Count 4...0

Gate Descriptor Fields

Name                  Value  Description
Type                  4      80286 call gate
                      5      Task gate (for 80286 or Intel486 processor task)
                      6      80286 interrupt gate
                      7      80286 trap gate
                      C      Intel486 processor call gate
                      E      Intel486 processor interrupt gate
                      F      Intel486 processor trap gate
P                     0      Descriptor contents are not valid
                      1      Descriptor contents are valid
DPL                          Least privileged level at which a task may access the gate.
WORD COUNT            0-31   The number of parameters to copy from caller's stack to the called procedure's stack. The parameters are 32-bit quantities for Intel486 processor gates, and 16-bit quantities for 80286 gates.
DESTINATION SELECTOR  16-bit selector: Selector to the target code segment, or selector to the target task state segment for task gate.
DESTINATION OFFSET    Offset (16-bit for 80286, 32-bit for Intel486 processor): Entry point within the target code segment.

Figure 6-7. Gate Descriptor Formats

The Intel486 processor supports all of the 80286 segment descriptors. Figure 6-8 shows the general format of an 80286 system segment descriptor.
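The gate descriptor fields listed with Figure 6-7 can be unpacked from the two doublewords of a descriptor with shifts and masks. The sketch below assumes the conventional layout for these gates (selector in the upper half of the first doubleword; access byte in bits 15-8 and word count in bits 4-0 of the second); the function name, the argument names, and the returned dictionary are hypothetical and serve only to illustrate the field boundaries.

```python
def decode_gate(low, high):
    """Unpack a gate descriptor given as two 32-bit doublewords (sketch).

    `low` is the doubleword at byte address 0, `high` the one at +4.
    """
    return {
        # Offset 31...16 comes from `high`, offset 15...0 from `low`.
        "offset": (high & 0xFFFF0000) | (low & 0x0000FFFF),
        "selector": (low >> 16) & 0xFFFF,
        "word_count": high & 0x1F,       # bits 4...0
        "type": (high >> 8) & 0xF,       # 4-7 (80286) or C, E, F (Intel486)
        "s": (high >> 12) & 1,           # must be 0 for a system descriptor
        "dpl": (high >> 13) & 0b11,
        "present": (high >> 15) & 1,
    }


if __name__ == "__main__":
    # Illustrative values: a present, DPL 0 call gate (type C) that copies
    # three parameters and targets selector 0008H, offset ABCD1234H.
    gate = decode_gate(0x00081234, 0xABCD8C03)
    print({name: hex(value) for name, value in gate.items()})
```

For an 80286-style gate the upper word of the second doubleword is zero, so the decoded offset is naturally limited to 16 bits.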
The only differences between 80286 and Intel486 processor descriptor formats are that the values of the type fields, and the limit and base address fields have been expanded for the Intel486 processor. The 80286 system segment descriptors contained a 24-bit base address and 16-bit limit, while the Intel486 processor system segment descriptors have a 32-bit base address, a 20-bit limit field, and a granularity bit.

By supporting 80286 system segments the Intel486 processor is able to execute 80286 application programs on an Intel486 processor operating system. This is possible because the Intel486 processor automatically understands which descriptors are 80286-style descriptors and which descriptors are Intel486 processor-style descriptors. In particular, if the upper word of a descriptor is zero, then that descriptor is an 80286-style descriptor.

Interrupt and trap gates use the destination selector and destination offset fields of the gate descriptor as a pointer to the start of the interrupt or trap handler routines. The difference between interrupt gates and trap gates is that the interrupt gate disables interrupts (resets the IF bit), while the trap gate does not.

Task gates are used to switch tasks. Task gates may only refer to a task state segment. (See section 6.3.6, "Task Switching.") Therefore, only the destination selector portion of a task gate descriptor is used, and the destination offset is ignored.

Exception 13 is generated when a destination selector does not refer to a correct descriptor type, i.e., a code segment for an interrupt, trap or call gate, a TSS for a task gate.

The access byte format is the same for all gate descriptors. P = 1 indicates that the gate contents are valid. P = 0 indicates the contents are not valid and causes exception 11 if referenced. DPL is the descriptor privilege level and specifies when this descriptor may be used by a task.
(See se.ction 6.3, "Protection.") The S field, bit 4 of the access rights byte, must be 0 to indicate a system control descriptor. The type field specifies the descriptor type as indicated in Figure 6-7. The only other differences between 80286-style descriptors and Intel486 processor descriptors is the interpretation of the word count field of call gates and the B bit. The word count field specifies the number of 16-bit quantities to copy for 80286 call gates and 32-bit quantities for Intel486 processor call gates. The B bit controls the size of PUSHes when using a call gate; if B = 0 PUSHes are 16 bits, if B = 1 PUSHes are 32 bits. 6.2.4.7 Differences Between Intel486 Processor and 80286 Descriptors In order to provide operating system compatibility between 80286 and Intel486 processors, the o 31 Segment Base 15 ... 0 o Segment Limit 15 ... 0 Intel Reserved SettaO P DPL I Type S I Byte Address I Base 23 ... 16 +4 I 242202-38 BASE LIMIT P DPL S TYPE Base Address of the segment The length olthe segment Present Bit: 1 = Present, 0 = Not Present Descriptor Privilege Level 0-3 System Descriptor: 0 = System, 1 = User Type of Segment Figure 6-8. 80286 Code and Data Segment Descriptors I 2-91 Intel486TM PROCESSOR FAMILY 6.2.4.8 Selector Fields A selector in ProteCted Mode has three fields: Local or Global Descriptor Table Indicator (TI), Descriptor Entry Index (Index), and Requester (the selector's) Privilege Level (RPL) as shown in Figure 6-9. The TI bits select one of two memory-based tables of descriptors (the Global Descriptor Table or the Local Descriptor Table). The Index selects one of 8K descriptors in the appropriate descriptor table. The RPL bits allow high speed testing of the selector's privilege attributes. 6.2.4.9 Segment Descriptor Cache In addition to the selector value, every segment register has a segment descriptor cache register associated with it. 
Whenever a segment register's contents are changed, the 8-byte descriptor associated with that selector is automatically loaded (cached) on the chip. Once loaded, all references to that segment use the cached descriptor information instead of reaccessing the descriptor. The contents of the descriptor cache are not visible to the programmer. Because descriptor caches change only when a segment register is changed, programs that modify the descriptor tables must reload the appropriate segment registers after changing a descriptor's value.

[Figure 6-9. Example Descriptor Selection (242202-39): the INDEX field of the selector in the segment register picks a descriptor number, and the TABLE INDICATOR picks the table (TI = 1: Local Descriptor Table; TI = 0: Global Descriptor Table). Entry 0 of the Global Descriptor Table is the null descriptor.]

6.2.4.10 Segment Descriptor Register Settings

The contents of the segment descriptor cache vary depending on the mode the Intel486 processor is operating in. When operating in Real Address Mode, the segment base, limit, and other attributes within the segment cache registers are defined as shown in Figure 6-10. For compatibility with the 8086 architecture, the base is set to sixteen times the current selector value, the limit is fixed at 0000FFFFH, and the attributes are fixed so as to indicate the segment is present and fully usable. In Real Address Mode, the internal "privilege level" is always fixed to the highest level, level 0, so I/O and other privileged opcodes may be executed.

Descriptor cache register contents in Real Address Mode (242202-40). The 32-bit base is updated during the selector load into the segment register; the 32-bit limit and the other attributes are fixed.

  Segment  32-bit base                 32-bit limit  Present  Priv.  Accessed  Granu-  Expansion  Read-  Write-  Execut-  Stack  Conforming
                                                              level            larity  direction  able   able    able     size   privilege
  CS       16 x current CS selector*   0000FFFFH     Y        0      Y         B       U          Y      Y       Y        -      N
  SS       16 x current SS selector    0000FFFFH     Y        0      Y         B       U          Y      Y       N        W      -
  DS       16 x current DS selector    0000FFFFH     Y        0      Y         B       U          Y      Y       N        -      -
  ES       16 x current ES selector    0000FFFFH     Y        0      Y         B       U          Y      Y       N        -      -
  FS       16 x current FS selector    0000FFFFH     Y        0      Y         B       U          Y      Y       N        -      -
  GS       16 x current GS selector    0000FFFFH     Y        0      Y         B       U          Y      Y       N        -      -

* Except the 32-bit CS base is initialized to FFFFF000H after reset until the first intersegment control transfer (i.e., intersegment CALL, intersegment JMP, or INT). (See Figure 6-12 for an example.)

Key: Y = yes; N = no; 0, 1, 2, 3 = privilege level 0, 1, 2, 3; U = expand up; D = expand down; B = byte granularity; P = page granularity; W = push/pop 16-bit words; F = push/pop 32-bit dwords; - = does not apply to that segment cache register.

Figure 6-10. Segment Descriptor Caches for Real Address Mode (Segment Limit and Attributes Are Fixed)

When operating in Protected Mode, the segment base, limit, and other attributes within the segment cache registers are defined as shown in Figure 6-11. In Protected Mode, each of these fields is defined according to the contents of the segment descriptor indexed by the selector value loaded into the segment register.

When operating in Virtual 8086 Mode within Protected Mode, the segment base, limit, and other attributes within the segment cache registers are defined as shown in Figure 6-12. For compatibility with the 8086 architecture, the base is set to sixteen times the current selector value, the limit is fixed at 0000FFFFH, and the attributes are fixed so as to indicate the segment is present and fully usable. The virtual program executes at the lowest privilege level, level 3, to allow trapping of all IOPL-sensitive instructions and level-0-only instructions.
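The fixed Real Address Mode and Virtual 8086 Mode settings described above (base equal to sixteen times the selector, limit fixed at 0000FFFFH) can be illustrated with a small C sketch. This is only an illustration of the arithmetic, not a model of the hardware; the names `real_mode_base`, `real_mode_linear`, and `REAL_MODE_LIMIT` are hypothetical, and rejecting offsets above the limit by returning 0 is an assumed simplification of the processor's exception behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed segment limit in Real Address Mode and Virtual 8086 Mode. */
#define REAL_MODE_LIMIT 0x0000FFFFu

/* Base cached for a segment register: sixteen times the selector value. */
static uint32_t real_mode_base(uint16_t selector)
{
    return (uint32_t)selector << 4;
}

/*
 * Form the linear address for selector:offset.
 * Returns 1 and stores the address in *linear when the offset is within
 * the fixed limit; returns 0 otherwise (hypothetical stand-in for the
 * limit violation the processor would report).
 */
static int real_mode_linear(uint16_t selector, uint32_t offset, uint32_t *linear)
{
    assert(linear != 0);
    if (offset > REAL_MODE_LIMIT)
        return 0;
    *linear = real_mode_base(selector) + offset;
    return 1;
}
```

For example, selector 1234H with offset 0010H yields base 12340H and linear address 12350H.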
Descriptor cache register contents in Protected Mode (242202-41). The 32-bit base, the 32-bit limit, and the other attributes are all updated during the selector load into the segment register.

  Segment  32-bit base          32-bit limit          Present  Priv.  Accessed  Granu-  Expansion  Read-  Write-  Execut-  Stack  Conforming
                                                               level            larity  direction  able   able    able     size   privilege
  CS       Base per seg descr   Limit per seg descr   p        d      d         d       d          d      N       Y        -      d
  SS       Base per seg descr   Limit per seg descr   p        d      d         d       d          r      w       N        d      -
  DS       Base per seg descr   Limit per seg descr   p        d      d         d       d          d      d       N        -      -
  ES       Base per seg descr   Limit per seg descr   p        d      d         d       d          d      d       N        -      -
  FS       Base per seg descr   Limit per seg descr   p        d      d         d       d          d      d       N        -      -
  GS       Base per seg descr   Limit per seg descr   p        d      d         d       d          d      d       N        -      -

Key:
  Y = fixed yes
  N = fixed no
  d = per segment descriptor
  p = per segment descriptor; descriptor must indicate "present" to avoid exception 11 (exception 12 in case of SS)
  r = per segment descriptor, but descriptor must indicate "readable" to avoid exception 13 (special case for SS)
  w = per segment descriptor, but descriptor must indicate "writeable" to avoid exception 13 (special case for SS)
  - = does not apply to that segment cache register

Figure 6-11. Segment Descriptor Caches for Protected Mode (Loaded per Descriptor)

Descriptor cache register contents in Virtual 8086 Mode (242202-42). The 32-bit base is updated during the selector load into the segment register; the 32-bit limit and the other attributes are fixed.

  Segment  32-bit base                32-bit limit  Present  Priv.  Accessed  Granu-  Expansion  Read-  Write-  Execut-  Stack  Conforming
                                                             level            larity  direction  able   able    able     size   privilege
  CS       16 x current CS selector   0000FFFFH     Y        3      Y         B       U          Y      Y       Y        -      N
  SS       16 x current SS selector   0000FFFFH     Y        3      Y         B       U          Y      Y       N        W      -
  DS       16 x current DS selector   0000FFFFH     Y        3      Y         B       U          Y      Y       N        -      -
  ES       16 x current ES selector   0000FFFFH     Y        3      Y         B       U          Y      Y       N        -      -
  FS       16 x current FS selector   0000FFFFH     Y        3      Y         B       U          Y      Y       N        -      -
  GS       16 x current GS selector   0000FFFFH     Y        3      Y         B       U          Y      Y       N        -      -

Key: Y = yes; N = no; 0, 1, 2, 3 = privilege level 0, 1, 2, 3; U = expand up; D = expand down; B = byte granularity; P = page granularity; W = push/pop 16-bit words; F = push/pop 32-bit dwords; - = does not apply to that segment cache register.

Figure 6-12. Segment Descriptor Caches for Virtual 8086 Mode within Protected Mode (Segment Limit and Attributes are Fixed)

6.3 Protection

6.3.1 PROTECTION CONCEPTS

The Intel486 processor has four levels of protection which are optimized to support the needs of a multitasking operating system to isolate and protect user programs from each other and the operating system. The privilege levels control the use of privileged instructions, I/O instructions, and access to segments and segment descriptors. Unlike traditional processor-based systems, where this protection is achieved only through the use of complex external hardware and software, the Intel486 processor provides the protection as part of its integrated Memory Management Unit. The Intel486 processor offers an additional type of protection on a page basis when paging is enabled. (See section 6.4.3, "Page Level Protection.")

The four-level hierarchical privilege system is illustrated in Figure 6-13.
It is an extension of the user/supervisor privilege mode commonly used by minicomputers and, in fact, the user/supervisor mode is fully supported by the Intel486 processor paging mechanism. The privilege levels (PL) are numbered 0 through 3. Level 0 is the most privileged or trusted level.

[Figure 6-13. Four-Level Hierarchical Protection (242202-43): diagram labels include "CPU enforced software interfaces" and "high speed operating system interface".]

6.3.2 RULES OF PRIVILEGE

The Intel486 processor controls access to both data and procedures between levels of a task, according to the following rules.

• Data stored in a segment with privilege level p can be accessed only by code executing at a privilege level at least as privileged as p.
• A code segment/procedure with privilege level p can only be called by a task executing at the same or a lesser privilege level than p.

6.3.3 PRIVILEGE LEVELS

6.3.3.1 Task Privilege

At any point in time, a task on the Intel486 processor always executes at one of the four privilege levels. The Current Privilege Level (CPL) specifies the task's privilege level. A task's CPL may only be changed by control transfers through gate descriptors to a code segment with a different privilege level. (See section 6.3.4, "Privilege Level Transfers.") Thus, an application program running at PL = 3 may call an operating system routine at PL = 1 (via a gate), which would cause the task's CPL to be set to 1 until the operating system routine was finished.

6.3.3.2 Selector Privilege (RPL)

The privilege level of a selector is specified by the RPL field. The RPL is the two least significant bits of the selector. The selector's RPL is only used to establish a less trusted privilege level than the current privilege level for the use of a segment. This level is called the task's effective privilege level (EPL). The EPL is defined as being the least privileged (i.e., numerically larger) level of a task's CPL and a selector's RPL.
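The selector layout from section 6.2.4.8 and the EPL definition above can be sketched in C as follows. This is a minimal illustration, not processor microcode: the struct and function names are hypothetical, and the bit positions assumed here are RPL in bits 1-0 (as stated above), TI in bit 2, and the 13-bit Index (8K descriptors) in bits 15-3.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three selector fields. */
typedef struct {
    unsigned index; /* descriptor entry index, bits 15..3 (one of 8K entries) */
    unsigned ti;    /* table indicator, bit 2: 0 = GDT, 1 = LDT */
    unsigned rpl;   /* requester privilege level, bits 1..0 */
} selector_fields;

/* Split a 16-bit selector into Index, TI, and RPL. */
static selector_fields decode_selector(uint16_t sel)
{
    selector_fields f;
    f.index = (unsigned)sel >> 3;
    f.ti    = ((unsigned)sel >> 2) & 1u;
    f.rpl   = (unsigned)sel & 3u;
    return f;
}

/* EPL: the least privileged (numerically larger) of CPL and RPL. */
static unsigned effective_privilege(unsigned cpl, unsigned rpl)
{
    assert(cpl <= 3 && rpl <= 3);
    return cpl > rpl ? cpl : rpl;
}
```

With these definitions, a selector of 001FH decodes to Index = 3, TI = 1 (LDT), RPL = 3, and a task at CPL = 0 using that selector has EPL = 3.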
Thus, if the selector's RPL = 0, then the CPL always specifies the privilege level for making an access using the selector. On the other hand, if RPL = 3, then a selector can only access segments at level 3 regardless of the task's CPL. The RPL is most commonly used to verify that pointers passed to an operating system procedure do not access data that is of higher privilege than the procedure that originated the pointer. Because the originator of a selector can specify any RPL value, the Adjust RPL (ARPL) instruction is provided to force the RPL bits to the originator's CPL.

This pointer verification prevents the common problem of an application at PL = 3 calling an operating system routine at PL = 0 and passing the operating system routine a "bad" pointer which corrupts a data structure belonging to the operating system. If the operating system routine uses the ARPL instruction to ensure that the RPL of the selector has no greater privilege than that of the caller, then this problem can be avoided.

6.3.3.3 I/O Privilege and I/O Permission Bitmap

The I/O privilege level (IOPL, a 2-bit field in the EFLAGS register) defines the least privileged level at which I/O instructions can be unconditionally performed. I/O instructions can be unconditionally performed when CPL ≤ IOPL. (The I/O instructions are IN, OUT, INS, OUTS, REP INS, and REP OUTS.) When CPL > IOPL, and the current task is associated with a 286 TSS, attempted I/O instructions cause an exception 13 fault. When CPL > IOPL, and the current task is associated with an Intel486 processor TSS, the I/O Permission Bitmap (part of an Intel486 processor TSS) is consulted on whether I/O to the port is allowed, or an exception 13 fault is to be generated instead. For diagrams of the I/O Permission Bitmap, refer to Figure 6-14 and Figure 6-15. For further information on how the I/O Permission Bitmap is used in Protected Mode or in Virtual 8086 Mode, refer to section 6.5.4, "Protection and I/O Permission Bitmap."

The I/O privilege level (IOPL) also affects whether several other instructions can be executed or cause an exception 13 fault instead. These instructions are called "IOPL-sensitive" instructions, and they are CLI and STI. (Note that the LOCK prefix is not IOPL-sensitive on the Intel486 processor.)

When CPL > IOPL, the IF bit cannot be changed by a new value POPed into (or otherwise loaded into) the EFLAGS register; the IF bit merely remains unchanged and no exception is generated.

[Figure 6-14: diagram of the Intel486 processor TSS, including the debug trap bit and the BIT_MAP_OFFSET field at offset 66H; BIT_MAP_OFFSET must be ≤ DFFFH.]

6.3.3.4 Privilege Validation

The Intel486 processor provides several instructions to speed pointer testing and help maintain system integrity by verifying that the selector value refers to an appropriate segment. Table 6-2 summarizes the selector validation procedures available for the Intel486 processor.

Table 6-2. Pointer Test Instructions

  Instruction  Operands            Function
  ARPL         Selector, Register  Adjust Requested Privilege Level: adjusts the RPL of the selector to the numeric maximum of the current selector RPL value and the RPL value in the register. Sets the zero flag if the selector RPL was changed.
  VERR         Selector            VERify for Read: sets the zero flag if the segment referred to by the selector can be read.
  VERW         Selector            VERify for Write: sets the zero flag if the segment referred to by the selector can be written.
  LSL          Register, Selector  Load Segment Limit: reads the segment limit into the register if privilege rules and descriptor type allow. Sets the zero flag if successful.
  LAR          Register, Selector  Load Access Rights: reads the descriptor access rights byte into the register if privilege rules allow. Sets the zero flag if successful.

6.3.3.5 Descriptor Access

There are basically two types of segment accesses: those involving code segments, such as control transfers, and those involving data accesses. Determining the ability of a task to access a segment involves the type of segment to be accessed, the instruction used, the type of descriptor used, and CPL, RPL, and DPL as described above.

Any time an instruction loads the data segment registers (DS, ES, FS, GS), the Intel486 processor makes protection validation checks. Selectors loaded in the DS, ES, FS, and GS registers must refer only to data segments or readable code segments. The data access rules are specified in section 6.3.2, "Rules of Privilege." The only exception to those rules is readable conforming code segments, which can be accessed at any privilege level.

Finally, the privilege validation checks are performed. The segment's DPL is compared to the EPL, and if the DPL is more privileged than the EPL, an exception 13 (general protection fault) is generated.

The rules regarding the stack segment are slightly different from those involving data segments. Instructions that load selectors into SS must refer to data segment descriptors for writeable data segments. The DPL and RPL must equal the CPL. All other descriptor types or a privilege level violation will cause exception 13. A stack not present fault causes exception 12. Note that exception 11 is used for a not-present code or data segment.

6.3.4 PRIVILEGE LEVEL TRANSFERS

Inter-segment control transfers occur when a selector is loaded in the CS register. For a typical system, most of these transfers are simply the result of a call or a jump to another routine. There are five types of control transfers, which are summarized in Table 6-3. Many of these transfers result in a privilege level transfer.
Changing privilege levels is done only via control transfers, by using gates, task switches, and interrupt or trap gates.

Control transfers can only occur if the operation which loaded the selector references the correct descriptor type. Any violation of these descriptor usage rules will cause an exception 13 (e.g., JMP through a call gate, or IRET from a normal subroutine call).

In order to provide further system security, all control transfers are also subject to the privilege rules.

Table 6-3. Descriptor Types Used for Control Transfer

  Control Transfer Types                          Operation Types                   Descriptor Referenced    Descriptor Table
  Intersegment within the same privilege level    JMP, CALL, RET, IRET(1)           Code Segment             GDT/LDT
  Intersegment to the same or higher privilege    CALL                              Call Gate                GDT/LDT
  level; interrupt within task may change CPL     Interrupt Instruction,            Trap or Interrupt Gate   IDT
                                                  Exception, External Interrupt
  Intersegment to a lower privilege level         RET, IRET(1)                      Code Segment             GDT/LDT
  (changes task CPL)
  Task Switch                                     CALL, JMP                         Task State Segment       GDT
                                                  CALL, JMP                         Task Gate                GDT/LDT
                                                  IRET(2), Interrupt Instruction,   Task Gate                IDT
                                                  Exception, External Interrupt

NOTES:
1. NT (Nested Task bit of flag register) = 0
2. NT (Nested Task bit of flag register) = 1

The privilege rules require that:

• Privilege level transitions can only occur via gates.
• JMPs can be made to a non-conforming code segment with the same privilege or to a conforming code segment with greater or equal privilege.
• CALLs can be made to a non-conforming code segment with the same privilege or via a gate to a more privileged level.
• Interrupts handled within the task obey the same privilege rules as CALLs.
• Conforming code segments are accessible by privilege levels which are the same or less privileged than the conforming code segment's DPL.
• Both the requested privilege level (RPL) in the selector pointing to the gate and the task's CPL must be of equal or greater privilege than the gate's DPL.
• The code segment selected in the gate must be the same or more privileged than the task's CPL.
• Return instructions that do not switch tasks can only return control to a code segment with the same or less privilege.
• Task switches can be performed by a CALL, JMP, or INT which references either a task gate or task state segment whose DPL is less privileged than or the same privilege as the old task's CPL.

Any control transfer that changes CPL within a task causes a change of stacks as a result of the privilege level change. The initial values of SS:ESP for privilege levels 0, 1, and 2 are retained in the task state segment. (See section 6.3.6, "Task Switching.") During a JMP or CALL control transfer, the new stack pointer is loaded into the SS and ESP registers and the previous stack pointer is pushed onto the new stack.

When RETurning to the original privilege level, use of the lower-privileged stack is restored as part of the RET or IRET instruction operation. For subroutine calls that pass parameters on the stack and cross privilege levels, a fixed number of words (as specified in the gate's word count field) are copied from the previous stack to the current stack. The inter-segment RET instruction with a stack adjustment value will correctly restore the previous stack pointer upon return.

6.3.5 CALL GATES

Gates provide protected, indirect CALLs. One of the major uses of gates is to provide a secure method of privilege transfers within a task. Because the operating system defines all of the gates in a system, it can ensure that all gates only allow entry into a few trusted procedures (such as those which allocate memory, or perform I/O).
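Two of the privilege rules listed above apply directly to a CALL through a call gate: the gate itself must be accessible (CPL and RPL at least as privileged as the gate's DPL), and the target code segment must be the same or more privileged than the caller's CPL. The following C sketch checks only those two rules. It is a simplified illustration under stated assumptions: the function name and parameters are hypothetical, privilege levels are plain integers 0-3, and the additional descriptor-type, present-bit, and conforming-segment checks are deliberately left out.

```c
#include <assert.h>

/*
 * Returns 1 if a CALL through a call gate passes the two privilege
 * rules sketched here, 0 if it would be rejected (exception 13).
 *
 *   cpl        - current privilege level of the calling task
 *   rpl        - RPL of the selector that points to the gate
 *   gate_dpl   - DPL of the call gate descriptor
 *   target_dpl - DPL of the code segment selected in the gate
 */
static int call_gate_permitted(unsigned cpl, unsigned rpl,
                               unsigned gate_dpl, unsigned target_dpl)
{
    unsigned epl;

    assert(cpl <= 3 && rpl <= 3 && gate_dpl <= 3 && target_dpl <= 3);

    /* Both CPL and RPL must be of equal or greater privilege than the
       gate's DPL, i.e. numerically less than or equal to it. */
    epl = cpl > rpl ? cpl : rpl;
    if (epl > gate_dpl)
        return 0;

    /* The target code segment must be the same or more privileged
       than the caller's CPL. */
    if (target_dpl > cpl)
        return 0;

    return 1;
}
```

For instance, an application at CPL = 3 can call a level-0 routine through a gate with DPL = 3, but not through a gate with DPL = 0.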
Gate descriptors follow the data access rules of privilege; that is, gates can be accessed by a task if the EPL is equal to or more privileged than the gate descriptor's DPL. Gates follow the control transfer rules of privilege and therefore may only transfer control to a more privileged level.

Call gates are accessed via a CALL instruction and are syntactically identical to calling a normal subroutine. When an inter-level Intel486 processor call gate is activated, the following actions occur.

1. Load CS:EIP from gate; check for validity
2. SS is pushed zero-extended to 32 bits
3. ESP is pushed
4. Copy Word Count 32-bit parameters from the old stack to the new stack
5. Push return address on stack

The procedure is identical for 80286 call gates, except that 16-bit parameters are copied and 16-bit registers are pushed.

Interrupt gates and trap gates work in a similar fashion to call gates, except there is no copying of parameters. The only difference between trap and interrupt gates is that control transfers through an interrupt gate disable further interrupts (i.e., the IF bit is set to 0), and trap gates leave the interrupt status unchanged.

6.3.6 TASK SWITCHING

A very important attribute of any multitasking/multiuser operating system is its ability to rapidly switch between tasks or processes. The Intel486 processor directly supports this operation by providing a task switch instruction in hardware. The Intel486 processor task switch operation saves the entire state of the machine (all of the registers, address space, and a link to the previous task), loads a new execution state, performs protection checks, and commences execution in the new task, in about 10 microseconds. Like transfer of control via gates, the task switch operation is invoked by executing an intersegment JMP or CALL instruction which refers to a Task State Segment (TSS), or a task gate descriptor in the GDT or LDT.
An INT n instruction, exception, trap, or external interrupt may also invoke the task switch operation if there is a task gate descriptor in the associated IDT descriptor slot.

The TSS descriptor points to a segment (see Figure 6-14) containing the entire Intel486 processor execution state, while a task gate descriptor contains a TSS selector. The Intel486 processor supports both 80286 and Intel486 processor style TSSs. Figure 6-16 shows an 80286 TSS. The limit of an Intel486 processor TSS must be greater than 0064H (002BH for an 80286 TSS), and can be as large as 4 Gbytes. In the additional TSS space, the operating system is free to store additional information such as the reason the task is inactive, the time the task has spent running, and the open files belonging to the task.

80286 TSS layout (242202-46); each entry is 16 bits (bits 15-0):

  Offset  Contents
  0       Back link selector to TSS
  2       SP for CPL 0         (initial stacks for CPL 0, 1, 2)
  4       SS for CPL 0
  6       SP for CPL 1
  8       SS for CPL 1
  A       SP for CPL 2
  C       SS for CPL 2
  E       IP (entry point)     (current task state)
  10      FLAGS
  12      AX
  14      CX
  16      DX
  18      BX
  1A      SP
  1C      BP
  1E      SI
  20      DI
  22      ES selector
  24      CS selector
  26      SS selector
  28      DS selector
  2A      Task's LDT selector

Figure 6-16. 80286 TSS

Each task must have a TSS associated with it. The current TSS is identified by a special register in the Intel486 processor called the Task State Segment Register (TR). This register contains a selector referring to the task state segment descriptor that defines the current TSS. A hidden base and limit register associated with TR are loaded whenever TR is loaded with a new selector. Returning from a task is accomplished by the IRET instruction. When IRET is executed, control is returned to the task which was interrupted. The currently executing task's state is saved in the TSS and the old task state is restored from its TSS.

Several bits in the flag register and machine status word (CR0) give information about the state of a task which is useful to the operating system. The Nested Task (NT) bit (bit 14 in EFLAGS) controls the function of the IRET instruction. If NT = 0, the IRET instruction performs the regular return; when NT = 1, IRET performs a task switch operation back to the previous task. The NT bit is set or reset in the following fashion:

When a CALL or INT instruction initiates a task switch, the new TSS will be marked busy and the back link field of the new TSS set to the old TSS selector. The NT bit of the new task is set by CALL or INT initiated task switches. An interrupt that does not cause a task switch will clear NT. (The NT bit will be restored after execution of the interrupt handler.) NT may also be set or cleared by POPF or IRET instructions.

The Intel486 processor task state segment is marked busy by changing the descriptor type field from TYPE 9H to TYPE BH. An 80286 TSS is marked busy by changing the descriptor type field from TYPE 1 to TYPE 3. Use of a selector that references a busy task state segment causes an exception 13.

The Virtual Mode (VM) bit (bit 17) is used to indicate if a task is a virtual 8086 task. If VM = 1, then the task will use the Real Mode addressing mechanism. The virtual 8086 environment is only entered and exited via a task switch. (See section 6.5, "Virtual 8086 Environment.")

The T bit in the Intel486 processor TSS indicates that the processor should generate a debug exception when switching to a task. If T = 1, then upon entry to a new task, a debug exception 1 will be generated.

6.3.6.1 Floating Point Task Switching

The FPU's state is not automatically saved when a task switch occurs, because the incoming task may not use the FPU. The Task Switched (TS) bit (bit 3 in CR0) helps deal with the FPU's state in a multitasking environment. Whenever the Intel OverDrive processors switch tasks, they set the TS bit.
The Intel OverDrive processors detect the first use of a processor extension instruction after a task switch and cause the processor extension not available exception 7. The exception handler for exception 7 may then decide whether to save the state of the FPU. A processor extension not present exception (7) will occur when attempting to execute a Floating Point or WAIT instruction if the Task Switched and Monitor coprocessor extension bits are both set (i.e., TS = 1 and MP = 1).

6.3.7 INITIALIZATION AND TRANSITION TO PROTECTED MODE

Because the Intel486 processor begins executing in Real Mode immediately after RESET, it is necessary to initialize the system tables and registers with the appropriate values.

The GDT and IDT registers must refer to a valid GDT and IDT. The IDT should be at least 256 bytes long, and the GDT must contain descriptors for the initial code and data segments. Figure 6-17 shows the tables and Figure 6-18 the descriptors needed for a simple Protected Mode Intel486 processor system. It has a single code segment and a single data/stack segment, each four Gbytes long, and a single privilege level PL = 0.

The actual method of enabling Protected Mode is to load CR0 with the PE bit set, via the MOV CR0, R/M instruction. This puts the Intel486 processor in Protected Mode.

After enabling Protected Mode, the next instruction should execute an intersegment JMP to load the CS register and flush the instruction decode queue. The final step is to load all of the data segment registers with the initial selector values.

[Figure 6-17. Simple Protected System (242202-47): memory map showing the IDT (32 interrupt descriptors) starting at 00000000H, the GDT occupying 00000100H through 00000118H (entries at 00000100H, 00000108H, and 00000110H), and the initialization routines at the top of the address space near FFFFFFF0H.]

An alternate approach to entering Protected Mode, which is especially appropriate for multitasking operating systems, is to use the built-in task switch to load all of the registers. In this case the GDT would contain two TSS descriptors in addition to the code and data descriptors needed for the first task. The first JMP instruction in Protected Mode would jump to the TSS, causing a task switch and loading all of the registers with the values stored in the TSS. Because a task switch saves the state of the current task in a task state segment, the Task State Segment Register should be initialized to point to a valid TSS descriptor.

[Figure 6-18. GDT Descriptors for Simple System (242202-48): the GDT holds, from the lowest address upward, the null descriptor, the code descriptor, and the data descriptor. Both the code and data descriptors show Base 31...24 = 00(H), G = 1, D = 1, Limit 19...16 = F(H), Base 23...16 = 00(H), Segment Base 15...0 = 0118(H), and Segment Limit 15...0 = FFFF(H).]

6.4 Paging

6.4.1 PAGING CONCEPTS

Paging is another type of memory management useful for virtual memory multitasking operating systems. Unlike segmentation, which modularizes programs and data into variable length segments, paging divides programs into multiple uniform size pages. Pages bear no direct relation to the logical structure of a program. While segment selectors can be considered the logical "name" of a program module or data structure, a page most likely corresponds to only a portion of a module or data structure.

By taking advantage of the locality of reference displayed by most programs, only a small number of pages from each active task need be in memory at any one moment.
6.4.2 PAGING ORGANIZATION

6.4.2.1 Page Mechanism

The Intel486 processor uses two levels of tables to translate the linear address (from the segmentation unit) into a physical address. There are three components to the paging mechanism of the Intel486 processor: the page directory, the page tables, and the page itself (page frame). All memory-resident elements of the Intel486 processor paging mechanism are the same size, namely, 4 Kbytes. A uniform size for all of the elements simplifies memory allocation and reallocation schemes, because there is no problem with memory fragmentation. Figure 6-19 shows how the paging mechanism works.

6.4.2.2 Page Descriptor Base Register

CR2 is the Page Fault Linear Address register. It holds the 32-bit linear address which caused the last page fault detected.

CR3 is the Page Directory Physical Base Address Register. It contains the physical starting address of the Page Directory. The lower 12 bits of CR3 are always zero to ensure that the Page Directory is always page aligned. Loading it via a MOV CR3, reg instruction causes the Page Table Entry cache to be flushed, as will a task switch through a TSS that changes the value of CR3. (See section 6.4.5, "Translation Lookaside Buffer.")

6.4.2.3 Page Directory

The Page Directory is 4 Kbytes long and allows up to 1024 Page Directory Entries. Each Page Directory Entry contains the address of the next level of tables, the Page Tables, and information about the page table. The contents of a Page Directory Entry are shown in Figure 6-20. The upper 10 bits of the linear address (A22-A31) are used as an index to select the correct Page Directory Entry.

6.4.2.4 Page Tables

Each Page Table is 4 Kbytes and holds up to 1024 Page Table Entries. Page Table Entries contain the starting address of the page frame and statistical information about the page. (See Figure 6-21.) Address bits A12-A21 are used as an index to select one of the 1024 Page Table Entries.
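The index selection described in sections 6.4.2.3 and 6.4.2.4 amounts to slicing the 32-bit linear address into a 10-bit directory index (A31-A22), a 10-bit table index (A21-A12), and a 12-bit offset within the 4-Kbyte page. The C sketch below shows only that arithmetic; the type and function names are hypothetical, and the actual table walk (reading the Page Directory Entry and Page Table Entry from memory and checking their attribute bits) is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three parts of a linear address. */
typedef struct {
    unsigned dir_index;   /* A31..A22: selects 1 of 1024 Page Directory Entries */
    unsigned table_index; /* A21..A12: selects 1 of 1024 Page Table Entries */
    unsigned offset;      /* A11..A0:  byte offset within the 4-Kbyte page */
} linear_parts;

/* Split a linear address into directory index, table index, and offset. */
static linear_parts split_linear(uint32_t linear)
{
    linear_parts p;
    p.dir_index   = (unsigned)(linear >> 22);
    p.table_index = (unsigned)((linear >> 12) & 0x3FFu);
    p.offset      = (unsigned)(linear & 0xFFFu);
    return p;
}

/* Combine a 4-Kbyte-aligned page frame address with the low 12 bits
   of the linear address to form the physical address. */
static uint32_t physical_address(uint32_t frame_base, uint32_t linear)
{
    assert((frame_base & 0xFFFu) == 0);
    return frame_base | (linear & 0xFFFu);
}
```

As a worked case, linear address 00403025H splits into directory index 1, table index 3, and offset 025H; if the selected Page Table Entry pointed at frame 12345000H, the physical address would be 12345025H.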
The 20 upper-bit page frame address is concatenated with the lower 12 bits of the linear address to form the physical address. Page tables can be shared between tasks and swapped to disks.

[Figure: two-level paging scheme. The linear address is divided into Directory (bits 31-22), Table (bits 21-12), and Offset (bits 11-0) fields; CR3 supplies the page directory base.]

0) causes an exception 13 fault:

LIDT; LGDT; LMSW; CLTS; HLT; MOV DRn,reg; MOV reg,DRn; MOV TRn,reg; MOV reg,TRn; MOV CRn,reg; MOV reg,CRn.

Several instructions, particularly those applying to the multitasking model and protection model, are available only in Protected Mode. Therefore, attempting to execute the following instructions in Real Mode or in Virtual 8086 Mode generates an exception 6 fault:

LTR; STR; LLDT; SLDT; LAR; LSL; VERR; VERW; ARPL.

The instructions that are IOPL-sensitive in Protected Mode are:

IN; OUT; INS; OUTS; REP INS; REP OUTS; STI; CLI.

In Virtual 8086 Mode, a slightly different set of instructions is made IOPL-sensitive. The following instructions are IOPL-sensitive in Virtual 8086 Mode:

INT n; PUSHF; POPF; STI; CLI; IRET.

The PUSHF, POPF, and IRET instructions are IOPL-sensitive in Virtual 8086 Mode only. This provision allows the IF flag (interrupt enable flag) to be virtualized to the Virtual 8086 Mode program. The INT n software interrupt instruction is also IOPL-sensitive in Virtual 8086 Mode. Note, however, that the INT 3 (opcode 0CCH), INTO, and BOUND instructions are not IOPL-sensitive in Virtual 8086 Mode (they aren't IOPL-sensitive in Protected Mode either).

Note that the I/O instructions (IN, OUT, INS, OUTS, REP INS, and REP OUTS) are not IOPL-sensitive in Virtual 8086 Mode. Rather, the I/O instructions become automatically sensitive to the I/O Permission Bitmap contained in the Intel486 processor Task State Segment.
The I/O Permission Bitmap, automatically used by the Intel486 processor in Virtual 8086 Mode, is illustrated by Figure 6-14 and Figure 6-15. The I/O Permission Bitmap can be viewed as a 0-64 Kbit string, which begins in memory at offset Bit_Map_Offset in the current TSS. Bit_Map_Offset must be ≤ DFFFH so the entire bit map and the byte FFH which follows the bit map are all at offsets ≤ FFFFH from the TSS base. The 16-bit pointer Bit_Map_Offset (15:0) is found in the word beginning at offset 66H (102 decimal) from the TSS base, as shown in Figure 6-14.

Each bit in the I/O Permission Bitmap corresponds to a single byte-wide I/O port, as illustrated in Figure 6-14. If a bit is 0, I/O to the corresponding byte-wide port can occur without generating an exception. Otherwise the I/O instruction causes an exception 13 fault. Because every byte-wide I/O port must be protectable, all bits corresponding to a word-wide or dword-wide port must be 0 for the word-wide or dword-wide I/O to be permitted. If all the referenced bits are 0, the I/O will be allowed. If any referenced bits are 1, the attempted I/O will cause an exception 13 fault.

Due to the use of a pointer to the base of the I/O Permission Bitmap, the bitmap may be located anywhere within the TSS, or may be ignored completely by pointing the Bit_Map_Offset (15:0) beyond the limit of the TSS segment. In the same manner, only a small portion of the 64K I/O space need have an associated map bit, by adjusting the TSS limit to truncate the bitmap. This eliminates the commitment of 8K of memory when a complete bitmap is not required, while allowing the fully general case if desired.

Example of Bitmap for I/O Ports 0-255: Setting the TSS limit to {Bit_Map_Offset + 31 + 1*} [* see note below] will allow a 32-byte bitmap for the I/O ports #0-255, plus a terminator byte of all 1's [* see note below].
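The permission check described above can be sketched in C as follows. This is a simplified software model under stated assumptions, not the processor's actual implementation: the function and parameter names are hypothetical, port n is assumed to map to bit (n mod 8) of bitmap byte (n / 8) with the least significant bit first (Figure 6-14 is not reproduced here), and any bit lying beyond the TSS limit is treated as protected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TSS_BITMAP_PTR_OFFSET 0x66u  /* word holding Bit_Map_Offset (15:0) */

/* Returns true if the access is permitted, false if it would raise
 * exception 13. `tss` points at the TSS base and `tss_limit` is the offset
 * of the last byte inside the TSS segment (both hypothetical parameters). */
static bool io_access_permitted(const uint8_t *tss, uint32_t tss_limit,
                                uint16_t port, unsigned width)
{
    assert(width == 1 || width == 2 || width == 4);
    assert(tss_limit >= TSS_BITMAP_PTR_OFFSET + 1u);

    /* Bit_Map_Offset is stored as a little-endian 16-bit word. */
    uint32_t map_offset = (uint32_t)tss[TSS_BITMAP_PTR_OFFSET] |
                          ((uint32_t)tss[TSS_BITMAP_PTR_OFFSET + 1u] << 8);

    /* A word or dword access must find every covered byte-wide port clear. */
    for (unsigned i = 0; i < width; i++) {
        uint32_t p = (uint32_t)port + i;
        uint32_t byte_offset = map_offset + p / 8u;

        if (byte_offset > tss_limit)               /* no map bit: protected */
            return false;
        if (tss[byte_offset] & (1u << (p % 8u)))   /* bit is 1: protected */
            return false;
    }
    return true;
}
```

With Bit_Map_Offset = 68H and a TSS limit of 88H (that is, Bit_Map_Offset + 31 + 1), the sketch permits byte I/O to any port in 0-255 whose bit is 0 and rejects every port from 256 upward, either because it reads the terminator byte of all 1's or because the bit lies past the limit.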
With this TSS limit, the I/O bitmap controls I/O permission to I/O ports 0-255 while causing an exception 13 fault on attempted I/O to any I/O port from 256 through 65,535.

*IMPORTANT IMPLEMENTATION NOTE: Beyond the last byte of I/O mapping information in the I/O Permission Bitmap must be a byte containing all 1's. The byte of all 1's must be within the limit of the Intel486 processor TSS segment (see Figure 6-14).

6.5.5 INTERRUPT HANDLING

In order to fully support the emulation of an 8086 machine, interrupts in Virtual 8086 Mode are handled in a unique fashion. When running in Virtual Mode all interrupts and exceptions involve a privilege change back to the host Intel486 processor operating system. The Intel486 processor operating system determines if the interrupt comes from a Protected Mode application or from a Virtual Mode program by examining the VM bit in the EFLAGS image stored on the stack.

When a Virtual Mode program is interrupted and execution passes to the interrupt routine at level 0, the VM bit is cleared. However, the VM bit is still set in the EFLAGS image on the stack.

The Intel486 processor operating system in turn handles the exception or interrupt and then returns control to the 8086 program. The Intel486 processor operating system may choose to let the 8086 operating system handle the interrupt or it may emulate the function of the interrupt handler. For example, many 8086 operating system calls are accessed by PUSHing parameters on the stack, and then executing an INT n instruction. If the IOPL is set to 0 then all INT n instructions will be intercepted by the Intel486 processor operating system. The Intel486 processor operating system could emulate the 8086 operating system's call. Figure 6-25 shows how the Intel486 processor operating system could intercept an 8086 operating system's call to "Open a File."
An Intel486 processor operating system can provide a Virtual 8086 Environment which is totally transparent to the application software by intercepting and then emulating 8086 operating system calls, and intercepting IN and OUT instructions.

6.5.6 ENTERING AND LEAVING VIRTUAL 8086 MODE

Virtual 8086 mode is entered by executing an IRET instruction (at CPL=0), or Task Switch (at any CPL) to an Intel486 processor task whose Intel486 processor TSS has a FLAGS image containing a 1 in the VM bit position while the Intel486 processor is executing in Protected Mode. That is, one way to enter Virtual 8086 mode is to switch to a task with an Intel486 processor TSS that has a 1 in the VM bit in the EFLAGS image. The other way is to execute a 32-bit IRET instruction at privilege level 0, where the stack has a 1 in the VM bit in the EFLAGS image.

POPF does not affect the VM bit, even if the Intel486 processor is in Protected Mode or level 0, and so cannot be used to enter Virtual 8086 Mode. PUSHF always pushes a 0 in the VM bit, even if the Intel486 processor is in Virtual 8086 Mode, so that a program cannot tell if it is executing in REAL mode or in Virtual 8086 mode.

The VM bit can be set by executing an IRET instruction only at privilege level 0, or by any instruction or interrupt which causes a task switch in Protected Mode (with VM = 1 in the new FLAGS image), and can be cleared only by an interrupt or exception in Virtual 8086 Mode. IRET and POPF instructions executed in REAL mode or Virtual 8086 mode will not change the value in the VM bit.

The transition out of virtual 8086 mode to Intel486 processor protected mode occurs only on receipt of an interrupt or exception (such as due to a sensitive instruction). In Virtual 8086 mode, all interrupts and exceptions vector through the protected mode IDT, and enter an interrupt handler in protected Intel486 processor mode. That is, as part of interrupt processing, the VM bit is cleared.
Because the matching IRET must occur from level 0, if an Interrupt or Trap Gate is used to field an interrupt or exception out of Virtual 8086 mode, the Gate must perform an inter-level interrupt only to level 0. Interrupt or Trap Gates through conforming segments, or through segments with DPL > 0, will raise a GP fault with the CS selector as the error code.

6.5.6.1 Task Switches To and From Virtual 8086 Mode

Tasks which can execute in virtual 8086 mode must be described by a TSS with the new Intel486 processor format (TYPE 9 or 11 descriptor).

A task switch out of virtual 8086 mode will operate exactly the same as any other task switch out of a task with an Intel486 processor TSS. All of the programmer visible state, including the FLAGS register with the VM bit set to 1, is stored in the TSS. The segment registers in the TSS will contain 8086 segment base values rather than selectors.

[Figure 6-25. Virtual 8086 Environment Interrupt and Call Handling]
1. 8086 Application makes "Open File Call," causes General Protection Fault.
2. Virtual 8086 Monitor intercepts call. Calls Intel486 processor OS.
3. Intel486 processor OS opens files. Returns control to 8086 OS.
4. 8086 OS returns control to Application. (Transparent to Application.)

A task switch into a task described by an Intel486 processor TSS will have an additional check to determine if the incoming task should be resumed in virtual 8086 mode. Tasks described by 80286 format TSSs cannot be resumed in virtual 8086 mode, so no check is required there (the FLAGS image in an 80286 format TSS has only the low order 16 FLAGS bits). Before loading the segment register images from an Intel486 processor TSS, the FLAGS image is loaded, so that the segment registers are loaded from the TSS image as 8086 segment base values. The task is now ready to resume in virtual 8086 execution mode.
6.5.6.2 Transitions Through Trap and Interrupt Gates, and IRET

A task switch is one way to enter or exit virtual 8086 mode. The other method is to exit through a Trap or Interrupt gate, as part of handling an interrupt, and to enter as part of executing an IRET instruction. The transition out must use an Intel486 processor Interrupt Gate (Type 14) or Intel486 processor Trap Gate (Type 15), which must point to a non-conforming level 0 segment (DPL = 0) in order to permit the trap handler to IRET back to the Virtual 8086 program. The Gate must point to a non-conforming level 0 segment to perform a level switch to level 0 so that the matching IRET can change the VM bit. Intel486 processor gates must be used, because 80286 gates save only the low 16 bits of the FLAGS register, so that the VM bit will not be saved on transitions through the 80286 gates. Also, the 16-bit IRET (presumably) used to terminate the 80286 interrupt handler will pop only the lower 16 bits from FLAGS, and will not affect the VM bit. The action taken for an Intel486 processor Trap or Interrupt gate if an interrupt occurs while the task is executing in virtual 8086 mode is given by the following sequence.

1. Save the FLAGS register in a temp to push later. Turn off the VM and TF bits, and if the interrupt is serviced by an Interrupt Gate, turn off IF also.
2. Interrupt and Trap gates must perform a level switch from 3 (where the VM86 program executes) to level 0 (so IRET can return). This process involves a stack switch to the stack given in the TSS for privilege level 0. Save the Virtual 8086 Mode SS and ESP registers to push in a later step. The segment register load of SS will be done as a Protected Mode segment load, because the VM bit was turned off above.
3. Push the 8086 segment register values onto the new stack, in the order: GS, FS, DS, ES. These are pushed as 32-bit quantities, with undefined values in the upper 16 bits.
Then load these 4 registers with null selectors (0).
4. Push the old 8086 stack pointer onto the new stack by pushing the SS register (as 32 bits, high bits undefined), then pushing the 32-bit ESP register saved above.
5. Push the 32-bit FLAGS register saved in step 1.
6. Push the old 8086 instruction pointer onto the new stack by pushing the CS register (as 32 bits, high bits undefined), then pushing the 32-bit EIP register.
7. Load up the new CS:EIP value from the interrupt gate, and begin execution of the interrupt routine in protected Intel486 processor mode.

The transition out of virtual 8086 mode performs a level change and stack switch, in addition to changing back to protected mode. In addition, all of the 8086 segment register images are stored on the stack (behind the SS:ESP image), and then loaded with null (0) selectors before entering the interrupt handler. This will permit the handler to safely save and restore the DS, ES, FS, and GS registers as 80286 selectors. This is needed so that interrupt handlers which don't care about the mode of the interrupted program can use the same prolog and epilog code for state saving (i.e., push all registers in prolog, pop all in epilog) regardless of whether a "native" mode or Virtual 8086 mode program was interrupted. Restoring null selectors to these registers before executing the IRET will not cause a trap in the interrupt handler. Interrupt routines which expect values in the segment registers, or return values in segment registers, will have to obtain/return values from the 8086 register images pushed onto the new stack. They will need to know the mode of the interrupted program in order to know where to find/return segment registers, and also to know how to interpret segment register values.

The IRET instruction will perform the inverse of the above sequence. Only the extended Intel486 processor IRET instruction (operand size=32) can be used, and must be executed at level 0 to change the VM bit to 1.
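The level 0 stack image built by the interrupt sequence above (GS, FS, DS, ES pushed first, then SS, ESP, FLAGS, CS, and EIP) can be pictured as a C structure, lowest address first. This is a sketch under assumptions rather than a definition from this databook: the structure and function names are hypothetical, any error code pushed by an exception is ignored, and the VM flag is taken to be bit 17 of EFLAGS, a position this section does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: VM is bit 17 of EFLAGS (not stated in this section). */
#define EFLAGS_VM (UINT32_C(1) << 17)

/* Stack image seen by a level 0 handler entered from Virtual 8086 mode.
 * Field order is the reverse of the push order, because the stack grows
 * toward lower addresses. */
struct v86_interrupt_frame {
    uint32_t eip;     /* pushed last, so found at SS:[ESP] */
    uint32_t cs;      /* upper 16 bits undefined */
    uint32_t eflags;  /* FLAGS image saved before VM was cleared */
    uint32_t esp;     /* 8086 program's stack pointer */
    uint32_t ss;      /* upper 16 bits undefined */
    uint32_t es;      /* segment images: upper 16 bits undefined */
    uint32_t ds;
    uint32_t fs;
    uint32_t gs;      /* pushed first, so at the highest address */
};

static_assert(offsetof(struct v86_interrupt_frame, eflags) == 8,
              "FLAGS image is read from SS:8[ESP]");
static_assert(sizeof(struct v86_interrupt_frame) == 36,
              "nine 32-bit stack slots");

/* Nonzero if the interrupted program was running in Virtual 8086 mode. */
static inline int interrupted_in_v86(const struct v86_interrupt_frame *f)
{
    return (f->eflags & EFLAGS_VM) != 0;
}

/* Only the low 16 bits of a pushed segment slot are meaningful. */
static inline uint16_t segment_value(uint32_t slot)
{
    return (uint16_t)(slot & 0xFFFFu);
}
```

A handler that needs the interrupted program's DS, for example, would read it from the frame with segment_value(frame->ds) rather than from the DS register, which holds a null selector on entry.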
The IRET sequence is:

1. If the NT bit in the FLAGS register is on, an inter-task return is performed. The current state is stored in the current TSS, and the link field in the current TSS is used to locate the TSS for the interrupted task which is to be resumed. Otherwise, continue with the following sequence.
2. Read the FLAGS image from SS:8[ESP] into the FLAGS register. This will set VM to the value active in the interrupted routine.
3. Pop off the instruction pointer CS:EIP. EIP is popped first, then a 32-bit word is popped which contains the CS value in the lower 16 bits. If VM = 0, this CS load is done as a protected mode segment load. If VM = 1, this will be done as an 8086 segment load.
4. Increment the ESP register by 4 to bypass the FLAGS image which was "popped" in step 2.
5. If VM = 1, load segment registers ES, DS, FS, and GS from memory locations SS:[ESP+8], SS:[ESP+12], SS:[ESP+16], and SS:[ESP+20], respectively, where the new value of ESP stored in step 4 is used. Because VM = 1, these are done as 8086 segment register loads. Else if VM = 0, check that the selectors in ES, DS, FS, and GS are valid in the interrupted routine. Null out invalid selectors to trap if an attempt is made to access through them.
6. If (RPL(CS) > CPL), pop the stack pointer SS:ESP from the stack. The ESP register is popped first, followed by 32 bits containing SS in the lower 16 bits. If VM = 0, SS is loaded as a protected mode segment register load. If VM = 1, an 8086 segment register load is used.
7. Resume execution of the interrupted routine. The VM bit in the FLAGS register (restored from the interrupt routine's stack image in step 2) determines whether the Intel486 processor resumes the interrupted routine in Protected mode or Virtual 8086 mode.

7.0 ON-CHIP CACHE

7.1 Cache Organization

All members of the Intel486 processor family, except the IntelDX4 processor, contain an on-chip 8-Kbyte cache.
(See section 7.1.2, "IntelDX4 Processor On-Chip Cache," for the IntelDX4 processor cache organization.) The cache is software transparent to maintain binary compatibility with previous generations of the Intel Architecture.

The on-chip cache is a unified code and data cache. The cache is used for both instruction and data accesses and acts on physical addresses. (See section 7.1.2 for IntelDX4 processor details.)

The on-chip cache has been designed for maximum flexibility and performance. The cache has several operating modes offering flexibility during program execution and debugging. Memory areas can be defined as non-cacheable by software and external hardware. Protocols for cache line invalidations and replacement are implemented in hardware, easing system design.

The cache organization is 4-way set associative and each line is 16 bytes wide. The eight Kbytes of cache memory are logically organized as 128 sets, each containing four lines.

The cache memory is physically split into four 2-Kbyte blocks, each containing 128 lines. (See Figure 7-1.) There are 128 21-bit tags associated with each 2-Kbyte block. There is a valid bit for each line in the cache. Each line in the cache is either valid or not valid. There are no provisions for partially valid lines.

[Figure 7-1. On-Chip Cache Physical Organization. IntelDX4 processor only: 20-bit tags, 16-byte line size, 256 sets. All other Intel486 processors: 21-bit tags, 16-byte line size, 128 sets. Each set has 3 LRU bits and 4 valid bits.]

For all Intel486 processors except the Write-Back Enhanced IntelDX2, the on-chip cache is write-through only. All writes will drive an external write bus cycle in addition to writing the information to the internal cache if the write was a cache hit.
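The set, line, and tag figures given above for the cache organization fit together arithmetically, which the following C sketch illustrates. It assumes the conventional split of a 32-bit physical address (line offset in the low bits, set index next, tag in the remaining upper bits); this section does not spell out that mapping, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 16u  /* each line is 16 bytes wide */
#define CACHE_WAYS        4u  /* 4-way set associative */

/* `sets` is 128 for the 8-Kbyte cache; 256 models the IntelDX4 processor
 * cache of section 7.1.2. */
static inline uint32_t cache_bytes(uint32_t sets)
{
    return sets * CACHE_WAYS * CACHE_LINE_BYTES;
}

/* Byte position within the 16-byte line. */
static inline uint32_t line_offset(uint32_t physical)
{
    return physical % CACHE_LINE_BYTES;
}

/* Assumed mapping: the address bits just above the line offset pick the set. */
static inline uint32_t set_index(uint32_t physical, uint32_t sets)
{
    assert(sets == 128u || sets == 256u);
    return (physical / CACHE_LINE_BYTES) % sets;
}

/* Whatever remains above the set index is the tag. */
static inline uint32_t tag_of(uint32_t physical, uint32_t sets)
{
    assert(sets == 128u || sets == 256u);
    return physical / (CACHE_LINE_BYTES * sets);
}
```

Under this assumption the tag widths quoted in the text follow directly: 32 address bits less 4 offset bits and 7 set-index bits leaves 21 tag bits for 128 sets, and 32 - 4 - 8 = 20 tag bits for 256 sets.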
A write to an address not contained in the internal cache will only be written to external memory. Cache allocations are not made on write misses.

The Write-Back Enhanced IntelDX2 processor supports two modes of operation with respect to internal cache configurations: Standard Bus Mode (write-through cache) and Enhanced Bus Mode (write-back cache). Standard Bus Mode operation for the Write-Back Enhanced IntelDX2 is the same as the write-through cache for all other Intel486 processors. (See section 7.1.1, "Write-Back Enhanced IntelDX2 Processor Cache" and other write-back enhanced sections below for write-back cache information.)

7.1.1 WRITE-BACK ENHANCED INTELDX2 PROCESSOR CACHE

The Write-Back Enhanced IntelDX2 processor implements a unified cache, with a total cache size of 8 Kbytes. The processor's on-chip cache supports a modified MESI (modified/exclusive/shared/invalid) write-back cache consistency protocol. The Write-Back Enhanced IntelDX2 processor internal cache is configurable as write-back or write-through on a line by line basis, provided the cache is enabled for write-back operation. The cache is enabled for write-back operation by driving the WB/WT# pin to a high state for at least two clocks before and two clocks after the falling edge of RESET. Cache write-back and invalidations can be initiated by hardware or software. Protocols for cache consistency and line replacement are implemented in hardware to ease system design.

Once the cache configuration is selected, the Write-Back Enhanced IntelDX2 processor will continue to operate in the selected configuration and can only be changed to a different configuration by starting the RESET process again. Assertion of SRESET will not change the operating mode of the processor. WB/WT# has an internal pull-down; if WB/WT# is unconnected, the processor will be in Standard Bus Mode, i.e., the on-chip cache is write-through.
Table 7-1 lists the two modes of operation and the differences between the two modes.

Unless specifically noted, the following sections apply to the Write-Back Enhanced IntelDX2 in Standard Bus Mode (Write-Through Cache) and all other Intel486 processors.

Table 7-1. Write-Back Enhanced IntelDX2 Processor WB/WT# Initialization

State of WB/WT# at Falling Edge of RESET: WB/WT# = LOW
Effect on Write-Back Enhanced IntelDX2 Processor Operation: Processor is in Standard Bus Mode (Write-Through Cache)
1. When FLUSH# is asserted, the internal cache will be invalidated in one system CLK.
2. No Special FLUSH# Acknowledge Cycles appear on the bus after the assertion of the FLUSH# pin.
3. All write-back specific inputs are ignored (INV, WB/WT#).
4. SRESET does not clear the SMBASE register. It behaves much like a RESET (invalidating the on-chip cache and resetting the CR0 register, for example). SRESET is NOT an interrupt.

State of WB/WT# at Falling Edge of RESET: WB/WT# = HIGH
Effect on Write-Back Enhanced IntelDX2 Processor Operation: Processor is in Enhanced Bus Mode (Write-Back Cache)
1. Write backs will be performed when a cache flush is requested (via the FLUSH# pin or the WBINVD instruction). The system must watch for the FLUSH# special cycles to determine the end of the flush.
2. The special FLUSH# Acknowledge Cycles will appear on the bus after the assertion of FLUSH# and after all the cache write backs (if any) are completed on the bus.
3. WB/WT# is sampled on a line by line basis to determine the state of a line to be allocated in the cache (as Write Through (S state) or as Write Back (E state)).
4. The WB/WT# and INV inputs are no longer ignored. HITM# and CACHE# will be driven during appropriate bus cycles.
5. PLOCK# is always driven inactive.
6. SRESET is an interrupt. SRESET does not reset the SMBASE register or flush the on-chip cache. The CR0 register gets the same values as after RESET with the exception of the CD and NW bits. These two bits retain their previous status. (See section 9.2.18.4, "Soft Reset (SRESET)" and Table 3-7 for details on SRESET for write-back enhanced mode.)

7.1.2 INTELDX4 PROCESSOR CACHE

The IntelDX4 processor contains a 16-Kbyte write-through cache. The 16 Kbytes of cache memory are logically organized as 256 sets, each containing four lines. The cache memory is physically split into four 4-Kbyte blocks, each containing 256 lines. (See Figure 7-1.) There are 256 20-bit tags associated with each 4-Kbyte block. All other details listed in section 7.1 for the 8-Kbyte on-chip cache also apply to the IntelDX4 on-chip cache.

7.2 Cache Control

Control of the cache is provided by the CD and NW bits in CR0. CD enables and disables the cache. NW controls memory write-through and invalidates.

The CD and NW bits define four operating modes of the on-chip cache as given in Table 7-2. These modes provide flexibility in how the on-chip cache is used.

Table 7-2. Cache Operating Modes

CD  NW  Operating Mode
1   1   Cache fills disabled, write-through and invalidates disabled
1   0   Cache fills disabled, write-through and invalidates enabled
0   1   INVALID. If CR0 is loaded with this configuration of bits, a GP fault with error code of 0 is raised.
0   0   Cache fills enabled, write-through and invalidates enabled

CD=1, NW=1
The cache is completely disabled by setting CD=1 and NW=1 and then flushing the cache. This mode may be useful for debugging programs where it is important to see all memory cycles at the pins. Writes that hit in the cache will not appear on the external bus.

It is possible to use the on-chip cache as fast static RAM by "pre-loading" certain memory areas into the cache and then setting CD=1 and NW=1. Pre-loading can be done by careful choice of memory references with the cache turned on or by use of the testability functions. (See section 11.2, "On-Chip Cache Testing.") When the cache is turned off, the memory mapped by the cache is "frozen" into the cache because fills and invalidates are disabled.

Completely disabling the cache is a two-step process. First, CD and NW must be set to 1, and then the cache must be flushed. If the cache is not flushed, cache hits on reads will still occur and data will be read from the cache.

CD=1, NW=0
Cache fills are disabled but write-throughs and invalidates are enabled. This mode is the same as if the KEN# pin was strapped HIGH, disabling cache fills. Write-throughs and invalidates may still occur to keep the cache valid. This mode is useful if the software must disable the cache for a short period of time, and then re-enable it without flushing the original contents.

CD=0, NW=1
Invalid. If CR0 is loaded with this bit configuration, a General Protection fault with error code of 0 will occur.

CD=0, NW=0
This is the normal operating mode.

7.2.1 WRITE-BACK ENHANCED INTELDX2 PROCESSOR CACHE CONTROL AND OPERATING MODES

The Write-Back Enhanced IntelDX2 processor retains the usage of CR0.CD and CR0.NW, in which the 1,1 state forces a cache-off condition after RESET, and the 0,0 state is the normal run state. Table 7-3 defines these control bits when the cache is enabled for write-back operation. Table 7-3 is also valid when the cache is in write-back mode and some lines are in a write-through state.

CD=1, NW=1
The 1,1 state is best used when no lines are allocated, which occurs naturally after RESET (but not SRESET), but must be forced (e.g., by instruction WBINVD) if entered during normal operation. In these cases, the Write-Back Enhanced IntelDX2 processor will operate as if it had no cache at all. If the 1,1 state is exited, lines that are allocated as write back will be written back upon a snoop hit or replacement cycle.
Lines that were allocated as write-through (and later modified while in the 1,1 state) will never appear on the bus.

CD=1, NW=0
The only difference from the normal 0,0 "run" state is that new line fills (and the line replacements that result from capacity limitations) do not occur. This causes the contents of the cache to be locked in, unless lines are invalidated using snoops.

Table 7-3. Write-Back Enhanced IntelDX2 Processor Write-Back Cache Operating Modes

CR0 CD,NW                            READ HIT    READ MISS            WRITE HIT(1)                    WRITE MISS  Snoops
1,1 (state after reset)              read cache  read bus (no fill)   write cache (no write-through)  write bus   not accepted
1,0                                  read cache  read bus (no fill)   write cache, write bus if S     write bus   normal operation
0,1                                  This is a fault-protected disallowed state. A GP(0) will occur if an attempt is made to load CR0 with this state.
0,0 (state DURING normal operation)  read cache  read bus, line fill  write cache, write bus if S     write bus   normal operation

NOTE:
1. Normal MESI state transitions occur on write hits in all legal states.

7.3 Cache Line Fills

Any area of memory can be cached in the Intel486 processor. Non-cacheable portions of memory can be defined by the external system or by software. The external system can inform the Intel486 processor that a memory address is non-cacheable by returning the KEN# pin inactive during a memory access. (Refer to section 10.2.3, "Cacheable Cycles.") Software can prevent certain pages from being cached by setting the PCD bit in the page table entry.

A read request can be generated from program operation or by an instruction pre-fetch. The data will be supplied from the on-chip cache if a cache hit occurs on the read address. If the address is not in the cache, a read request for the data is generated on the external bus.

If the read request is to a cacheable portion of memory, the Intel486 processor initiates a cache line fill. During a line fill a 16-byte line is read into the Intel486 processor.

Cache line fills will only be generated for read misses. Write misses will never cause a line in the internal cache to be allocated. If a cache hit occurs on a write, the line will be updated. Cache line fills can be performed over 8- and 16-bit buses using the dynamic bus sizing feature. Refer to section 10.1.2, "Dynamic Data Bus Sizing" for a description of dynamic bus sizing and section 10.2.3, "Cacheable Cycles" for further information on cacheable cycles.

7.4 Cache Line Invalidations

The Intel486 processor contains both a hardware and a software mechanism for invalidating lines in its internal cache. Cache line invalidations are needed to keep the Intel486 processor cache contents consistent with external memory.

The Write-Back Enhanced IntelDX2 processor has control mechanisms (including snooping) for writing back the modified write-back lines and invalidating the cache. There are special bus cycles associated with write-backs and invalidation. All of the Write-Back Enhanced IntelDX2 processor special cycles require acknowledgment by RDY# or BRDY#. During the special cycles, the addresses shown in Table 7-4 are driven onto the address bus and the data bus is left undefined.

Refer to section 10.2.8, "Invalidate Cycles" for further information on cache line invalidations.

Table 7-4. Encoding of the Special Cycles for Write-Back Cache

Cycle Name               M/IO#  D/C#  W/R#  BE3#-BE0#  A4-A2
Write-Back*              0      0     1     0111       000
First Flush Ack Cycle*   0      0     1     0111       001
Flush*                   0      0     1     1101       000
Second Flush Ack Cycle*  0      0     1     1101       001
Shutdown                 0      0     1     1110       000
HALT                     0      0     1     1011       000
Stop Grant Ack Cycle     0      0     1     1011       100

* For the Write-Back Enhanced IntelDX2 processor only. FLUSH is present on all Intel486 processors, but differs for Standard Mode. Refer to appropriate sections.

7.4.1 WRITE-BACK ENHANCED INTELDX2 PROCESSOR SNOOP CYCLES AND WRITE-BACK MODE INVALIDATION

In Enhanced Bus Mode, the Write-Back Enhanced IntelDX2 processor performs invalidations differently than other Intel486 processors. Snoop cycles are initiated by the system to determine if a line is present in the cache, and what the state is. Snoop cycles may further be classified as Inquire cycles or Invalidate cycles. Inquire cycles are driven to the Write-Back Enhanced IntelDX2 processor when another bus master initiates a memory read cycle, to determine if the processor cache contains the latest data. If the snooped line is in the Write-Back Enhanced IntelDX2 processor cache and has the most recent information, the processor must schedule a write back of the data. Inquire cycles are driven with INV = "0". Invalidate cycles are driven to the Write-Back Enhanced IntelDX2 processor when the other bus master initiates a memory write cycle to determine if the Write-Back Enhanced IntelDX2 processor cache contains the snooped line. The Invalidate cycles are driven with INV = "1", so that if the snooped line is in the on-chip cache, the line is invalidated. Snoop cycles are described in detail in the "Bus Functional Description" section.

7.5 Cache Replacement

When a line needs to be placed in its internal cache the Intel486 processor first checks to see if there is a non-valid line in the set that can be replaced. If all four lines in the set are valid, a pseudo least-recently-used mechanism is used to determine which line should be replaced.

A valid bit is associated with each line in the cache. When a line needs to be placed in a set, the four valid bits are checked to see if there is a non-valid line that can be replaced. If a non-valid line is found, that line is marked for replacement.

The four lines in the set are labeled l0, l1, l2, and l3. The order in which the valid bits are checked during an invalidation is l0, l1, l2 and l3. All valid bits are cleared when the processor is reset or when the cache is flushed.

Replacement in the cache is handled by a pseudo least recently used (LRU) mechanism when all four lines in a set are valid. Three bits, B0, B1 and B2, are defined for each of the 128 sets in the cache. These bits are called the LRU bits. The LRU bits are updated for every hit or replace in the cache.

If the most recent access to the set was to l0 or l1, B0 is set to 1. B0 is set to 0 if the most recent access was to l2 or l3. If the most recent access to l0:l1 was to l0, B1 is set to 1, else B1 is set to 0. If the most recent access to l2:l3 was to l2, B2 is set to 1, else B2 is set to 0.

The pseudo LRU mechanism works in the following manner. When a line must be replaced, the cache will first select which of l0:l1 and l2:l3 was least recently used. Then the cache will determine which of the two lines was least recently used and mark it for replacement. This decision tree is shown in Figure 7-2.

[Figure 7-2. On-Chip Cache Replacement Strategy]
All four lines in the set valid?
  No: replace non-valid line.
  Yes: B0 = 0?
    Yes (l0 or l1 least recently used): B1 = 0? Yes: replace l0. No: replace l1.
    No (l2 or l3 least recently used): B2 = 0? Yes: replace l2. No: replace l3.

7.6 Page Cacheability

Two bits for cache control, PWT and PCD, are defined in the page table and page directory entries. The states of these bits are driven out on the PWT and PCD pins during memory access cycles.

The PWT bit controls the write policy for second level caches used with the Intel486 processor. Setting PWT = 1 defines a write-through policy for the current page while PWT = 0 defines the possibility of write-back. The state of PWT is ignored internally by the Intel486 processor for on-chip cache in write-through mode.

The PCD bit controls cacheability on a page by page basis.
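Returning briefly to the replacement policy of section 7.5: the valid-bit check and the three LRU bits can be modeled in a few lines of C. This is a software sketch of the rules as stated in that section, with hypothetical names, and is not a description of the actual hardware implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Per-set state: four valid bits plus the three LRU bits B0, B1, B2. */
struct cache_set_state {
    uint8_t valid;       /* bit i set means line l<i> is valid */
    uint8_t b0, b1, b2;
};

/* Update the LRU bits on every hit or replace, as described in section 7.5. */
static void record_access(struct cache_set_state *s, int line)
{
    assert(line >= 0 && line < 4);
    s->valid |= (uint8_t)(1u << line);
    if (line < 2) {
        s->b0 = 1;               /* most recent access was to l0 or l1 */
        s->b1 = (line == 0);     /* 1 if it was l0, else 0 */
    } else {
        s->b0 = 0;               /* most recent access was to l2 or l3 */
        s->b2 = (line == 2);     /* 1 if it was l2, else 0 */
    }
}

/* Pick the line to replace: first any non-valid line (checked in the order
 * l0, l1, l2, l3), otherwise follow the B0/B1/B2 decision tree. */
static int choose_victim(const struct cache_set_state *s)
{
    for (int line = 0; line < 4; line++)
        if (!(s->valid & (1u << line)))
            return line;
    if (s->b0 == 0)                       /* l0:l1 pair least recently used */
        return (s->b1 == 0) ? 0 : 1;
    return (s->b2 == 0) ? 2 : 3;          /* l2:l3 pair least recently used */
}
```

The "pseudo" in pseudo LRU shows up readily in this model: after accesses to l0, l1, l2, l3, and then l0 again, the true least recently used line is l1, yet the three bits select l2.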
The PCD bit is internally AND'ed with the KEN# signal to control cacheability on a cycle by cycle basis (see Figure 7-3). PCD = 0 enables caching while PCD = 1 forbids it. Note that cache fills are enabled when PCD = 0 AND KEN# = 0. This logical AND is implemented physically with a NOR gate.

The state of the PCD bit in the page table entry is driven on the PCD pin when a page in external memory is accessed. The state of the PCD pin informs the external system of the cacheability of the requested information. The external system then returns KEN# telling the Intel486 processor if the area is cacheable. The Intel486 processor initiates a cache line fill if PCD and KEN# indicate that the requested information is cacheable.

The PCD bit is OR'ed with the CD (cache disable) bit in control register 0 to determine the state of the PCD pin. If CD = 1, the Intel486 processor forces the PCD pin HIGH. If CD = 0, the PCD pin is driven with the value for the page table entry/directory. (See Figure 7-3.)

The PWT and PCD bits for a bus cycle are obtained from either CR3, the page directory or page table entry. These bits are assumed to be zero during real mode, whenever paging is disabled, or for cycles that bypass paging (I/O references, interrupt acknowledge and HALT cycles).

When paging is enabled, the bits from the page table entry are cached in the TLB, and are driven any time the page mapped by the TLB entry is referenced. For normal memory cycles, PWT and PCD are taken from the page table entry. During TLB refresh cycles where the page table and directory entries are read, the PWT and PCD bits must be obtained elsewhere. During page table updates the bits are obtained from the page directory. When the page directory is updated the bits are obtained from CR3. PCD and PWT bits are initialized to zero at reset, but can be modified by level 0 software.
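The combination of CD, PCD and KEN# described above can be written out as a small truth-function sketch. This is an illustrative model only: signal levels are represented as 0 and 1, the function names and the cycle labels ("memory", "io" and so on) are hypothetical, and the fill decision is assumed here to use the PCD pin value.

```python
def pcd_pin(cr0_cd, page_pcd):
    """Level driven on the PCD pin: the page-level PCD bit OR'ed with CR0.CD."""
    return 1 if (cr0_cd or page_pcd) else 0


def line_fill_enabled(pcd, ken_n):
    """A cache fill is allowed only when PCD = 0 and KEN# = 0.

    Both signals are active low, so the AND of the two conditions
    is a NOR of the two signal levels.
    """
    return not (pcd or ken_n)


def pwt_pcd_source(cycle, paging_enabled=True):
    """Where the PWT and PCD bits for a bus cycle are taken from."""
    if not paging_enabled or cycle in ("io", "interrupt_acknowledge", "halt"):
        return "zero"  # real mode, paging disabled, or cycles that bypass paging
    return {
        "memory": "page table entry",
        "page_table_access": "page directory entry",
        "page_directory_access": "CR3",
    }[cycle]
```

For example, with CD = 0 and a page whose PCD bit is clear, `pcd_pin(0, 0)` is 0 and a fill proceeds only if the system also returns KEN# low.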
[Figure 7-3: block diagram. The linear address (directory, table and offset fields) selects the page directory and page table entries; PCD and PWT are taken from CR3, the page directory entry or the page table entry; PCD is combined with CD (from CR0) and with KEN# in the cache control logic, which also receives FLUSH# and controls the cache memory.]

Figure 7-3. Page Cacheability

7.6.1 WRITE-BACK ENHANCED INTELDX2 PROCESSOR PAGE CACHEABILITY

In Write-Back Enhanced IntelDX2 processor-based systems, both the processor and the system hardware must determine the cacheability and the configuration (write-back or write-through) on a line by line basis. The system hardware's cacheability is determined by KEN# and the configuration by WB/WT#. The processor's indication of cacheability is determined by PCD and the configuration by PWT. The PWT bit controls the write policy for the second level caches used with the Write-Back Enhanced IntelDX2 processor. Setting PWT to 1 defines a write-through policy for the current page, while clearing PWT to 0 defines a write-back policy for the current page.

7.7 Cache Flushing

The on-chip cache can be flushed by external hardware or by software instructions. Flushing the cache clears all valid bits for all lines in the cache. The cache is flushed when external hardware asserts the FLUSH# pin.

The FLUSH# pin needs to be asserted for one clock if driven synchronously or for two clocks if driven asynchronously. FLUSH# is asynchronous, but setup and hold times must be met for recognition in a particular cycle. FLUSH# should be de-asserted before the cache flush is complete. Failure to de-assert the pin will cause execution to stop as the processor will be repeatedly flushing the cache.

If external hardware activates flush in response to an I/O write, FLUSH# must be asserted for at least two clocks prior to ready being returned for the I/O write.
This ensures that the flush completes before the processor begins execution of the instruction following the OUT instruction.

The instructions INVD and WBINVD cause the on-chip cache to be flushed. External caches connected to the Intel486 processor are signaled to flush their contents when these instructions are executed.

WBINVD will also cause an external write-back cache to write back dirty lines before flushing its contents. The external cache is signaled using the bus cycle definition pins and the byte enables (refer to section 9.2.6 "Bus Cycle Definition" for the bus cycle definition pins and section 10.2.11 "Special Bus Cycles" for special bus cycles). Refer to the Intel486™ Processor Programmer's Reference Manual for detailed instruction definitions.

The results of the INVD and WBINVD instructions are identical for the operation of the non-write-back enhanced Intel486 processor on-chip cache because the cache is write-through.

7.7.1 WRITE-BACK ENHANCED INTELDX2 PROCESSOR CACHE FLUSHING

The on-chip cache can be flushed by external hardware or by software instructions. Flushing the cache through hardware is accomplished by driving the FLUSH# pin low. This causes the cache to write back all modified lines in the cache and mark the state bits invalid. The First Flush Acknowledge cycle is driven by the Write-Back Enhanced IntelDX2 processor followed by the Second Flush Acknowledge cycle after all write-backs and invalidations are complete. The two special cycles are issued even if there are no dirty lines to write back.

The INVD and WBINVD instructions cause the on-chip cache to be invalidated. WBINVD causes the modified lines in the internal cache to be written back, and all lines to be marked invalid. After execution of the WBINVD instruction, the Write-Back and Flush special cycles are driven to indicate to any external cache that it should write back and invalidate its contents.
These two special cycles are issued even if there are no dirty lines to be written back. INVD causes all lines in the cache to be invalidated, so modified lines in the cache are not written back. The Flush special cycle is driven after the INVD instruction is executed to indicate to any external cache that it should invalidate its contents. Care should be taken when using the INVD instruction to avoid creating cache consistency problems.

NOTE: It is recommended to use the WBINVD instruction instead of the INVD instruction if the on-chip cache is configured in the write-back mode.

The assertion of the RESET pin invalidates the entire cache without writing back the modified lines. No special cycles are issued after the invalidation is complete.

Snoop cycles with invalidation (INV = 1) cause the Write-Back Enhanced IntelDX2 processor to invalidate an individual cache line. If the snooped line is a modified line, then the processor schedules a write-back cycle. Inquire cycles with no invalidation cause the Write-Back Enhanced IntelDX2 processor only to write back the line, if the inquired line is in M-state, and not to invalidate it.

SRESET, STPCLK#, INTR, NMI and SMI# are recognized and latched, but not serviced, during the full-cache, modified-line write-backs caused either by the WBINVD instruction or by FLUSH#. However, BOFF#, AHOLD and HOLD are recognized DURING the full-cache, modified-line write-backs.

7.8 Write-Back Enhanced IntelDX2 Processor Write-Back Cache Architecture

This section describes additional features pertaining to the write-back mode of the Write-Back Enhanced IntelDX2 processor.

7.8.1 WRITE-BACK CACHE COHERENCY PROTOCOL

The Write-Back Enhanced IntelDX2 processor cache protocol supports a cache line in one of the following four states:

• valid and defined as write-back during allocation (E-state),
• valid and defined as write-through during allocation (S-state),
• modified (M-state),
• invalid (I-state).

These four states are the M (Modified line), E (write-back line), S (write-through line) and I (Invalid line) states, and the protocol is referred to as the "Modified MESI protocol." A definition of the states is given below:

M - Modified: An M-state line is modified (different from main memory) and can be accessed (read/written to) without sending a cycle out on the bus.

E - Exclusive: An E-state line is a "write-back" line, but the line is not modified (i.e., it is coherent with main memory). An E-state line can be accessed (read/written to) without generating a bus cycle and a write to an E-state line will cause the line to become modified.

S - Shared: An S-state line is a "write-through" line, and is coherent with main memory. A read hit to an S-state line will not generate bus activity, but a write hit to an S-state line will generate a write-through cycle on the bus. A write to an S-state line will update the cache and the main memory.

I - Invalid: This state indicates that the line is not in the cache. A read to this line will be a miss and may cause the Write-Back Enhanced IntelDX2 processor to execute a line fill (fetch the whole line into the cache from main memory). A write to an invalid line will cause the Write-Back Enhanced IntelDX2 processor to execute a write-through cycle on the bus.

Every line in the Write-Back Enhanced IntelDX2 processor cache is assigned a state dependent on both Write-Back Enhanced IntelDX2 processor generated activities and activities generated by the system hardware.

As the Write-Back Enhanced IntelDX2 processor is targeted for uniprocessor systems, a subset of MESI protocol, namely MEI, is used in the Write-Back Enhanced IntelDX2 processor to maintain cache coherency.
With the modified MESI protocol, it is assumed that in a uniprocessor system lines are defined as write-back or write-through at allocation time. This property associated with a line is never altered. The lines allocated as write-through go to S-state and remain in S-state. A cache line that is allocated as write-back never enters the S-state. The WB/WT# pin is sampled during line allocation and is used strictly to characterize a line as write-back or write-through.

7.8.1.1 State Transition Tables

State transitions are caused by processor-generated transactions (memory reads/writes) and by a set of external input signals and internally-generated variables. The Write-Back Enhanced IntelDX2 processor also drives certain pins as a consequence of the Cache Consistency Protocol.

Read Cycles

Table 7-5 shows the state transitions for lines in the cache during unlocked read cycles.

Table 7-5. Cache State Transitions for Write-Back Enhanced IntelDX2™ Processor-Initiated Unlocked Read Cycles

| Present State | Pin Activity | Next State | Description |
|---------------|--------------|------------|-------------|
| M | n/a | M | Read hit; data is provided to processor core by cache. No bus cycle is generated. |
| E | n/a | E | Read hit; data is provided to processor core by cache. No bus cycle is generated. |
| S | n/a | S | Read hit; data is provided to the processor by the cache. No bus cycle is generated. |
| I | CACHE# low AND KEN# low AND WB/WT# high AND PWT low | E | Data item does not exist in cache (MISS). A line fill cycle (read) will be generated by the Write-Back Enhanced IntelDX2™ processor. This state transition will occur if WB/WT# is sampled high with first BRDY#. |
| I | CACHE# low AND KEN# low AND (WB/WT# low OR PWT high) | S | Same as previous read miss case except that WB/WT# is sampled low with first BRDY# or PWT is high. |
| I | CACHE# high OR KEN# high | I | KEN# pin inactive; the line is not intended to be cached in the Write-Back Enhanced IntelDX2 processor. |

NOTES:
Locked accesses to the cache will cause the accessed line to transition to the Invalid state.
PCD can also be used by the processor to determine the cacheability, but using the CACHE# pin is recommended.
The transition from I to E or S states (based on WB/WT#) occurs only if KEN# is sampled low one clock prior to the first BRDY# and then one clock prior to the last BRDY#, and the cycle is transformed into a line fill cycle. If KEN# is sampled high, the line is not cached and remains in the I state.

Write Cycles

The state transitions of cache lines during Write-Back Enhanced IntelDX2 processor-generated write cycles are described in Table 7-6.

Table 7-6. Cache State Transitions for Write-Back Enhanced IntelDX2™ Processor-Initiated Write Cycles

| Present State | Pin Activity | Next State | Description |
|---------------|--------------|------------|-------------|
| M | n/a | M | Write hit; update cache. No bus cycle generated to update memory. |
| E | n/a | M | Write hit; update cache only. No bus cycle generated; line is now modified. |
| S | n/a | S | Write hit; cache updated with write data item. A write-through cycle is generated on the bus to update memory. Subsequent writes to E-state or M-state lines are held up until this write-through cycle is completed. |
| I | n/a | I | Write miss; a write-through cycle is generated on the bus to update external memory. No allocation is done. Subsequent writes to the E or M lines are blocked until the write-miss is completed. |

Note that even though memory writes are buffered while I/O writes are not, these writes appear at the pins in the same order as they were generated by the processor. Write-back cycles caused by the replacement of M-state lines are buffered, while write-backs due to snoop hits to M-state lines are not buffered.

Cache Consistency Cycles (Snoop Cycles)

The purpose of Snoop cycles is to check whether the address being presented by another bus master is contained within the cache of the Write-Back Enhanced IntelDX2 processor.
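The processor-initiated read and write transitions of Tables 7-5 and 7-6 can be condensed into two lookup functions. The sketch below is illustrative only; the function names and the 0/1 encoding of pin levels are assumptions, and locked accesses and the KEN# sampling points described in the notes are not modelled.

```python
def read_next_state(state, cache_n=0, ken_n=0, wb_wt=1, pwt=0):
    """Next line state after an unlocked processor read (Table 7-5).

    cache_n and ken_n are the CACHE# and KEN# pin levels (0 = low/active),
    wb_wt is the WB/WT# level sampled with the first BRDY#, pwt is the PWT level.
    """
    if state in ("M", "E", "S"):
        return state                  # read hit: no bus cycle, state unchanged
    if cache_n or ken_n:
        return "I"                    # not cacheable: no line fill
    # Line fill: write-back lines go to E, write-through lines go to S.
    return "E" if (wb_wt and not pwt) else "S"


def write_next_state(state):
    """Return (next state, write-through bus cycle generated?) per Table 7-6."""
    return {
        "M": ("M", False),   # write hit: cache only
        "E": ("M", False),   # write hit: line becomes modified
        "S": ("S", True),    # write hit: cache and memory both updated
        "I": ("I", True),    # write miss: memory updated, no allocation
    }[state]
```

Note that a line allocated as write-through (S) never leaves the S-state on a processor write, which matches the rule that the write-back or write-through property is fixed at allocation time.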
Snoop cycles may be initiated with or without an invalidation request (INV = 1 or 0). If a snoop cycle is initiated with INV = 0 (usually during memory read cycles by another master), it is referred to as an Inquire cycle. If a snoop cycle is initiated with INV = 1 (usually during memory write cycles), it is referred to as an Invalidate cycle. If the address hits a modified line in the cache, the HITM# pin is asserted, and the modified line is written back onto the bus. Table 7-7 describes state transitions for Snoop cycles.

7.8.2 DETECTING ON-CHIP WRITE-BACK CACHE OF THE WRITE-BACK ENHANCED INTELDX2 PROCESSOR

The write-back policy of the on-chip cache of the Write-Back Enhanced IntelDX2 processor may be detected by software or hardware. The software mechanism makes use of the CPUID instruction. (See Appendix B, "Feature Determination," for use of the CPUID instruction.) The hardware mechanism makes use of a write-back related output signal from the processor.

For the software mechanism to determine whether a given processor has write-back support for the on-chip cache, the WB/WT# pin should be driven to "1" during RESET. This pin will be sampled by the processor during the falling edge of RESET. Execute the CPUID instruction, which returns the model number in the EAX register, EAX[7:4]. If the model number returned is 7 (Write-Back Enhanced IntelDX2 processor) and the family number is 4, the on-chip cache supports the write-back policy. If the model number returned is in the range 0 through 6 or 8, the on-chip cache only supports the write-through policy.

The following pseudo code/steps give an example of the initialization BIOS that can be used to detect the presence of the write-back on-chip cache:

• Boot Address Cold start
• Load Segment Registers and null IDTR
• Execute the CPUID instruction and determine the Family ID and Model ID.
• Compare the Family ID to 4 and the Model ID returned to 7.
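The comparison in the last two steps can be sketched as a decode of the value returned in EAX. This is a minimal illustration and cannot itself execute CPUID; the EAX value must be obtained by firmware. The model field in EAX[7:4] is as stated above, while the family field in EAX[11:8] and the stepping field in EAX[3:0] follow the usual CPUID signature layout and are taken here as assumptions. The function names and the sample values are hypothetical.

```python
def decode_signature(eax):
    """Split a CPUID signature value into its family, model and stepping fields."""
    return {
        "family": (eax >> 8) & 0xF,   # EAX[11:8] (assumed standard layout)
        "model": (eax >> 4) & 0xF,    # EAX[7:4]
        "stepping": eax & 0xF,        # EAX[3:0] (assumed standard layout)
    }


def supports_on_chip_write_back(eax):
    """True when the family is 4 and the model is 7, as in the steps above."""
    sig = decode_signature(eax)
    return sig["family"] == 4 and sig["model"] == 7
```

For example, an illustrative signature value of 0x0470 decodes to family 4, model 7, while 0x0435 decodes to family 4, model 3 and would be treated as write-through only.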
When the Family ID is 4 and the Model ID is 7, the processor supports on-chip write-back caching. If the Model ID does not match 7, the processor only supports on-chip write-through caching.

The hardware mechanism involves using the HITM# signal. For the Write-Back Enhanced IntelDX2 processor, this signal is driven inactive (high) during RESET. The chipset can sample this output on the falling edge of RESET. If HITM# is sampled high on the falling edge of RESET, the processor supports on-chip write-back cache configuration. For those processors that do not support internal write-back caching, this signal is an INC, and this output is not driven.

Table 7-7. Cache State Transitions During Snoop Cycles

| Present State | Next State INV=1 | Next State INV=0 | Description |
|---------------|------------------|------------------|-------------|
| M | I | E | Snoop hit to a modified line indicated by HITM# pin low. Write-Back Enhanced IntelDX2 processor schedules the write back of the modified line to memory. The state of the line changes to E provided INV = 0 and changes to I if INV = 1. |
| E | I | E | Snoop hit, no bus cycle generated. State remains unaltered if INV = 0, and changes to I if INV = 1. There is no external indication of this snoop hit. |
| S | I | S | Snoop hit, no bus cycle generated. State remains unaltered if INV = 0, and changes to I if INV = 1. There is no external indication of this snoop hit. |
| I | I | I | Address not in cache. |

8.0 SYSTEM MANAGEMENT MODE (SMM) ARCHITECTURES

8.1 SMM Overview

The Intel486 processor supports four modes: Real, Virtual-86, Protected, and System Management Mode (SMM). As an operating mode, SMM has a distinct processor environment, interface and hardware/software features. SMM provides system designers with a means of adding new software-controlled features to computer products that operate transparently to the operating system and software applications. SMM is intended for use only by system firmware, not by applications software or general purpose systems software.
The SMM architectural extension consists of the following elements:

1. System Management Interrupt (SMI#) hardware interface.
2. Dedicated and secure memory space (SMRAM) for SMI# handler code and processor state (context) data with a status signal for the system to decode access to that memory space, SMIACT#. (The SMBASE address is relocatable and could also be relocated to non-cacheable address space.)
3. Resume (RSM) instruction, for exiting the System Management Mode.
4. Special features such as I/O-Restart, for transparent power management of I/O peripherals, and Auto HALT Restart.

8.2 Terminology

The following terms are used throughout the discussion of System Management Mode.

SMM: System Management Mode. This is the operating environment that the processor (system) enters when the System Management Interrupt is being serviced.

SMI#: System Management Interrupt. This is part of the SMM interface. When SMI# is asserted (SMI# pin asserted low) it causes the processor to invoke SMM. The SMI# pin is the only means of entering SMM.

SMM handler: System Management Mode handler. This is the code that will be executed when the processor is in SMM. An example application that this code might implement is a power management control or a system control function.

RSM: Resume instruction. This instruction is used by the SMM handler to exit the SMM and return to the interrupted operating system or application process.

SMRAM: This is the physical memory dedicated to SMM. The SMM handler code and related data reside in this memory. This memory is also used by the processor to store its context before executing the SMM handler. The operating system and applications do not have access to this memory space.

SMBASE: Control register that contains the address of the SMRAM space.

Context: This term refers to the processor state. The SMM discussion refers to the context, or processor state, just before the processor invokes SMM.
The context normally consists of the processor registers that fully represent the processor state.

Context Switch: A context switch is the process of either saving or restoring the context. The SMM discussion refers to the context switch as the process of saving/restoring the context while invoking/exiting SMM, respectively.

8.3 System Management Interrupt Processing

The system interrupts the normal program execution and invokes SMM by generating a System Management Interrupt (SMI#) to the processor. The processor will service the SMI# by executing the following sequence (see Figure 8-1):

1. The processor asserts the SMIACT# signal, indicating to the system that it should enable the SMRAM.
2. The processor saves its state (context) to SMRAM, starting at default address location 3FFFFH, proceeding downward in a stack-like fashion.
3. The processor switches to the System Management Mode processor environment (a pseudo-real mode).
4. The processor will then jump to the default absolute address of 38000H in SMRAM to execute the SMI# handler. This SMI# handler performs the system management activities.
5. The SMI# handler will then execute the RSM instruction which restores the processor's context from SMRAM, de-asserts the SMIACT# signal, and then returns control to the previously interrupted program execution.

[Figure 8-1: timing sketch of SMI# and SMIACT#; SMIACT# is active during bus cycles in SMM.]

Figure 8-1. Basic SMI# Interrupt Service

NOTE: The above sequence is valid for the default SMBASE value only. See the following sections for a description of the SMBASE register and SMBASE relocation.

The System Management Interrupt hardware interface consists of the SMI# interrupt request input and the SMIACT# output used by the system to decode the SMRAM.

[Figure 8-2: the CPU with its SMI# input and SMIACT# output, which together form the SMI interface.]

Figure 8-2.
Basic SMI# Hardware Interface

8.3.1 SYSTEM MANAGEMENT INTERRUPT (SMI#)

SMI# is a falling-edge triggered, non-maskable interrupt request signal. SMI# is an asynchronous signal, but setup and hold times, t20 and t21, must be met in order to guarantee recognition on a specific clock. The SMI# input need not remain active until the interrupt is actually serviced. The SMI# input only needs to remain active for a single clock if the required setup and hold times are met. SMI# will also work correctly if it is held active for an arbitrary number of clocks.

The SMI# input must be held inactive for at least four external clocks after it is asserted to reset the edge triggered logic. A subsequent SMI# might not be recognized if the SMI# input is not held inactive for at least four clocks after being asserted.

SMI#, like NMI, is not affected by the IF bit in the EFLAGS register and is recognized on an instruction boundary. An SMI# will not break locked bus cycles. The SMI# has a higher priority than NMI and is not masked during an NMI. In order for SMI# to be recognized with respect to SRESET, SMI# should not be asserted until two (2) clocks after SRESET becomes inactive.

After the SMI# interrupt is recognized, the SMI# signal will be masked internally until the RSM instruction is executed and the interrupt service routine is complete. Masking the SMI# prevents recursive SMI# calls. SMI# must be de-asserted for at least 4 clocks to reset the edge triggered logic. If another SMI# occurs while the SMI# is masked, the pending SMI# will be recognized and executed on the next instruction boundary after the current SMI# completes. This instruction boundary occurs before execution of the next instruction in the interrupted application code, resulting in back to back SMM handlers. Only one SMI# can be pending while SMI# is masked.
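The masking and single-pending behavior described above can be pictured as a small state machine. The following is a behavioral sketch only, with a hypothetical class name; it ignores clock-level timing (setup and hold times, the four-clock inactive requirement) and treats each call as an already recognized event.

```python
class SmiLatch:
    """Behavioral model of SMI# recognition, masking and the single pending slot."""

    def __init__(self):
        self.in_smm = False    # True while the SMI# handler is running (SMI# masked)
        self.pending = False   # at most one SMI# can be held pending

    def falling_edge(self):
        """A recognized falling edge on SMI#."""
        if self.in_smm:
            self.pending = True    # further edges while masked collapse into one
        else:
            self.in_smm = True     # SMM is entered at the next instruction boundary

    def rsm(self):
        """RSM executed. Returns True if a pending SMI# re-enters SMM back to back."""
        self.in_smm = False
        if self.pending:
            self.pending = False
            self.in_smm = True
            return True
        return False
```

Two edges arriving during a handler therefore produce only one additional handler invocation, which runs before the interrupted application executes its next instruction.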
The SMI# signal is synchronized internally and must be asserted at least three (3) CLK periods prior to asserting the RDY# signal in order to guarantee recognition on a specific instruction boundary. This is important for servicing an I/O trap with an SMI# handler. (See Figure 8-3.)

[Figure 8-3: timing diagram of CLK, SMI# and RDY#/BRDY#, showing the point at which SMI# is sampled. A: setup time for recognition on I/O instruction boundary.]

Figure 8-3. SMI# Timing for Servicing an I/O Trap

8.3.2 SMI# ACTIVE (SMIACT#)

SMIACT# indicates that the processor is operating in System Management Mode. The processor asserts SMIACT# in response to an SMI# interrupt request on the SMI# pin. SMIACT# is driven active after the processor has completed all pending write cycles (including emptying the write buffers), and before the first access to SMRAM when the processor saves (writes) its state (or context) to SMRAM. SMIACT# remains active until the last access to SMRAM when the processor restores (reads) its state from SMRAM. The SMIACT# signal does not float in response to HOLD. The SMIACT# signal is used by the system logic to decode SMRAM (see Figure 8-2).

The number of CLKs required to complete the SMM state save and restore is very dependent on system memory performance. The values listed in Table 8-1 assume 0 wait-state memory writes (2 CLK cycles), 2-1-1-1 burst read cycles, and 0 wait-state non-burst reads (2 CLK cycles). Additionally, it is assumed that the data read during the SMM state restore sequence is not cacheable.

Figure 8-4 and Table 8-1 can be used for latency calculations. As shown, the minimum time required to enter an SMI# handler routine for the Intel486 DX processor (from the completion of the interrupted instruction) is given by:

latency to beginning of SMI# handler = A + B + C = 153 CLKs

and the minimum time required to return to the interrupted application (following the final SMM instruction before RSM) is given by:

latency to continue interrupted application = E + F + G = 243 CLKs

[Figure 8-4: timing diagram of CLK, SMI#, ADS#, BRDY# and SMIACT#, with intervals A through G marking the transitions from normal state to System Management Mode and back to normal state.]

Figure 8-4. Intel486™ Processor SMIACT# Timing

Table 8-1. Intel486™ Processor SMIACT# Timing

| Interval | Intel486 SX Processor | IntelSX2™ Processor | Intel486 DX Processor | IntelDX2™ Processor | IntelDX4™ Processor 3X | IntelDX4 Processor 2X |
|----------|-----------------------|---------------------|-----------------------|---------------------|------------------------|-----------------------|
| A: Last RDY# from non-SMM transfer to SMIACT# assertion | 2 CLK minimum | 1 CLK minimum | 2 CLK minimum | 1 CLK minimum | 1 CLK minimum | 1 CLK minimum |
| B: SMIACT# assertion to first ADS# for SMM state save | 40 CLK minimum | 20 CLK minimum | 40 CLK minimum | 20 CLK minimum | 13 CLK minimum | 20 CLK minimum |
| C: SMM state save (dependent on memory performance) | Approx. 139 CLKs | Approx. 139 CLKs | Approx. 139 CLKs | Approx. 139 CLKs | Approx. 139 CLKs | Approx. 139 CLKs |
| D: SMM handler | User determined | User determined | User determined | User determined | User determined | User determined |
| E: SMM state restore (dependent on memory performance) | Approx. 236 CLKs | Approx. 236 CLKs | Approx. 236 CLKs | Approx. 236 CLKs | Approx. 236 CLKs | Approx. 236 CLKs |
| F: Last RDY# from SMM transfer to deassertion of SMIACT# | 4 CLK minimum | 2 CLK minimum | 4 CLK minimum | 2 CLK minimum | 1 CLK minimum | 1 CLK minimum |
| G: SMIACT# deassertion to first non-SMM ADS# | 20 CLK minimum | 10 CLK minimum | 20 CLK minimum | 10 CLK minimum | 6 CLK minimum | 10 CLK minimum |

8.3.3 SMRAM

The Intel486 processor uses the SMRAM space for state save and state restore operations during an SMI# and RSM.
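Returning briefly to the latency formulas given with Table 8-1, the sums A + B + C and E + F + G can be checked with a short calculation. The values below are transcribed from Table 8-1; the assignment of values to columns (in particular for row G) is an assumption made while reading the flattened table, and the dictionary and function names are hypothetical. Under that reading, the quoted totals of 153 and 243 CLKs are reproduced by the IntelDX4 3X column values (A = 1, B = 13, C = 139 and E = 236, F = 1, G = 6).

```python
# Minimum CLK counts per interval, transcribed from Table 8-1.
# Column assignment is an assumption; C and E are approximate values.
TABLE_8_1 = {
    "Intel486 SX": {"A": 2, "B": 40, "C": 139, "E": 236, "F": 4, "G": 20},
    "IntelSX2":    {"A": 1, "B": 20, "C": 139, "E": 236, "F": 2, "G": 10},
    "Intel486 DX": {"A": 2, "B": 40, "C": 139, "E": 236, "F": 4, "G": 20},
    "IntelDX2":    {"A": 1, "B": 20, "C": 139, "E": 236, "F": 2, "G": 10},
    "IntelDX4 3X": {"A": 1, "B": 13, "C": 139, "E": 236, "F": 1, "G": 6},
    "IntelDX4 2X": {"A": 1, "B": 20, "C": 139, "E": 236, "F": 1, "G": 10},
}


def entry_latency(processor):
    """CLKs from the end of the interrupted instruction to the SMI# handler: A + B + C."""
    t = TABLE_8_1[processor]
    return t["A"] + t["B"] + t["C"]


def exit_latency(processor):
    """CLKs from the end of the SMM state restore request to the application: E + F + G."""
    t = TABLE_8_1[processor]
    return t["E"] + t["F"] + t["G"]
```

Because C and E depend on memory performance, these totals are lower bounds for the assumed zero wait-state system rather than guaranteed figures.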
The SMI# handler, which also resides in SMRAM, uses the SMRAM space to store code, data and stacks. In addition, the SMI# handler can use the SMRAM for system management information such as the system configuration, configuration of a powered-down device, and system designer-specific information. The processor asserts the SMIACT# output to indicate to the memory controller that it is operating in System Management Mode. The system logic should ensure that only the processor has access to this area. Alternate bus masters or DMA devices trying to access the SMRAM space when SMIACT# is active should be directed to system RAM in the respective area.

The system logic is minimally required to decode the physical memory address range from 38000H-3FFFFH as SMRAM area. The processor will save its state to the state save area from 3FFFFH downward to 3FE00H. After saving its state the processor will jump to the address location 38000H to begin executing the SMI# handler. The system logic can choose to decode a larger area of SMRAM as needed. The size of this SMRAM can be between 32 Kbytes and 4 Gbytes.

The system logic should provide a manual method for switching the SMRAM into system memory space when the processor is not in SMM. This will enable initialization of the SMRAM space (i.e., loading the SMI# handler) before executing the SMI# handler during SMM. (See Figure 8-5.)

8.3.3.1 SMRAM State Save Map

When the SMI# is recognized on an instruction boundary, the processor core first sets the SMIACT# signal LOW, indicating to the system logic that accesses are now being made to the system-defined SMRAM areas. The processor then writes its state to the state save area in the SMRAM. The state save area starts at CS Base + [8000H + 7FFFH]. The default CS Base is 30000H, therefore the default state save area is at 3FFFFH. In this case, the CS Base can also be referred to as the SMBASE.
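The address arithmetic above can be checked with a short calculation. This is a sketch under stated assumptions: the constant and function names are hypothetical, and the handler entry point is written as base + 8000H by generalizing from the default values (30000H and 38000H) given in the text.

```python
DEFAULT_SMBASE = 0x30000   # default CS Base, per the text above


def state_save_address(smbase=DEFAULT_SMBASE, register_offset=0x7FFF):
    """Physical address of a state save location: CS Base + [8000H + offset]."""
    return smbase + 0x8000 + register_offset


def handler_entry(smbase=DEFAULT_SMBASE):
    """Address of the first SMI# handler instruction (assumed to be base + 8000H)."""
    return smbase + 0x8000
```

With the defaults, the top of the state save area evaluates to 3FFFFH, the bottom (offset 7E00H) to 3FE00H, and the handler entry to 38000H, which agrees with the minimum decode range of 38000H-3FFFFH.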
If SMBASE Relocation is enabled, then the SMRAM addresses can change. The following formula is used to determine the relocated addresses where the context is saved. The context will reside at CS Base + [8000H + Register Offset], where the default initial CS Base is 30000H and the Register Offset is listed in the SMRAM state save map (Table 8-2). Reserved spaces will be used to accommodate new registers in future processors. The state save area starts at 7FFFH and continues downward in a stack-like fashion.

Some of the registers in the SMRAM state save area may be read and changed by the SMI# handler, with the changed values restored to the processor registers by the RSM instruction. Some register images are read-only, and must not be modified (modifying these registers will result in unpredictable behavior). The values stored in the areas marked reserved may change in future processors. An SMM handler should not rely on any values stored in an area that is marked as reserved.

[Figure 8-5: processor accesses to the system address space used for loading SMRAM are redirected to SMRAM; system memory accesses that are not redirected go to normal memory space.]

Figure 8-5. Redirecting System Memory Addresses to SMRAM

Table 8-2. SMRAM State Save Map

| Register Offset | Register                         | Writeable? |
|-----------------|----------------------------------|------------|
| 7FFC            | CR0                              | NO         |
| 7FF8            | CR3                              | NO         |
| 7FF4            | EFLAGS                           | YES        |
| 7FF0            | EIP                              | YES        |
| 7FEC            | EDI                              | YES        |
| 7FE8            | ESI                              | YES        |
| 7FE4            | EBP                              | YES        |
| 7FE0            | ESP                              | YES        |
| 7FDC            | EBX                              | YES        |
| 7FD8            | EDX                              | YES        |
| 7FD4            | ECX                              | YES        |
| 7FD0            | EAX                              | YES        |
| 7FCC            | DR6                              | NO         |
| 7FC8            | DR7                              | NO         |
| 7FC4            | TR*                              | NO         |
| 7FC0            | LDTR*                            | NO         |
| 7FBC            | GS*                              | NO         |
| 7FB8            | FS*                              | NO         |
| 7FB4            | DS*                              | NO         |
| 7FB0            | SS*                              | NO         |
| 7FAC            | CS*                              | NO         |
| 7FA8            | ES*                              | NO         |
| 7FA7-7F98       | Reserved                         | NO         |
| 7F94            | IDT Base                         | NO         |
| 7F93-7F8C       | Reserved                         | NO         |
| 7F88            | GDT Base                         | NO         |
| 7F87-7F04       | Reserved                         | NO         |
| 7F02            | Auto HALT Restart Slot (Word)    | YES        |
| 7F00            | I/O Trap Restart Slot (Word)     | YES        |
| 7EFC            | SMM Revision Identifier (Dword)  | NO         |
| 7EF8            | SMBASE Slot (Dword)              | YES        |
| 7EF7-7E00       | Reserved                         | NO         |

NOTES:
* Upper two bytes are reserved.
Modifying a value that is marked as not writeable will result in unpredictable behavior.
Words are stored in two consecutive bytes in memory with the low-order byte at the lowest address and the high-order byte in the high address.

The following registers are saved and restored (in areas of the state save that are marked reserved), but are not visible to the system software programmer: CR1, CR2 and CR4, hidden descriptor registers for CS, DS, ES, FS, GS, and SS.

If an SMI# request is issued for the purpose of powering down the processor, the values of all reserved locations in the SMM state save must be saved to non-volatile memory.

The following registers are not automatically saved and restored by SMI# and RSM: DR5-DR0, TR7-TR3, FPU registers: STn, FCS, FSW, tag word, FP instruction pointer, FP opcode, and operand pointer.

For all SMI# requests except for suspend/resume, these registers do not have to be saved because their contents will not change. However, during a power down suspend/resume, a resume reset will clear these registers back to their default values. In this case, the suspend SMI# handler should read these registers directly to save them and restore them during the power up resume. Anytime the SMI# handler changes these registers in the processor, it must also save and restore them.

8.3.4 EXIT FROM SMM

The RSM instruction is only available to the SMI# handler. The opcode of the instruction is 0FAAH. Execution of this instruction while the processor is executing outside of SMM will cause an invalid opcode error. The last instruction of the SMI# handler will be the RSM instruction.

The RSM instruction restores the state save image from SMRAM back to the processor, then returns control back to the interrupted program execution.

There are three SMM features that can be enabled by writing to control "slots" in the SMRAM state save area.

Auto HALT Restart. It is possible for the SMI# request to interrupt the HALT state. The SMI# handler can tell the RSM instruction to return control to the HALT instruction or to return control to the instruction following the HALT instruction by appropriately setting the Auto HALT Restart slot. The default operation is to restart the HALT instruction.

I/O Trap Restart. If the SMI# interrupt was generated on an I/O access to a powered-down device, the SMI# handler can tell the RSM instruction to re-execute that I/O instruction by setting the I/O Trap Restart slot.

SMBASE Relocation. The system can relocate the SMRAM by setting the SMBASE Relocation slot in the state save area. The RSM instruction will set the SMBASE in the processor based on the value in the SMBASE relocation slot. The SMBASE must be 32K aligned.

For further details on these SMM features, see section 8.5.

If the processor detects invalid state information, it enters the shutdown state. This happens only in the following situations:

• The value stored in the SMBASE slot is not a 32-Kbyte-aligned address.
• A reserved bit of CR4 is set to 1.
• A combination of bits in CR0 is illegal; namely, (PG=1 and PE=0) or (NW=1 and CD=0).

In shutdown mode, the processor stops executing instructions until an NMI interrupt is received or reset initialization is invoked. The processor generates a special bus cycle to indicate it has entered shutdown mode.

NOTE: INTR and SMI# will also bring the processor out of a shutdown that is encountered due to invalid state information from SMM execution. Make sure that INTR and SMI# are not asserted if SMM routines are written such that a shutdown occurs.

8.4 System Management Mode Programming Model

8.4.1 ENTERING SYSTEM MANAGEMENT MODE

SMM is one of the major operating modes, on a level with Protected mode, Real address mode or virtual 86 mode. Figure 8-6 shows how the processor can enter SMM from any of the three modes and then return.
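The invalid-state conditions that section 8.3.4 lists for RSM can be modelled with a short C sketch. This is an illustration only, not Intel-supplied code: the function and macro names are hypothetical, the NW and CD bit positions (29 and 30) are the standard CR0 layout rather than something stated in this section, and the CR4 reserved-bit mask is a caller-supplied assumption because the reserved bits are not enumerated here.

```c
#include <stdbool.h>
#include <stdint.h>

/* CR0 bit positions. PE (bit 0) and PG (bit 31) agree with Table 8-3;
 * NW (bit 29) and CD (bit 30) are the standard positions (assumption). */
#define CR0_PE (1u << 0)
#define CR0_NW (1u << 29)
#define CR0_CD (1u << 30)
#define CR0_PG (1u << 31)

/* True if the SMBASE value lies on a 32-Kbyte boundary. */
static inline bool smbase_is_aligned(uint32_t smbase)
{
    return (smbase & 0x7FFFu) == 0;
}

/* True for the illegal CR0 combinations: (PG=1 and PE=0) or (NW=1 and CD=0). */
static inline bool cr0_is_illegal(uint32_t cr0)
{
    bool pg_without_pe = (cr0 & CR0_PG) != 0 && (cr0 & CR0_PE) == 0;
    bool nw_without_cd = (cr0 & CR0_NW) != 0 && (cr0 & CR0_CD) == 0;
    return pg_without_pe || nw_without_cd;
}

/* Hypothetical model of the shutdown decision made when RSM loads state.
 * cr4_reserved_mask is an assumption supplied by the caller. */
static inline bool rsm_would_shut_down(uint32_t smbase, uint32_t cr0,
                                       uint32_t cr4, uint32_t cr4_reserved_mask)
{
    return !smbase_is_aligned(smbase)
        || cr0_is_illegal(cr0)
        || (cr4 & cr4_reserved_mask) != 0;
}
```

A check of this kind is mainly useful in a simulator or in a validation step before a handler writes the SMBASE slot, since the CR0 image itself is marked not writeable in the state save map.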
Figure 8-6. Transition to and from System Management Mode (SMI# enters SMM from any mode; RSM or Reset leaves it)

NOTE: Reset could occur by asserting the RESET or SRESET pin.

The external signal SMI# causes the processor to switch to SMM. The RSM instruction exits SMM. SMM is transparent to applications programs and operating systems because of the following:

• The only way to enter SMM is via a type of nonmaskable interrupt triggered by an external signal.
• The processor begins executing SMM code from a separate address space, referred to earlier as system management RAM (SMRAM).
• Upon entry into SMM, the processor saves the register state of the interrupted program in a part of SMRAM called the SMM context save space.
• All interrupts normally handled by the operating system or by applications are disabled upon entry into SMM.
• A special instruction, RSM, restores processor registers from the SMM context save space and returns control to the interrupted program.

SMM is similar to Real address mode in that there are no privilege levels or address mapping. An SMM program can execute all I/O and other system instructions and can address up to four Gbytes of memory.

8.4.2 PROCESSOR ENVIRONMENT

When an SMI# signal is recognized on an instruction execution boundary, the processor waits for all stores to complete, including emptying of the write buffers. The final write cycle is complete when the system returns RDY# or BRDY#. The processor then drives SMIACT# active, saves its register state to SMRAM space, and begins to execute the SMM handler.

SMI# has greater priority than debug exceptions and external interrupts. This means that if more than one of these conditions occur at an instruction boundary, only the SMI# processing occurs, not a debug exception or external interrupt. Subsequent SMI# requests are not acknowledged while the processor is in SMM. The first SMI# interrupt request that occurs while the processor is in SMM is latched, and serviced when the processor exits SMM with the RSM instruction. Only one SMI# will be latched by the processor while it is in SMM.

When the processor invokes SMM, the processor core registers are initialized as shown in Table 8-3.

Table 8-3. SMM Initial Processor Core Register Settings

Register                          Contents
General Purpose Registers         Unpredictable
EFLAGS                            00000002H
EIP                               00008000H
CS Selector                       3000H
CS Base                           SMM Base (default 30000H)
DS, ES, FS, GS, SS Selectors      0000H
DS, ES, FS, GS, SS Bases          000000000H
DS, ES, FS, GS, SS Limits         0FFFFFFFFH
CR0                               Bits 0, 2, 3 & 31 cleared (PE, EM, TS & PG); others are unmodified
DR6                               Unpredictable
DR7                               00000000H

The following is a summary of the key features in the SMM environment:

1. Real mode style address calculation
2. 4-Gbyte limit checking
3. IF flag is cleared
4. NMI is disabled
5. TF flag in EFLAGS is cleared; single step traps are disabled
6. DR7 is cleared, except for bits 12 and 13; debug traps are disabled
7. The RSM instruction no longer generates an invalid opcode error
8. Default 16-bit opcode, register and stack use

All bus arbitration (HOLD, AHOLD, BOFF#) inputs and bus sizing (BS8#, BS16#) inputs operate normally while the processor is in SMM.

8.4.2.1 Write-Back Enhanced IntelDX2 Processor Environment

When the Write-Back Enhanced IntelDX2 processor is in Enhanced Bus Mode, SMI# has greater priority than debug exceptions and external interrupts, except for FLUSH# and SRESET. (See section 4.8.6.)

8.4.3 EXECUTING SYSTEM MANAGEMENT MODE HANDLER

The processor begins execution of the SMM handler at offset 8000H in the CS segment. The CS Base is initially 30000H. However, the CS Base can be changed by using the SMM Base relocation feature.

When the SMM handler is invoked, the processor's PE and PG bits in CR0 are reset to 0. The processor is in an environment similar to Real mode, but without the 64-Kbyte limit checking. However, the default operand size and the default address size are set to 16 bits.

The EM bit is cleared so that no exceptions are generated. (If the SMM was entered from Protected mode, the Real mode interrupt and exception support is not available.) The SMI# handler should not use floating point unit instructions until the FPU is properly detected (within the SMI# handler) and the exception support is initialized.

Because the segment bases (other than CS) are cleared to 0 and the segment limits are set to 4 Gbytes, the address space may be treated as a single flat 4-Gbyte linear space that is unsegmented. The processor is still in Real mode and when a segment selector is loaded with a 16-bit value, that value is then shifted left by 4 bits and loaded into the segment base cache. The limits and attributes are not modified.

In SMM, the processor can access or jump anywhere within the 4-Gbyte logical address space. The processor can also indirectly access or perform a near jump anywhere within the 4-Gbyte logical address space.

8.4.3.1 Exceptions and Interrupts within System Management Mode

When the processor enters SMM, it disables INTR interrupts, debug and single step traps by clearing the EFLAGS, DR6 and DR7 registers. This is done to prevent a debug application from accidentally breaking into an SMM handler. This is necessary because the SMM handler operates from a distinct address space (SMRAM), and hence, the debug trap will not represent the normal system memory space.

If an SMM handler wishes to use the debug trap feature of the processor to debug SMM handler code, it must first ensure that an SMM compliant debug handler is available. The SMM handler must also ensure DR0-DR3 is saved to be restored later. The debug registers DR0-DR3 and DR7 must then be initialized with the appropriate values.
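The address formation described in section 8.4.3 can be illustrated with a short C sketch. The helper names are hypothetical; the arithmetic (a 16-bit selector shifted left by 4 bits to form the segment base, and handler entry at CS Base + 8000H) is taken from that section.

```c
#include <stdint.h>

/* Segment base produced when a 16-bit selector is loaded in SMM:
 * Real mode style, selector shifted left by 4 bits. */
static inline uint32_t smm_segment_base(uint16_t selector)
{
    return (uint32_t)selector << 4;
}

/* Linear address formed from a segment base and a 32-bit offset.
 * The addition wraps modulo 2^32, as uint32_t arithmetic does. */
static inline uint32_t smm_linear_address(uint32_t segment_base, uint32_t offset)
{
    return segment_base + offset;
}

/* First instruction of the SMM handler: offset 8000H in the CS segment. */
static inline uint32_t smm_handler_entry(uint32_t cs_base)
{
    return cs_base + 0x8000u;
}
```

With the default CS selector of 3000H the base is 30000H and the handler is entered at 38000H. The largest base a selector load can produce is FFFF0H, so data above one megabyte has to be reached through a 32-bit offset against a zero-based segment.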
If the SMM handler wishes to use the single step feature of the processor, it must ensure that an SMM compliant single step handler is available and then set the trap flag in the EFLAGS register.

If the system design requires the processor to respond to hardware INTR requests while in SMM, it must ensure that an SMM compliant interrupt handler is available and then set the interrupt flag in the EFLAGS register (using the STI instruction). Software interrupts are not blocked upon entry to SMM, and the system software designer must provide an SMM compliant interrupt handler before attempting to execute any software interrupt instructions. Note that in SMM mode, the interrupt vector table has the same properties and location as the Real mode vector table.

NMI interrupts are blocked upon entry to the SMM handler. If an NMI request occurs during the SMM handler, it is latched and serviced after the processor exits SMM. Only one NMI request will be latched during the SMM handler. If an NMI request is pending when the processor executes the RSM instruction, the NMI is serviced before the next instruction of the interrupted code sequence.

Although NMI requests are blocked when the processor enters SMM, they may be enabled through software by executing an IRET instruction. If the SMM handler requires the use of NMI interrupts, it should invoke a dummy interrupt service routine for the purpose of executing an IRET instruction. Once an IRET instruction is executed, NMI interrupt requests are serviced in the same Real mode manner in which they are handled outside of SMM.

8.5 SMM Features

8.5.1 SMM REVISION IDENTIFIER

The SMM revision identifier is used to indicate the version of SMM and the SMM extensions that are supported by the processor. The SMM revision identifier is written during SMM entry and can be examined in SMRAM space at Register Offset 7EFCH. The lower word of the SMM revision identifier refers to the version of the base SMM architecture. The upper word of the SMM revision identifier refers to the extensions available. (See Figure 8-7.)

Figure 8-7. SMM Revision Identifier (bit 17: SMBASE Relocation; bit 16: I/O Trap with Restart; lower word: SMM Revision Level)

Bit 16 of the SMM revision identifier is used to indicate to the SMM handler that this processor supports the SMM I/O trap extension. If this bit is high, then this processor supports the SMM I/O trap extension. If this bit is low, then this processor does not support I/O trapping using the I/O trap slot mechanism. (See Table 8-4.)

Bit 17 of this slot indicates whether the processor supports relocation of the SMM jump vector and the SMRAM base address. (See Table 8-4.)

The Intel486 processor supports both the I/O Trap Restart and the SMBASE relocation features.

Table 8-4. Bit Values for SMM Revision Identifier

Bits    Value    Comments
16      0        Processor does not support I/O Trap Restart
16      1        Processor supports I/O Trap Restart
17      0        Processor does not support SMBASE relocation
17      1        Processor supports SMBASE relocation

8.5.2 AUTO HALT RESTART

The Auto HALT Restart slot at register offset (word location) 7F02H in SMRAM indicates to the SMM handler that the SMI# interrupted the processor during a HALT state (bit 0 of slot 7F02H is set to 1 if the previous instruction was a HALT). If the SMI# did not interrupt the processor in a HALT state, then the SMI# microcode will set bit 0 of the Auto HALT Restart slot to a value of 0. If the previous instruction was a HALT, the SMM handler can choose to either set or reset bit 0. If this bit is set to 1, the RSM microcode execution will force the processor to re-enter the HALT state. If this bit is set to 0 when the RSM instruction is executed, the processor will continue execution with the instruction just after the interrupted HALT instruction. Note that if the interrupted instruction was not a HALT instruction (bit 0 is set to 0 in the Auto HALT Restart slot upon SMM entry), setting bit 0 to 1 will cause unpredictable behavior when the RSM instruction is executed. (See Figure 8-8 and Table 8-5.)

Figure 8-8. Auto HALT Restart (bit 0 of the word at Register Offset 7F02H)

Table 8-5. Bit Values for Auto HALT Restart

Value of Bit 0 at Entry    Value of Bit 0 at Exit    Comments
0                          0                         Returns to next instruction in interrupted program
0                          1                         Unpredictable
1                          0                         Returns to next instruction after HALT
1                          1                         Returns to HALT state

If the HALT instruction is restarted, the processor will generate a memory access to fetch the HALT instruction (if it is not in the internal cache), and execute a HALT bus cycle.

8.5.3 I/O INSTRUCTION RESTART

The I/O instruction restart slot (register offset 7F00H in SMRAM) gives the SMM handler the option of causing the RSM instruction to automatically re-execute the interrupted I/O instruction. When the RSM instruction is executed, if the I/O instruction restart slot contains the value 0FFH, then the processor will automatically re-execute the I/O instruction that the SMI# trapped. If the I/O instruction restart slot contains the value 00H when the RSM instruction is executed, then the processor will not re-execute the I/O instruction. The processor automatically initializes the I/O instruction restart slot to 00H during SMM entry. The I/O instruction restart slot should be written only when the processor has generated an SMI# on an I/O instruction boundary. Processor operation is unpredictable when the I/O instruction restart slot is set when the processor is servicing an SMI# that originated on a non-I/O instruction boundary. (See Figure 8-9 and Table 8-6.)

Figure 8-9. I/O Instruction Restart (I/O Instruction Restart Slot, bits 15-0, at Register Offset 7F00H)

Table 8-6. I/O Instruction Restart Value

Value at Entry    Value at Exit    Comments
00H               00H              Do not restart trapped I/O instruction
00H               0FFH             Restart trapped I/O instruction

If the system executes back-to-back SMI# requests, the second SMM handler must not set the I/O instruction restart slot (see section 8.6.6 "Nested SMI#s and I/O Restart").

8.5.4 SMM BASE RELOCATION

The Intel486 processor provides a control register, SMBASE. The address space used as SMRAM can be modified by changing the SMBASE register before exiting an SMI# handler routine. SMBASE can be changed to any 32K aligned value (values that are not 32K aligned will cause the processor to enter the shutdown state when executing the RSM instruction). SMBASE is set to the default value of 30000H on RESET, but is not changed on SRESET. If the SMBASE register is changed during an SMM handler, all subsequent SMI# requests will initiate a state save at the new SMBASE. (See Figure 8-10.)

Figure 8-10. SMM Base Location (32-bit SMM Base slot, bits 31-0, at Register Offset 7EF8H)

The SMBASE slot in the SMM state save area is a feature used to indicate and change the SMI# jump vector location and the SMRAM save area. When bit 17 of the SMM Revision Identifier is set then this feature exists and the SMRAM base and consequently the jump vector are as indicated by the SMM Base slot. During the execution of the RSM instruction, the processor will read this slot and initialize the processor to use the new SMBASE during the next SMI#. During an SMI#, the processor will do its context save to the new SMRAM area pointed to by the SMBASE, store the current SMBASE in the SMM Base slot (offset 7EF8H), and then start execution of the new jump vector based on the current SMBASE.

The SMBASE must be a 32-Kbyte aligned, 32-bit integer that indicates a base address for the SMRAM context save area and the SMI# jump vector.
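The relationship between SMBASE, the state save slots and the revision identifier bits can be made concrete with a small C sketch. The function and macro names below are hypothetical; the offsets (8000H for the SMM region, 7EF8H for the SMBASE slot, 7EFCH for the SMM revision identifier), the meaning of bits 16 and 17, and the 32-Kbyte alignment rule are the ones given in this chapter.

```c
#include <stdbool.h>
#include <stdint.h>

#define SMM_REGION_OFFSET     0x8000u    /* handler entry; base of the state save offsets */
#define SMM_SLOT_REVISION_ID  0x7EFCu    /* SMM Revision Identifier (Dword) */
#define SMM_SLOT_SMBASE       0x7EF8u    /* SMBASE Slot (Dword) */
#define SMM_REV_IO_TRAP       (1u << 16) /* I/O Trap Restart supported */
#define SMM_REV_SMBASE_RELOC  (1u << 17) /* SMBASE relocation supported */

/* Physical address of a state save slot:
 * SMBASE + [8000H + Register Offset]. */
static inline uint32_t smram_slot_address(uint32_t smbase, uint32_t register_offset)
{
    return smbase + SMM_REGION_OFFSET + register_offset;
}

/* True if a handler may write new_smbase into the SMBASE slot: the
 * revision identifier must advertise relocation (bit 17) and the new
 * value must be 32-Kbyte aligned, otherwise RSM enters shutdown. */
static inline bool smbase_relocation_allowed(uint32_t revision_id, uint32_t new_smbase)
{
    return (revision_id & SMM_REV_SMBASE_RELOC) != 0
        && (new_smbase & 0x7FFFu) == 0;
}
```

With the default SMBASE of 30000H, the SMBASE slot sits at 3FEF8H and the top of the state save area at 3FFFFH, which matches the minimum SMRAM range of 38000H-3FFFFH.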
For example, when the processor first powers up, the minimum SMRAM area is from 38000H-3FFFFH. The default SMBASE is 30000H. Hence the starting address of the jump vector is calculated by:

    SMBASE + 8000H

while the starting address for the SMRAM state save area is calculated by:

    SMBASE + [8000H + 7FFFH]

Hence, when this feature is enabled, the SMRAM register map is addressed according to the above formulas. (See Figure 8-11.)

To change the SMRAM base address and SMM jump vector location, the SMM handler should modify the SMBASE slot. Upon executing an RSM instruction, the processor will read the SMBASE slot and store it internally. Upon recognition of the next SMI# request, the processor will use the new SMBASE slot for the SMRAM dump and SMI# jump vector.

If the modified SMBASE slot does not contain a 32-Kbyte aligned value, the RSM microcode will cause the processor to enter the shutdown state.

Figure 8-11. SMRAM Usage (SMBASE + 8000H + 7FFFH: start of state save; SMBASE + 8000H: SMM handler entry; SMBASE: base of SMRAM)

8.6 SMM System Design Considerations

8.6.1 SMRAM INTERFACE

The hardware designed to control the SMRAM space must follow these guidelines:

1. A provision should be made to allow for initialization of SMRAM space during system boot up. This initialization of SMRAM space must happen before the first occurrence of an SMI# interrupt. Initializing the SMRAM space must include installation of an SMM handler, and may include installation of related data structures necessary for particular SMM applications. The memory controller providing the interface to the SMRAM should provide a means for the initialization code to manually open the SMRAM space.
2. A minimum initial SMRAM address space of 38000H-3FFFFH should be decoded by the memory controller.
3. Alternate bus masters (such as DMA controllers) should not be allowed to access SMRAM space. Only the processor, either through SMI# or during initialization, should be allowed access to SMRAM.
4. In order to implement a zero-volt suspend function, the system must have access to all of normal system memory from within an SMM handler routine. If the SMRAM is going to overlay normal system memory, there must be a method of accessing any system memory that is located underneath SMRAM.

There are two potential schemes for locating the SMRAM, either overlaid to an address space on top of normal system memory, or placed in a distinct address space. (See Figure 8-12.) When SMRAM is overlaid on the top of normal system memory, the processor output signal SMIACT# must be used to distinguish SMRAM from main system memory. Additionally, if the overlaid normal memory is cacheable, both the processor internal cache and any second level caches must be empty before the first read of an SMM handler routine. If the SMM memory is cacheable, the caches must be empty before the first read of normal memory following an SMM handler routine. This is done by flushing the caches, and is required to maintain cache coherency. When the default SMRAM location is used, SMRAM is overlaid on top of system main memory (at 38000H through 3FFFFH).

If SMRAM is located in its own distinct memory space, which can be completely decoded with only the processor address signals, it is said to be non-overlaid. In this case, there are no new requirements for maintaining cache coherency.

Figure 8-12. SMRAM Location (non-overlaid: no need to flush caches; overlaid: caches must be flushed)

8.6.2 CACHE FLUSHES

The processor does not unconditionally flush its cache before entering SMM (this option is left to the system designer). If SMRAM is shadowed in a cacheable memory area that is visible to the application or operating system, it is necessary for the system to empty both the processor cache and any second level cache before entering SMM. That is, if SMRAM is in the same physical address location as the normal cacheable memory space, then an SMM read may hit the cache which would contain normal memory space code/data. If the SMM memory is cacheable, the normal read cycles after SMM may hit the cache, which may contain SMM code/data. In this case the cache should be empty before the first memory read cycle during SMM and before the first normal cycle after exiting SMM. (See Figure 8-13.)

The FLUSH# and KEN# signals can be used to ensure cache coherency when switching between normal and SMM modes. Cache flushing during SMM entry is accomplished by asserting the FLUSH# pin when SMI# is driven active. Cache flushing during SMM exit is accomplished by asserting the FLUSH# pin after the SMIACT# pin is deasserted (within 1 CLK). To guarantee this behavior, the constraints on setup and hold timings on the interaction of FLUSH# and SMIACT# as specified for a processor should be followed.

If the SMRAM area is overlaid over normal memory and if the system designer does not want to flush the caches upon leaving SMM then references to the SMRAM area should not be cached. It is the obligation of the system designer to ensure that the KEN# pin is sampled inactive during all references to the SMRAM area. Figures 8-14 and 8-15 illustrate a cached and non-cached SMM using FLUSH# and KEN#.

Figure 8-13. FLUSH# Mechanism during SMM (signals SMI# and SMIACT#; phases State Save, SMM Handler, State Resume; the cache must be empty on entry to and on exit from SMM)

Figure 8-14. Cached SMM (signals SMI#, SMIACT#, FLUSH#; phases State Save, SMM Handler, State Resume, Normal Cycle)

Figure 8-15. Non-Cached SMM (signals SMI#, SMIACT#, KEN#, FLUSH#; phases State Save, SMM Handler, State Resume, Normal Cycle)

8.6.2.1 Write-Back Enhanced IntelDX2 Processor System Management Mode and Cache Flushing

Regardless of the on-chip cache mode (i.e., either write-through or write-back) it is recommended that SMRAM be non-overlaid. This provides the greatest freedom for caching of both SMRAM and normal memory, provides a simplified memory controller design, and eliminates the performance penalty of flushing.

In general, cache flushing is not required when the SMRAM and normal memory are not overlaid. Table 8-7 gives the cache flushing requirements for entering and exiting SMM, when the SMRAM is not overlaid with normal memory space.

SMRAM can not be cached as write-back lines. If SMRAM is cached, it should be cached only as write-through lines. This is because dirty lines can not be written back to SMRAM upon exit from SMM. The de-assertion of SMIACT# signals that the processor is exiting SMM, and is used to assert FLUSH#. By the time the write back of dirty lines occurs, SMIACT# would already be inactive, so the SMRAM can no longer be decoded. When the SMRAM is cached as write-through, this problem will not occur.

Table 8-7. Cache Flushing (Non-Overlaid SMRAM)

Normal Memory Cacheable    SMRAM Cacheable    FLUSH Entering SMM
No                         No                 No
No                         WT                 No
WT                         No                 No
WB                         No                 No, but Snoop WBs must go to normal memory space.
WT                         WT                 No
WB                         WT                 No, but Snoop and Replacement WBs must go to normal memory space.

Coherency requirements must be met when the normal memory is cached in write-back mode. In this case, the snoop and replacement write-backs that occur during SMM must go to normal memory, even though SMIACT# is active. This requirement is compatible with SMM security requirements, because these write backs can not decode the SMRAM, and the memory system must be able to handle this situation properly.

If SMRAM is overlaid with normal memory space, additional system design features are needed to ensure that cache coherency is maintained.
Table 8-8 lists the cache flushing requirements for entering and exiting the SMM when the SMRAM is overlaid with normal memory space.

Table 8-8. Cache Flushing (Overlaid SMRAM)

Normal Memory Cacheable    SMRAM Cacheable    FLUSH Entering SMM    FLUSH Exiting SMM
No                         No                 No                    No
No                         WT                 No                    Yes
WT or WB                   No                 Yes                   No
WT or WB                   WT                 Yes                   Yes

If SMI# and FLUSH# are asserted together, the Write-Back Enhanced IntelDX2 processor guarantees that the FLUSH# will be recognized first, followed by the SMI#. If the cache is configured in the write-back mode, the modified lines will be written back to the normal user space, followed by the two special cycles. The SMI# will then be recognized and the transition to SMM will occur, as shown in Figure 8-16.

Cache flushing during SMM exit is accomplished by asserting the FLUSH# pin after the SMIACT# pin is de-asserted (within 1 CLK). To guarantee this behavior, the constraints on setup and hold timings on the interaction of FLUSH# and SMIACT# as specified for the Write-Back Enhanced IntelDX2 processor should be followed.

The WBINVD instruction should not be used to flush the cache when exiting SMM. Instead, the FLUSH# pin should be asserted after the SMIACT# pin is de-asserted (within 1 CLK).

The cache coherency requirements associated with SMM and write-through vs. write-back caches apply to second level cache control designs as well. The appropriate second level cache flushing is also required upon entering and exiting the SMM.

Figure 8-16. Write-Back Enhanced IntelDX2 Processor Cache Flushing for Overlaid SMRAM upon Entry and Exit of Cached SMM (signals SMI#, SMIACT#, FLUSH#; phases Flush Cache, Write Back Cycles, State Save, SMM Handler, State Resume, Normal Cycle; the cache must be empty on entry and on exit)

Snoops During SMM

Snoop cycles are allowed during SMM. However, because the SMRAM is always cached as write-through, there can never be a snoop hit to a modified line in the SMRAM address space. Consequently, if there is a snoop hit to a modified line, it will correspond to the normal address space. In this case, even though SMIACT# is asserted, the memory controller must drive the snoop write-back cycle to the normal memory space and not to the SMRAM address space.

If the overlaid normal memory is cacheable, FLUSH# must be asserted when entering SMM, causing all modified lines of normal memory to be written back. As a result, there can not be a snoop hit to a modified line in the cacheable normal memory space that is overlaid with the SMRAM space. If the overlaid normal memory is not cacheable, no flushing is necessary when entering SMM.

If normal memory is not overlaid with SMRAM, no flushing is required upon entering SMM and it is possible that a snoop can hit a modified line cached from anywhere in normal memory space while the processor is in SMM.

8.6.3 A20M# PIN AND SMBASE RELOCATION

Systems based on PC-compatible architecture contain a feature that enables the processor address bit A20 to be forced to 0. This limits physical memory to a maximum of 1 Mbyte, and is provided to ensure compatibility with those programs that relied on the physical address wrap around functionality of the 8088 processor. The A20M# pin on Intel486 processors provides this function. When A20M# is active, all external bus cycles will drive A20 low, and all internal cache accesses will be performed with A20 low.

The A20M# pin is recognized while the processor is in SMM. The functionality of the A20M# input must be recognized in the following two instances:

1. If the SMM handler needs to access system memory space above 1 Mbyte (for example, when saving memory to disk for a zero-volt suspend), the A20M# pin must be de-asserted before the memory above 1 Mbyte is addressed.
2. If SMRAM has been relocated to address space above 1 Mbyte, and A20M# is active upon entering SMM, the processor will attempt to access SMRAM at the relocated address, but with A20 low. This could cause the system to crash, because there would be no valid SMM interrupt handler at the accessed location.

In order to account for the above two situations, the system designer must ensure that A20M# is de-asserted on entry to SMM. A20M# must be driven inactive before the first cycle of the SMM state save, and must be returned to its original level after the last cycle of the SMM state restore. This can be done by blocking the assertion of A20M# whenever SMIACT# is active.

8.6.4 PROCESSOR RESET DURING SMM

The system designer should take into account the following restrictions while implementing the processor RESET logic.

1. When running software written for the 80286 processor, a processor SRESET is used to switch the processor from Protected mode to Real mode. Note that SRESET has a higher interrupt priority than SMI#. When the processor is in SMM, the SRESET to the processor during SMM should be blocked until the processor exits SMM. SRESET must be blocked beginning from the time when SMI# is driven active and ending at least 20 CLK cycles after SMIACT# is de-asserted. Be careful not to block the global system RESET, which may be necessary to recover from a system crash.
2. During execution of the RSM instruction to exit SMM, there is a small time window between the de-assertion of SMIACT# and the completion of the RSM microcode. If SRESET is asserted during this window, it is possible that the SMRAM space will be violated. The system designer must guarantee that SRESET is blocked until at least 20 processor clock cycles after SMIACT# has been driven inactive.
3. Any request for a processor SRESET for the purpose of switching the processor from Protected mode to Real mode must be acknowledged after the processor has exited SMM.
In order to maintain software transparency, the system logic must latch any SRESET signals that are blocked during SMM.

8.6.5 SMM AND SECOND LEVEL WRITE BUFFERS

Before an Intel486 processor enters SMM, it empties its internal write buffers. This is necessary so that the data in the write buffers is written to normal memory space, not SMM space. Once the processor is ready to begin writing an SMM state save to SMRAM, it asserts the SMIACT# signal. SMIACT# may be driven active by the processor before the system memory controller has had an opportunity to empty the second level write buffers.

To prevent the data from these second level write buffers from being written to the wrong location, the system memory controller needs to direct the memory write cycles to either SMM space or normal memory space. This can be accomplished by saving the status of SMIACT# along with the address for each word in the write buffers.

8.6.6 NESTED SMI#s AND I/O RESTART

Special care must be taken when executing an SMM handler for the purpose of restarting an I/O instruction. When the processor executes an RSM instruction with the I/O restart slot set, the restored EIP is modified to point to the instruction immediately preceding the SMI# request, so that the I/O instruction can be re-executed. If a new SMI# request is received while the processor is executing an SMM handler, the processor will service this SMI# request before restarting the original I/O instruction. If the I/O restart slot is set when the processor executes the RSM instruction for the second SMM handler, the RSM microcode will decrement the restored EIP again. EIP now points to an address different from the originally interrupted instruction, and the processor will begin execution of the interrupted application code at an incorrect entry point.

To prevent this from occurring, the SMM handler routine must not set the I/O restart slot during the second of two consecutive SMM handlers.
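The rule in section 8.6.6 can be sketched in C as a guard around the value a handler writes to the I/O restart slot. This is an illustrative sketch only: the function name and the `nested_smi` flag are hypothetical (this chapter does not say how a handler detects that it is the second of two consecutive handlers, so a real design would need its own bookkeeping), while the slot values 00H and 0FFH are the ones given for the I/O instruction restart slot.

```c
#include <stdbool.h>
#include <stdint.h>

#define IO_RESTART_NONE    0x0000u /* value at SMM entry: do not restart */
#define IO_RESTART_REQUEST 0x00FFu /* ask RSM to re-execute the trapped I/O */

/* Decide what to store in the I/O restart slot (a word at offset 7F00H).
 * The slot is set only when all three conditions hold:
 *   - the SMI# was generated on an I/O instruction boundary,
 *   - the handler actually wants the I/O instruction re-executed,
 *   - this is not the second of two back-to-back SMM handlers. */
static inline uint16_t smm_io_restart_value(bool smi_on_io_boundary,
                                            bool want_restart,
                                            bool nested_smi)
{
    if (smi_on_io_boundary && want_restart && !nested_smi) {
        return IO_RESTART_REQUEST;
    }
    return IO_RESTART_NONE;
}
```

Returning 00H in every other case leaves the slot at the value the processor wrote on SMM entry, so the restored EIP is adjusted at most once.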
8.7 SMM Software Considerations

8.7.1 SMM CODE CONSIDERATIONS

The default operand size and the default address size are 16 bits; however, operand-size override and address-size override prefixes can be used as needed to directly access data anywhere within the 4-Gbyte logical address space.

With operand-size override prefixes, the SMM handler can use jumps, calls, and returns, to transfer control to any location within the 4-Gbyte space. Note, however, the following restrictions:

• Any control transfer that does not have an operand-size override prefix truncates EIP to 16 low-order bits.
• Due to the Real mode style of base-address formation, a far jump or call cannot transfer control to a segment with a base address of more than 20 bits (one megabyte).

8.7.2 EXCEPTION HANDLING

Upon entry into SMM, external interrupts that require handlers are disabled (the IF bit in the EFLAGS is cleared). This is necessary because, while the processor is in SMM, it is running in a separate memory space. Consequently the vectors stored in the interrupt descriptor table (IDT) for the prior mode are not applicable. Before allowing exception handling (or software interrupts), the SMM program must initialize new interrupt and exception vectors. The interrupt vector table for SMM has the same format as for Real mode. Until the interrupt vector table is correctly initialized, the SMM handler must not generate an exception (or software interrupt). Even though hardware interrupts are disabled, exceptions and software interrupts can still occur. Only a correctly written SMM handler can prevent internal exceptions. When new exception vectors are initialized, internal exceptions can be serviced. The following are the restrictions:

1. Due to the Real mode style of base address formation, an interrupt or exception cannot transfer control to a segment with a base address of more than 20 bits.
2. An interrupt or exception cannot transfer control to a segment offset of more than 16 bits (64 Kbytes).
3. If exceptions or interrupts are allowed to occur, only the low order 16 bits of the return address (EIP) are pushed onto the stack. If the offset of the interrupted procedure is greater than 64 Kbytes, it is not possible for the interrupt/exception handler to return control to that procedure. (One work-around could be to perform software adjustment of the return address on the stack.)
4. The SMBASE Relocation feature affects the way the processor will return from an interrupt or exception during an SMI# handler.

8.7.3 HALT DURING SMM

HALT should not be executed during SMM, unless interrupts have been enabled (see section 8.7.2 "Exception Handling"). Interrupts are disabled in SMM and INTR, NMI, and SMI# are the only events that take the processor out of HALT.

8.7.4 RELOCATING SMRAM TO AN ADDRESS ABOVE ONE MEGABYTE

Within SMM (or Real mode), the segment base registers can only be updated by changing the segment register. The segment registers contain only 16 bits, which allows only 20 bits to be used for a segment base address (the segment register is shifted left four bits to determine the segment base address). If SMRAM is relocated to an address above one megabyte, the segment registers can no longer be initialized to point to SMRAM.

These areas can still be accessed by using address override prefixes to generate an offset to the correct address. For example, if the SMBASE has been relocated immediately below 16M, the DS and ES registers are still initialized to 0000 0000H. We can still access data in SMRAM by using 32-bit displacement registers:

    mov esi, 00FFxxxxH    ; 64K segment immediately below 16M
    mov ax, ds:[esi]

9.0 HARDWARE INTERFACE

9.1 Introduction

The Intel486 processor has separate parallel buses for data and addresses. The bidirectional data bus is 32 bits in width.
The address bus consists of two components: 30 address lines (A2-A31) and 4 byte enable lines (BE0#-BE3#). The address lines form the upper 30 bits of the address and the byte enables select individual bytes within a 4-byte location. The address lines are bidirectional for use in cache line invalidations. (See Figure 9-1.)

The Intel486 processor's burst bus mechanism enables high-speed cache fills from external memory. Burst cycles can strobe data into the processor at a rate of one item every clock. Non-burst cycles have a maximum rate of one item every two clocks. Burst cycles are not limited to cache fills: all read bus cycles requiring more than a single data cycle can be bursted. The Write-Back Enhanced IntelDX2™ processor can also burst write cycles.

During bus hold, the Intel486 processor relinquishes control of the local bus by floating its address, data and control buses.

The Intel486 processor has an address hold feature in addition to bus hold. During address hold, only the address bus is floated; the data and control buses can remain active. Address hold is used for cache line invalidations.

The Intel486 supports the IEEE 1149.1 boundary scan.

This section provides a brief description of the Intel486 processor input and output signals arranged by functional groups. The # symbol at the end of a signal name indicates that the active or asserted state occurs when the signal is at a low voltage. When a # is not present after the signal name, the signal is active at high voltage level. The term "ready" is used to indicate that the cycle is terminated with RDY# or BRDY#.

This section and section 10, "Bus Operation," describe bus cycles and data cycles. A bus cycle is at least two clocks long and begins with ADS# active in the first clock and RDY# and/or BRDY# active in the last clock. Data is transferred to or from the Intel486 processor during a data cycle. A bus cycle contains one or more data cycles.
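The split of a byte address into address lines and byte enables, as described in this introduction, can be sketched as follows. The function name and the list representation are illustrative assumptions; the sketch covers only a transfer that fits inside one 4-byte location.

```python
def bus_address(byte_addr: int, size: int):
    """Return (value driven on A31-A2, [BE0#, BE1#, BE2#, BE3#]) for one transfer.

    Byte enables are active LOW, so 0 means the byte is selected.
    The transfer must fit inside a single 4-byte location.
    """
    offset = byte_addr & 0x3
    if not 1 <= size <= 4 - offset:
        raise ValueError("transfer must fit inside one 4-byte location")
    a31_a2 = (byte_addr & 0xFFFFFFFF) >> 2   # upper 30 bits of the address
    be_n = [0 if offset <= lane < offset + size else 1 for lane in range(4)]
    return a31_a2, be_n


# A 2-byte access at byte address 1002H selects the upper two bytes.
print(bus_address(0x1002, 2))
```

Transfers that cross a 4-byte boundary need more than one bus cycle and are outside the scope of this sketch.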
9.2 Signal Descriptions

9.2.1 CLOCK (CLK)

CLK provides the fundamental timing and the internal operating frequency for the Intel486 processor. All external timing parameters are specified with respect to the rising edge of CLK.

The Intel486 processor can operate over a wide frequency range; however, the CLK frequency cannot change rapidly while RESET is inactive. The CLK frequency must be stable for proper chip operation because a single edge of CLK is used internally to generate two phases. CLK only needs TTL levels for proper operation. Figure 9-2 illustrates the CLK waveform.

9.2.2 INTELDX4 PROCESSOR CLOCK MULTIPLIER SELECTABLE INPUT (CLKMUL)

The IntelDX4 processor differs from the IntelDX2 processor in that it provides for two internal clock multiplier ratios: speed doubled mode and speed tripled mode. Speed doubled mode is identical to the IntelDX2 processor mode of operation where the internal core is operating at twice the external bus frequency. Selecting speed tripled mode causes the internal core frequency to operate at three times the external bus frequency. The IntelDX4 processor determines the desired clock multiplier ratio by sampling the status of the CLKMUL input during cold (power on) processor resets. The clock multiplier ratio cannot be changed during warm resets. Also, SRESET cannot be used to select the clock multiplier ratio.
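The multiplier selection can be summarized with a short sketch that follows Table 9-1: CLKMUL low at cold reset selects speed doubled mode, and CLKMUL high or not driven selects speed tripled mode. The function name and the string encoding of the pin level are assumptions made for illustration.

```python
def core_frequency_mhz(bus_mhz: float, clkmul_at_reset: str) -> float:
    """Internal core frequency of an IntelDX4 processor.

    clkmul_at_reset is the CLKMUL level sampled during a cold reset:
    'low', 'high' or 'floating' (a hypothetical encoding for this sketch).
    """
    if clkmul_at_reset == "low":
        multiplier = 2   # speed doubled mode
    elif clkmul_at_reset in ("high", "floating"):
        multiplier = 3   # speed tripled mode, also the undriven default
    else:
        raise ValueError("unknown CLKMUL level")
    return bus_mhz * multiplier


print(core_frequency_mhz(25, "floating"), core_frequency_mhz(50, "low"))
```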
Figure 9-1. Functional Signal Groupings
Notes to Figure 9-1:
• IntelDX4 processor only.
• Not on Intel486 SX processor in PGA package.
• Pins on Write-Back Enhanced IntelDX2 when in Enhanced Bus Mode/write back cache mode.
• Not on Intel486 SX and IntelSX2 processors.
• SMI#, SMIACT#, STPCLK#, SRESET, UP# not on 50-MHz Intel486 DX processor.

Figure 9-2. CLK Waveform (tx = input setup times; ty = input hold times, output float, valid and hold times)

Figure 9-3. Voltage Detect (VOLDET) Sense Pin

To determine which clock multiplier is desired, the IntelDX4 processor samples the status of CLKMUL while RESET is active. If the CLKMUL input is driven low during RESET, the frequency of the core will be twice the external bus frequency (speed doubled mode). If driven high or left floating, speed tripled mode is selected. (See Table 9-1.)

In order to allow maximum flexibility, CLKMUL can be jumper-configurable to either VCC (speed tripled mode) or VSS (speed doubled mode). (See Figure 9-3.)

The clock multiplier selection method is fully backward compatible with Intel486 processor-based system designs. The CLKMUL signal occupies a pin which is labeled as an 'INC' on other Intel486 processors.
Therefore, this pin is not driven in other Intel486 processor system designs. The IntelDX4 processor contains an internal pull-up resistor on the CLKMUL signal. As shown in Table 9-1, when CLKMUL is not driven, the internal core frequency defaults to speed tripled mode.

The internal pull-up resistor on the CLKMUL pin is disabled while the IntelDX4 processor is in the Stop Grant or Stop Clock modes. This prevents a low level DC current path from drawing current while in the Stop Grant or Stop Clock states on a system with CLKMUL connected to VSS.

Table 9-1. Clock Multiplier Selection

CLKMUL at RESET      Clock Multiplier   External Clock Freq. (MHz)   Internal Clock Freq. (MHz)
VCC or Not Driven    3                  25                           75
                                        33                           100
VSS                  2                  50                           100

9.2.3 ADDRESS BUS (A31-A2, BE0#-BE3#)

A31-A2 and BE0#-BE3# form the address bus and provide physical memory and I/O port addresses. The Intel486 processor is capable of addressing 4 gigabytes of physical memory space (00000000H through FFFFFFFFH), and 64 Kbytes of I/O address space (00000000H through 0000FFFFH). A31-A2 identify addresses to a 4-byte location. BE0#-BE3# identify which bytes within the 4-byte location are involved in the current transfer.

Addresses are driven back into the Intel486 processor over A31-A4 during cache line invalidations. The address lines are active HIGH. When used as inputs into the processor, A31-A4 must meet the setup and hold times, t22 and t23. A31-A2 are not driven during bus or address hold.

The byte enable outputs, BE0#-BE3#, determine which bytes must be driven valid for read and write cycles to external memory.

• BE3# applies to D24-D31
• BE2# applies to D16-D23
• BE1# applies to D8-D15
• BE0# applies to D0-D7

BE0#-BE3# can be decoded to generate A0, A1 and BHE# signals used in 8- and 16-bit systems (see Table 10-5). BE0#-BE3# are active LOW and are not driven during bus hold.
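The byte enable mapping above can be sketched in a few lines. Table 10-5 is not reproduced in this section, so the decode below assumes only that A1 and A0 give the offset of the lowest enabled byte; BHE# generation is left out. The function names are hypothetical.

```python
def low_address_bits(be_n):
    """be_n = [BE0#, BE1#, BE2#, BE3#], active LOW.

    Returns (A1, A0) as the offset of the lowest enabled byte. This rule is
    an assumption of the sketch; the full decode is given in Table 10-5.
    """
    for lane, level in enumerate(be_n):
        if level == 0:
            return lane >> 1, lane & 1
    raise ValueError("no byte enable asserted")


def data_lanes(be_n):
    """Data pins that carry valid bytes for the given byte enables."""
    return [f"D{8 * lane}-D{8 * lane + 7}"
            for lane, level in enumerate(be_n) if level == 0]


print(low_address_bits([1, 1, 0, 0]), data_lanes([1, 1, 0, 0]))
```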
9.2.4 DATA LINES (D31-D0)

The bidirectional lines, D31-D0, form the data bus for the Intel486 processor. D0-D7 define the least significant byte and D24-D31 the most significant byte. Data transfers to 8- or 16-bit devices are possible using the data bus sizing feature controlled by the BS8# or BS16# input pins.

D31-D0 are active HIGH. For reads, D31-D0 must meet the setup and hold times, t22 and t23. D31-D0 are not driven during read cycles and bus hold.

9.2.5 PARITY

Data Parity Input/Outputs (DP0-DP3)

DP0-DP3 are the data parity pins for the processor. There is one pin for each byte of the data bus. Even parity is generated or checked by the parity generators/checkers. Even parity means that there are an even number of HIGH inputs on the eight corresponding data bus pins and parity pin.

Data parity is generated on all write data cycles with the same timing as the data driven by the Intel486 processor. Even parity information must be driven back to the Intel486 processor on these pins with the same timing as read information to ensure that the correct parity check status is indicated by the Intel486 processor.

Input signals on DP0-DP3 must meet setup and hold times t22 and t23 for proper operation.

Parity Status Output (PCHK#)

Parity status is driven on the PCHK# pin, and a parity error is indicated by this pin being LOW. PCHK# is driven the clock after ready for read operations to indicate the parity status for the data sampled at the end of the previous clock. Parity is checked during code reads, memory reads and I/O reads. Parity is not checked during interrupt acknowledge cycles. PCHK# only checks the parity status for enabled bytes as indicated by the byte enable and bus size signals. It is valid only in the clock immediately after read data is returned to the Intel486 processor. At all other times it is inactive (HIGH). PCHK# is never floated.

Driving PCHK# is the only effect that bad input parity has on the Intel486 processor.
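The even parity rule and the PCHK# check can be sketched as below. The function names are hypothetical, and the sketch assumes that DP0 covers D0-D7 through DP3 covering D24-D31, which this section implies but does not state pin by pin. Bus size inputs are not modeled.

```python
def even_parity_bit(data_byte: int) -> int:
    """Parity bit that makes the number of HIGH levels across the eight
    data pins plus the parity pin even."""
    return bin(data_byte & 0xFF).count("1") & 1


def data_parity(dword: int):
    """[DP0, DP1, DP2, DP3] for a 32-bit value (assumes DP0 covers D0-D7)."""
    return [even_parity_bit(dword >> (8 * lane)) for lane in range(4)]


def pchk_n(dword: int, dp, be_n) -> int:
    """PCHK# level after a read: 0 (LOW) if any enabled byte has bad parity.

    be_n = [BE0#, BE1#, BE2#, BE3#], active LOW; disabled bytes are not checked.
    """
    for lane in range(4):
        if be_n[lane] == 0 and even_parity_bit(dword >> (8 * lane)) != dp[lane]:
            return 0
    return 1


value = 0x01030007
print(data_parity(value), pchk_n(value, data_parity(value), [0, 0, 0, 0]))
```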
The Intel486 processor will not vector to a bus error interrupt when bad data parity is returned. In systems that will not employ parity, PCHK# can be ignored. In systems not using parity, DP0-DP3 should be connected to VCC through a pull-up resistor. The values read on these pins do not affect program execution. It is the responsibility of the system to take appropriate actions if a parity error occurs.

9.2.6 BUS CYCLE DEFINITION

M/IO#, D/C#, W/R# Outputs

M/IO#, D/C# and W/R# are the primary bus cycle definition signals. They are driven valid as the ADS# signal is asserted. M/IO# distinguishes between memory and I/O cycles, D/C# distinguishes between data and control cycles and W/R# distinguishes between write and read cycles.

Bus cycle definitions as a function of M/IO#, D/C# and W/R# are given in Table 9-2. Note there is a difference between the Intel486 processor and Intel386™ processor bus cycle definitions. The halt bus cycle type has been moved to location 001 in the Intel486 processor from location 101 in the Intel386 processor. Location 101 is now reserved and will never be generated by the Intel486 processor.

Special bus cycles are discussed in section 10.2.11, "Special Bus Cycles".

Table 9-2. ADS# Initiated Bus Cycle Definitions

M/IO#   D/C#   W/R#   Bus Cycle Initiated
0       0      0      Interrupt Acknowledge
0       0      1      Halt/Special Cycle
0       1      0      I/O Read
0       1      1      I/O Write
1       0      0      Code Read
1       0      1      Reserved
1       1      0      Memory Read
1       1      1      Memory Write

Bus Lock Output (LOCK#)

LOCK# indicates that the Intel486 processor is running a read-modify-write cycle where the external bus must not be relinquished between the read and write cycles. Read-modify-write cycles are used to implement memory-based semaphores. Multiple reads or writes can be locked.

When LOCK# is asserted, the current bus cycle is locked and the Intel486 processor should be allowed exclusive access to the system bus. LOCK# goes active in the first clock of the first locked bus cycle and goes inactive after ready is returned indicating the last locked bus cycle.

The Intel486 processor will not acknowledge bus hold when LOCK# is asserted (though it will allow an address hold). LOCK# is active LOW and is floated during bus hold. Locked read cycles will not be transformed into cache fill cycles if KEN# is returned active. Refer to section 10.2.6, "Locked Cycles," for a detailed discussion of Locked bus cycles.

Pseudo-Lock Output (PLOCK#)

The pseudo-lock feature allows atomic reads and writes of memory operands greater than 32 bits. These operands require more than one cycle to transfer. The Intel486 processor asserts PLOCK# during segment table descriptor reads (64 bits) and cache line fills (128 bits).

When PLOCK# is asserted no other master will be given control of the bus between cycles. A bus hold request (HOLD) is not acknowledged during pseudo-locked reads and writes, with one exception. During non-cacheable non-bursted code prefetches, HOLD is recognized on memory cycle boundaries even though PLOCK# is asserted. The Intel486 processor will drive PLOCK# active until the addresses for the last bus cycle of the transaction have been driven regardless of whether BRDY# or RDY# are returned.

A pseudo-locked transfer is meaningful only if the memory operand is aligned and if it is completely contained within a single cache line.

Because PLOCK# is a function of the bus size and KEN# inputs, PLOCK# should be sampled only in the clock ready is returned. This pin is active LOW and is not driven during bus hold. Refer to section 10.2.7, "Pseudo-Locked Cycles."

9.2.6.1 PLOCK# Floating Point Considerations

For processors with an on-chip FPU, the following must be noted for PLOCK# operation. A 64-bit floating point number must be aligned to an 8-byte boundary to guarantee an atomic access. Normally PLOCK# and BLAST# are inverse of each other. However, during the first cycle of a 64-bit floating point write, both PLOCK# and BLAST# will be asserted. Intel486 processors with on-chip FPUs also assert PLOCK# during floating point long reads and writes (64 bits), segment table descriptor reads (64 bits) and cache line fills (128 bits).

9.2.7 BUS CONTROL

The bus control signals allow the Intel486 processor to indicate when a bus cycle has begun, and allow other system hardware to control burst cycles, data bus width and bus cycle termination.

Address Status Output (ADS#)

The ADS# output indicates that the address and bus cycle definition signals are valid. This signal will go active in the first clock of a bus cycle and go inactive in the second and subsequent clocks of the cycle. ADS# is also inactive when the bus is idle.

ADS# is used by the external bus circuitry as the indication that the Intel486 processor has started a bus cycle. The external circuit must sample the bus cycle definition pins on the next rising edge of the clock after ADS# is driven active.

ADS# is active LOW and is not driven during bus hold.

Non-burst Ready Input (RDY#)

RDY# indicates that the current bus cycle is complete. In response to a read, RDY# indicates that the external system has presented valid data on the data pins. In response to a write request, RDY# indicates that the external system has accepted the Intel486 processor data. RDY# is ignored when the bus is idle and at the end of the first clock of the bus cycle. Because RDY# is sampled during address hold, data can be returned to the processor when AHOLD is active.

RDY# is active LOW, and is not provided with an internal pull-up resistor. This input must satisfy setup and hold times t16 and t17 for proper chip operation.

9.2.8 BURST CONTROL

Burst Ready Input (BRDY#)

BRDY# performs the same function during a burst cycle that RDY# performs during a non-burst cycle. BRDY# indicates that the external system has presented valid data on the data pins in response to a read or that the external system has accepted the Intel486 processor data in response to a write. BRDY# is ignored when the bus is idle and at the end of the first clock in a bus cycle.

During a burst cycle, BRDY# will be sampled each clock, and if active, the data presented on the data bus pins will be strobed into the Intel486 processor. ADS# is negated during the second through last data cycles in the burst, but address lines A2-A3 and byte enables will change to reflect the next data item expected by the Intel486 processor.

If RDY# is returned simultaneously with BRDY#, BRDY# is ignored and the burst cycle is prematurely aborted. An additional complete bus cycle will be initiated after an aborted burst cycle if the cache line fill was not complete. BRDY# is treated as a normal ready for the last data cycle in a burst transfer or for non-burstable cycles. Refer to section 10.2.2, "Multiple and Burst Cycle Bus Transfers," for burst cycle timing.

BRDY# is active LOW and is provided with a small internal pull-up resistor. BRDY# must satisfy the setup and hold times t16 and t17.

Burst Last Output (BLAST#)

BLAST# indicates that the next time BRDY# is returned it will be treated as a normal RDY#, terminating the line fill or other multiple-data-cycle transfer. BLAST# is active for all bus cycles regardless of whether they are cacheable or not. This pin is active LOW and is not driven during bus hold.

9.2.9 INTERRUPT SIGNALS

The interrupt signals can interrupt or suspend execution of the processor's current instruction stream.

Reset Input (RESET)

The RESET input must be used at power-up to initialize the processor. The RESET input forces the processor to begin execution at a known state. The processor cannot begin execution of instructions until at least 1 ms after VCC and CLK have reached their proper DC and AC specifications. The RESET pin should remain active during this time to ensure proper processor operation. However, for warm boot-ups RESET should remain active for at least 15 CLK periods. RESET is active HIGH. RESET is asynchronous but must meet setup and hold times t20 and t21 for recognition in any specific clock.

RESET will reset SMBASE to the default value of 30000H. If SMBASE relocation is not used, the RESET signal can be used as the only reset. (See section 8, "System Management Mode Architecture.")

The Intel486 processor will be placed in the Power Down Mode if UP# is sampled active at the falling edge of RESET.

Soft Reset Input (SRESET)

The SRESET (Soft RESET) input has the same functions as RESET, but does not change the SMBASE, and UP# is not sampled on the falling edge of SRESET. If SMBASE relocation is used by the system, the soft resets should be handled using the SRESET input. The SRESET signal should not be used for the cold boot-up power-on reset.

The SRESET input pin is provided to save the status of SMBASE during Intel 286 processor-compatible mode change. SRESET leaves the location of SMBASE intact while resetting other units, including the on-chip cache. (See section 9.2.18.4, "Soft Reset," for Write-Back Enhanced IntelDX2 processor differences.) For compatibility, the system should use SRESET to flush the on-chip cache. The FLUSH# input pin should be used to flush the on-chip cache. SRESET should not be used to initiate test modes.

System Management Interrupt Request Input (SMI#)

SMI# is the system management mode interrupt request signal. The SMI# request is acknowledged by the SMIACT# signal. After the SMI# interrupt is recognized, the SMI# signal will be masked internally until the RSM instruction is executed and the interrupt service routine is complete.
SMI# is falling-edge sensitive after internal synchronization. The SMI# input must be held inactive for at least four clocks after it is asserted to reset the edge triggered logic. SMI# is provided with a pull-up resistor to maintain compatibility with designs which do not use this feature. SMI# is an asynchronous signal, but setup and hold times, t20 and t21, must be met in order to guarantee recognition on a specific clock.

System Management Mode Active Output (SMIACT#)

SMIACT# indicates that the processor is operating in System Management Mode. The processor asserts SMIACT# in response to an SMI interrupt request on the SMI# pin. SMIACT# is driven active after the processor has completed all pending write cycles (including emptying the write buffers), and before the first access to SMRAM when the processor saves (writes) its state (or context) to SMRAM. SMIACT# remains active until the last access to SMRAM when the processor restores (reads) its state from SMRAM. The SMIACT# signal does not float in response to HOLD. The SMIACT# signal is used by the system logic to decode SMRAM.

Maskable Interrupt Request Input (INTR)

INTR indicates that an external interrupt has been generated. Interrupt processing is initiated if the IF flag is active in the EFLAGS register.

The Intel486 processor will generate two locked interrupt acknowledge bus cycles in response to asserting the INTR pin. An 8-bit interrupt number will be latched from an external interrupt controller at the end of the second interrupt acknowledge cycle. INTR must remain active until the interrupt acknowledges have been performed to assure program interruption. Refer to section 10.2.10, "Interrupt Acknowledge," for a detailed discussion of interrupt acknowledge cycles.

The INTR pin is active HIGH and is not provided with an internal pull-down resistor. INTR is asynchronous, but the INTR setup and hold times, t20 and t21, must be met to assure recognition on any specific clock.
Non-maskable Interrupt Request Input (NMI)

NMI is the non-maskable interrupt request signal. Asserting NMI causes an interrupt with an internally supplied vector value of 2. External interrupt acknowledge cycles are not generated because the NMI interrupt vector is internally generated. When NMI processing begins, the NMI signal will be masked internally until the IRET instruction is executed.

NMI is rising edge sensitive after internal synchronization. NMI must be held LOW for at least four CLK periods before this rising edge for proper operation. NMI is not provided with an internal pull-down resistor. NMI is asynchronous but setup and hold times, t20 and t21, must be met to assure recognition on any specific clock.

Stop Clock Interrupt Request Input (STPCLK#)

The Intel486 processor provides an interrupt mechanism, STPCLK#, that allows system hardware to control the power consumption of the processor by stopping the internal clock (output of the PLL) to the processor core in a controlled manner. This low-power state is called the Stop Grant state. In addition, the STPCLK# interrupt allows the system to change the input frequency within the specified range or completely stop the CLK input frequency (input to the PLL). If the CLK input is completely stopped, the processor enters into the Stop Clock state, the lowest power state. If the frequency is changed or stopped, the Intel486 processor will not return to the Stop Grant state until the CLK input has been running at a constant frequency for the time period necessary to stabilize the PLL (minimum of 1 ms).

The Intel486 processor will generate a Stop Grant bus cycle in response to the STPCLK# interrupt request. STPCLK# is active LOW and is provided with an internal pull-up resistor. STPCLK# is an asynchronous signal, but must remain active until the processor issues the Stop Grant bus cycle.
(Refer to section 10.2.11.3, "Stop Grant Indication Cycle.")

9.2.10 BUS ARBITRATION SIGNALS

This section describes the mechanism by which the processor relinquishes control of its local bus when requested by another bus master.

Bus Request Output (BREQ)

The Intel486 processor asserts BREQ whenever a bus cycle is pending internally. Thus, BREQ is always asserted in the first clock of a bus cycle, along with ADS#. Furthermore, if the Intel486 processor is currently not driving the bus (due to HOLD, AHOLD, or BOFF#), BREQ is asserted in the same clock that ADS# would have been asserted if the Intel486 processor were driving the bus. After the first clock of the bus cycle, BREQ may change state. It will be asserted if additional cycles are necessary to complete a transfer (via BS8#, BS16#, KEN#), or if more cycles are pending internally. However, if no additional cycles are necessary to complete the current transfer, BREQ can be negated before ready comes back for the current cycle. External logic can use the BREQ signal to arbitrate among multiple processors. This pin is driven regardless of the state of bus hold or address hold. BREQ is active HIGH and is never floated. During a hold state, internal events may cause BREQ to be de-asserted prior to any bus cycles.

Bus Hold Request Input (HOLD)

HOLD allows another bus master complete control of the Intel486 processor bus. The Intel486 processor will respond to an active HOLD signal by asserting HLDA and placing most of its output and input/output pins in a high impedance state (floated) after completing its current bus cycle, burst cycle, or sequence of locked cycles. In addition, if the Intel486 processor receives a HOLD request while performing a code fetch, and that cycle is backed off (BOFF#), the Intel486 processor will recognize HOLD before restarting the cycle. The code fetch can be non-cacheable or cacheable and non-bursted or bursted.
The BREQ, HLDA, PCHK# and FERR# pins are not floated during bus hold. The Intel486 processor will maintain its bus in this state until the HOLD is de-asserted. Refer to section 10.2.9, "Bus Hold," for timing diagrams for bus hold cycles and HOLD request acknowledge during BOFF#.

Unlike the Intel386 processor, the Intel486 processor will recognize HOLD during reset. Pull-up resistors are not provided for the outputs that are floated in response to HOLD. HOLD is active HIGH and is not provided with an internal pull-down resistor. HOLD must satisfy setup and hold times t18 and t19 for proper chip operation.

Bus Hold Acknowledge Output (HLDA)

HLDA indicates that the Intel486 processor has given the bus to another local bus master. HLDA goes active in response to a hold request presented on the HOLD pin. HLDA is driven active in the same clock that the Intel486 processor floats its bus.

HLDA will be driven inactive when leaving bus hold and the Intel486 processor will resume driving the bus. The Intel486 processor will not cease internal activity during bus hold because the internal cache will satisfy the majority of bus requests. HLDA is active HIGH and remains driven during bus hold.

Backoff Input (BOFF#)

Asserting the BOFF# input forces the Intel486 processor to release control of its bus in the next clock. The pins floated are exactly the same as in response to HOLD. The response to BOFF# differs from the response to HOLD in two ways: First, the bus is floated immediately in response to BOFF# while the Intel486 processor completes the current bus cycle before floating its bus in response to HOLD. Second, the Intel486 processor does not assert HLDA in response to BOFF#.

The Intel486 processor remains in bus hold until BOFF# is negated. Upon negation, the Intel486 processor restarts the bus cycle aborted when BOFF# was asserted. To the internal execution engine the effect of BOFF# is the same as inserting a few wait states to the original cycle.
Refer to section 10.2.12, "Bus Cycle Restart," for a description of bus cycle restart. Any data returned to the Intel486 processor while BOFF# is asserted is ignored.

BOFF# has higher priority than RDY# or BRDY#. If both BOFF# and ready are returned in the same clock, BOFF# takes effect. If BOFF# is asserted while the bus is idle, the Intel486 processor will float its bus in the next clock. BOFF# is active LOW and must meet setup and hold times t18 and t19 for proper chip operation.

9.2.11 CACHE INVALIDATION

The AHOLD and EADS# inputs are used during cache invalidation cycles. AHOLD conditions the Intel486 processor address lines, A4-A31, to accept an address input. EADS# indicates that an external address is actually valid on the address inputs. Activating EADS# will cause the Intel486 processor to read the external address bus and perform an internal cache invalidation cycle to the address indicated. Refer to section 10.2.8, "Invalidate Cycle," for cache invalidation cycle timing.

Address Hold Request Input (AHOLD)

AHOLD is the address hold request. It allows another bus master access to the Intel486 processor address bus for performing an internal cache invalidation cycle. Asserting AHOLD will force the Intel486 processor to stop driving its address bus in the next clock. While AHOLD is active, only the address bus is floated; the remainder of the bus can remain active. For example, data can be returned for a previously specified bus cycle when AHOLD is active. The Intel486 processor will not initiate another bus cycle during address hold. Because the Intel486 processor floats its bus immediately in response to AHOLD, an address hold acknowledge is not required. If AHOLD is asserted while a bus cycle is in progress, and no readies are returned during the time AHOLD is asserted, the Intel486 processor will redrive the same address (that it originally sent out) once AHOLD is negated.

AHOLD is recognized during reset. Because the entire cache is invalidated by reset, any invalidation cycles run during reset will be unnecessary. AHOLD is active HIGH and is provided with a small internal pull-down resistor. It must satisfy the setup and hold times t18 and t19 for proper chip operation. AHOLD also determines whether or not the built in self test features of the Intel486 processor will be exercised on assertion of RESET. (See section 11.1, "Built-In Self Test.")

External Address Valid Input (EADS#)

EADS# indicates that a valid external address has been driven onto the Intel486 processor address pins. This address will be used to perform an internal cache invalidation cycle. The external address will be checked with the current cache contents. If the address specified matches any areas in the cache, that area will immediately be invalidated.

An invalidation cycle may be run by asserting EADS# regardless of the state of AHOLD, HOLD and BOFF#. EADS# is active LOW and is provided with an internal pull-up resistor. EADS# must satisfy the setup and hold times t12 and t13 for proper chip operation.

9.2.12 CACHE CONTROL

Cache Enable Input (KEN#)

KEN# is the cache enable pin. KEN# is used to determine whether the data being returned by the current cycle is cacheable. When KEN# is active and the Intel486 processor generates a cycle that can be cached (most any memory read cycle), the cycle will be transformed into a cache line fill cycle.

A cache line is 16 bytes long. During the first cycle of a cache line fill the byte-enable pins should be ignored and data should be returned as if all four byte enables were asserted. The Intel486 processor will run between 4 and 16 contiguous bus cycles to fill the line depending on the bus data width selected by BS8# and BS16#. Refer to section 10.2.3, "Cacheable Cycles," for a description of cache line fill cycles.

The KEN# input is active LOW and is provided with a small internal pull-up resistor. It must satisfy the setup and hold times t14 and t15 for proper chip operation.

Cache Flush Input (FLUSH#)

The FLUSH# input forces the Intel486 processor to flush its entire internal cache. FLUSH# is active LOW and need only be asserted for one clock. FLUSH# is asynchronous but setup and hold times t20 and t21 must be met for recognition on any specific clock. FLUSH# also determines whether or not the tri-state test mode of the Intel486 processor will be invoked on assertion of RESET. (See section 11.4, "Tri-State Output Test Mode.")

9.2.13 PAGE CACHEABILITY (PWT, PCD)

The PWT and PCD output signals correspond to two user attribute bits in the page table entry. When paging is enabled, PWT and PCD correspond to bits 3 and 4 of the page table entry respectively. For cycles that are not paged when paging is enabled (for example I/O cycles) PWT and PCD correspond to bits 3 and 4 in control register 3. When paging is disabled, the Intel486 processor ignores the PCD and PWT bits and assumes they are zero for the purpose of caching and driving PCD and PWT.

PCD is masked by the CD (cache disable) bit in control register 0 (CR0). When CD = 1 (cache line fills disabled) the Intel486 processor forces PCD HIGH. When CD = 0, PCD is driven with the value of the page table entry/directory.

The purpose of PCD is to provide a cacheable/non-cacheable indication on a page by page basis. The Intel486 processor will not perform a cache fill to any page in which bit 4 of the page table entry is set. PWT corresponds to the write-back bit and can be used by an external cache to provide this functionality. PCD and PWT bits are assigned to be zero during real mode or whenever paging is disabled. Refer to section 7.6, "Page Cacheability," for a discussion of non-cacheable pages.

PCD and PWT have the same timing as the cycle definition pins (M/IO#, D/C#, W/R#). PCD and PWT are active HIGH and are not driven during bus hold.
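As an illustration of the PWT and PCD rules in section 9.2.13, the following Python sketch derives the two pin levels from the paging state, the CR0 CD bit, and the relevant entry (the page table entry, or control register 3 for cycles that are not paged). The function and parameter names are hypothetical. Applying the CD override even when paging is disabled is an assumption read from the text, not a statement of exact silicon behavior.

```python
PWT_BIT = 1 << 3  # bit 3 of the page table entry (or of CR3)
PCD_BIT = 1 << 4  # bit 4 of the page table entry (or of CR3)


def page_cacheability_pins(paging_enabled, cache_disable, entry):
    """Return (pwt, pcd) as booleans, where True means the pin is HIGH.

    entry is the page table entry for paged cycles, or the CR3 value
    for cycles that are not paged (hypothetical calling convention).
    """
    if paging_enabled:
        pwt = bool(entry & PWT_BIT)
        pcd = bool(entry & PCD_BIT)
    else:
        # With paging disabled both bits are assumed to be zero.
        pwt = False
        pcd = False
    if cache_disable:
        # CD = 1 in CR0 forces PCD HIGH (assumed to apply in all cases).
        pcd = True
    return pwt, pcd
```

For example, an entry with bits 3 and 4 both set yields both pins HIGH when paging is enabled and CD = 0, while the same entry yields both pins LOW when paging is disabled.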
NOTE: The PWT and PCD bits function differently in the write-back mode of the Write-Back Enhanced IntelDX2 processor. (See section 7.6.1.)

9.2.14 UPGRADE PRESENT (UP#)

The Upgrade Present input detects the presence of the upgrade processor, then powers down the core, and tri-states all outputs of the original processor, so that the original processor consumes very low current. This state is known as Upgrade Power Down Mode. UP# is active LOW and sampled at all times, including after power-up and during reset.

9.2.15 NUMERIC ERROR REPORTING (FERR#, IGNNE#)

To allow PC-type floating point error reporting, Intel486 DX, IntelDX2, and IntelDX4 processors provide two pins, FERR# and IGNNE#.

Floating Point Error Output (FERR#)

The processor asserts FERR# whenever an unmasked floating point error is encountered. FERR# is similar to the ERROR# pin on the Intel387 math coprocessor. FERR# can be used by external logic for PC-type floating point error reporting in systems with an Intel486 DX, IntelDX2, or IntelDX4 processor. FERR# is active LOW and is not floated during bus hold.

In some cases, FERR# is asserted when the next floating point instruction is encountered. In other cases, it is asserted before the next floating point instruction is encountered, depending on the execution state of the instruction that caused the exception.

The following class of floating point exceptions drive FERR# at the time the exception occurs (i.e., before encountering the next floating point instruction):

1. The stack fault, invalid operation, and denormal exceptions on all transcendental instructions, integer arithmetic instructions, FSQRT, FSCALE, FPREM(1), FXTRACT, FBLD, and FBSTP.
2. Any exceptions on store instructions (including integer store instructions).

The following class of floating point exceptions drive FERR# only after encountering the next floating point instruction:

1. Exceptions other than on all transcendental instructions, integer arithmetic instructions, FSQRT, FSCALE, FPREM(1), FXTRACT, FBLD, and FBSTP.
2. Any exception on all basic arithmetic, load, compare, and control instructions (i.e., all other instructions).

Ignore Numeric Error Input (IGNNE#)

Intel486 DX, IntelDX2, and IntelDX4 processors will ignore a numeric error and continue executing non-control floating point instructions when IGNNE# is asserted, and FERR# is still activated. When de-asserted, the processor will freeze on a non-control floating point instruction if a previous instruction caused an error. IGNNE# has no effect when the NE bit in control register 0 is set.

The IGNNE# input is active LOW and provided with a small internal pull-up resistor. This input is asynchronous, but must meet setup and hold times t20 and t21 to ensure recognition on any specific clock.

9.2.16 BUS SIZE CONTROL (BS16#, BS8#)

The BS16# and BS8# inputs allow external 16- and 8-bit buses to be supported with a small number of external components. The Intel486 processor samples these pins every clock. The value sampled in the clock before ready determines the bus size. When asserting BS16# or BS8# only 16 or 8 bits of the data bus need be valid. If both BS16# and BS8# are asserted, an 8-bit bus width is selected.

When BS16# or BS8# is asserted, the Intel486 processor will convert a larger data request to the appropriate number of smaller transfers. The byte enables will also be modified appropriately for the bus size selected.

BS16# and BS8# are active LOW and are provided with small internal pull-up resistors. BS16# and BS8# must satisfy the setup and hold times t14 and t15 for proper chip operation.

9.2.17 ADDRESS BIT 20 MASK (A20M#)

Asserting the A20M# input causes the Intel486 processor to mask physical address bit 20 before performing a lookup in the internal cache and before driving a memory cycle to the outside world.
When A20M# is asserted, the Intel486 processor emulates the 1-Mbyte address wraparound that occurs on the 8086. A20M# is active LOW and must be asserted only when the processor is in real mode. A20M# is not defined in Protected Mode. A20M# is asynchronous but should meet setup and hold times t20 and t21 for recognition in any specific clock. For correct operation of the chip, A20M# should not be active at the falling edge of RESET.

A20M# exhibits a minimum 4 clock latency, from time of assertion to masking of the A20 bit. A20M# is ignored during cache invalidation cycles. I/O writes require A20M# to be asserted a minimum of 2 clocks prior to RDY# being returned for the I/O write. This ensures recognition of the address mask before the Intel486 processor begins execution of the instruction following OUT. If A20M# is asserted after the ADS# of a data cycle, the A20 address signal is not masked during this cycle but is masked in the next cycle. During a prefetch (cacheable or not), if A20M# is asserted after the first ADS#, A20 is not masked for the duration of the prefetch, even if BS16# or BS8# is asserted.

9.2.18 WRITE-BACK ENHANCED INTELDX2 PROCESSOR SIGNALS AND OTHER ENHANCED BUS FEATURES

This section describes the pins that interface with the system to support the Enhanced Bus mode write-back features at system level.

9.2.18.1 Cacheability (CACHE#)

The CACHE# output indicates the internal cacheability on read cycles and a burst write-back on write cycles. CACHE# is asserted for cacheable reads, cacheable code fetches and write-backs. It is driven inactive for non-cacheable reads, special cycles, I/O cycles and write-through cycles. This is different from the PCD (page cache disable) pin. The operational differences between CACHE# and PCD are listed in Table 9-3. See Table 9-4 for operational differences between CACHE# and other Intel486 processor signals.

Table 9-3. Differences between CACHE# and PCD

Bus Operation                CACHE#             PCD
All reads (1)                same as PCD (3)    same as PCD (3)
Replacement write-back       low                low
Snoop-forced write-back      low                low
S-state write-through        high               same as PCD (3)
I-state write-through (2)    high               same as PCD (3)

NOTES:
1. Includes line fills and non-cacheable reads. During locked read cycles CACHE# is inactive. The non-cacheable reads may or may not be burst.
2. Due to the non-allocate on write policy, this includes both cacheable and non-cacheable writes. PCD will distinguish between the two, but CACHE# does not.
3. This behavior is the same as the existing specification of the Intel486 processor in write-through mode.

Table 9-4. Write-Back Enhanced IntelDX2™ Processor CACHE# vs. Other Intel486™ Processor Signals

Pin Symbol     Relation To This Signal
ADS#           CACHE# is driven to valid state with ADS#.
RDY#, BRDY#    CACHE# is de-asserted with the first RDY# or BRDY#.
HLDA, BOFF#    CACHE# floats under these signals.
KEN#           The combination of CACHE# and KEN# determines if a read miss is converted into a cache line fill.

9.2.18.2 Cache Flush (FLUSH#)

FLUSH# is an existing pin that operates differently if the processor is configured as Enhanced Bus mode (write-back). In Enhanced Bus mode, it acts similarly to the WBINVD instruction.

In Enhanced Bus mode, FLUSH# is treated as an interrupt. It is sampled at each clock, but is recognized only on an instruction boundary. Pending writes are completed before FLUSH# is serviced, and all prefetching is stopped. Depending on the number of modified lines in the cache, the flush could take from a minimum of 1280 bus clocks, or 2560 processor clocks, to a maximum of 5000+ bus clocks to scan the cache, perform the write-backs, invalidate the cache and run two special cycles. After all modified lines are written back to memory, two special bus cycles, "First Flush ACK Cycle" and "Second Flush ACK Cycle," are issued, in that order.
These cycles differ from the special cycles issued after WBINVD only in that A2 = 1 (address line 2 = 1). SRESET, STPCLK#, INTR, NMI and SMI# are not recognized during a flush write-back, while BOFF#, AHOLD and HOLD are recognized.

FLUSH# may be asserted just for a single clock, or may be retained asserted, but should be de-asserted at or prior to the RDY# returned from the "First Flush ACK" special bus cycle. If asserted during INVD or WBINVD, FLUSH# will be recognized. If asserted simultaneously with SMI#, then SMI# is recognized after FLUSH# is serviced.

FLUSH# may be driven at any time. If driven during SRESET, it must be held for one clock after SRESET is de-asserted to be recognized.

9.2.18.3 Hit/Miss to a Modified Line (HITM#)

HITM# is a cache coherency protocol pin that is driven only in Enhanced Bus mode. When a snoop cycle is run by the system (with INV = "0" or INV = "1"), HITM# indicates if the processor contains the snooped line in the M-state. Assertion of HITM# indicates that the line will be written back in total, unless the processor is already in the process of doing a replacement write-back of the same line.

HITM# will be valid on the bus two system clocks after EADS# is asserted on the bus. If asserted, HITM# remains asserted until the last RDY# or BRDY# of the snoop write-back cycle is returned. It will be de-asserted before the next following ADS#. (See Table 9-5.)

Table 9-5. HITM# vs. Other Intel486™ Signals

Pin Symbol      Relation To This Signal
EADS#           HITM# is asserted due to an EADS#-driven snoop, provided the snooped line is in the M-state in the cache.
HLDA, BOFF#     HITM# does not float under these signals.
ADS#, CACHE#    The beginning of a snoop write-back cycle is identified by the assertion of ADS#, CACHE#, and HITM#.
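A minimal Python sketch of the HITM# behavior described in section 9.2.18.3. The single-letter state names and both helper names are illustrative assumptions; only the M-state rule and the two-clock delay after EADS# are taken from the text.

```python
def hitm_level(line_state):
    """Level of the active-LOW HITM# pin for a snooped line (0 = asserted)."""
    # Only a line held in the modified (M) state causes HITM# to be asserted.
    return 0 if line_state == "M" else 1


def hitm_valid_clock(eads_clock):
    """Clock in which HITM# is valid: two system clocks after EADS#."""
    return eads_clock + 2
```

System logic built along these lines would sample HITM# in the clock returned by hitm_valid_clock() and expect a snoop write-back only when hitm_level() reports 0.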
9.2.18.4 Soft Reset (SRESET)

When in Enhanced Bus mode, SRESET has the following differences: SRESET, unlike RESET, does not cause AHOLD, A20M#, FLUSH#, UP#, and WB/WT# pins to be sampled (i.e., special test modes and on-chip cache configuration cannot be accessed with SRESET). On SRESET, the internal SMRAM base register retains its previous value and the processor does not flush, write-back or disable the internal cache. CR0.CD and CR0.NW retain previous values, CR0.4 is set to "1", and the remaining bits are cleared.

Because SRESET is treated as an interrupt, it is possible to have a bus cycle while SRESET is asserted. A bus cycle could be due to an on-going instruction, emptying the write buffers of the processor, or snoop write-back cycles if there is a snoop hit to an M-state line while SRESET is asserted.

NOTE: For both Standard Bus mode and Enhanced Bus mode:
• SMI# must be blocked during SRESET. It must also be blocked for a minimum of 2 clocks after SRESET is de-asserted.
• SRESET must be blocked during SMI#. It must also be blocked for a minimum of 20 clocks after SMIACT# is de-asserted.

9.2.18.5 Invalidation Request (INV)

INV is a cache coherency protocol pin that is used only in Enhanced Bus mode. It is sampled by the processor on EADS#-driven snoop cycles. It is necessary to assert this pin to simulate the standard mode processor invalidate cycle on write-through-only lines. INV also invalidates the write-back lines. However, if the snooped line is in the M-state, the line will be written back and then invalidated.

INV is sampled when EADS# is asserted. If INV is not asserted with EADS#, the snoop cycle will have no effect on a write-through-only line or a line allocated as write-back, but not yet modified. If the line is write-back and modified, it will be written back to memory, but will not be de-allocated (invalidated) from the internal cache. The address of the snooped cache line is provided on the address bus. (See Table 9-6.)

Table 9-6. INV vs. Other Intel486™ Signals

Pin Symbol    Relation To This Signal
EADS#         EADS# determines when INV is sampled.
A31-A4        The address of the snooped cache line is provided on these pins.

9.2.18.6 Write-Back/Write-Through (WB/WT#)

WB/WT# enables Enhanced Bus mode (write-back cache). It also allows the system to define a cached line as write-through or write-back.

WB/WT# is sampled at the falling edge of RESET to determine if Enhanced Bus mode is enabled (WB/WT# must be driven for two clocks before and two clocks after RESET for recognition by the processor). If sampled low or floated, the Write-Back Enhanced IntelDX2 processor operates in the Intel486 processor standard mode. For write-through only operation, i.e. standard mode, WB/WT# does not need to be connected.

In Enhanced Bus mode, WB/WT# allows the system hardware to force any allocated line to be treated as write-through or write-back. As with cacheability, both the processor and the external system must agree that a line may be treated as write-back for the internal cache to be allocated as write-back. The default is always write-through. The processor's indication of write-back vs. write-through is from the PWT pin, whose function and timing are the same as in the standard mode Intel486 processor.

To define write-back or write-through configuration of a line, WB/WT# is sampled in the same clock as the first RDY# or BRDY# is returned during a line fill (allocation) cycle. (See Table 9-7.)

Table 9-7. WB/WT# vs. Other Intel486™ Processor Signals

Pin Symbol           Relation To This Signal
RDY#, BRDY#          WB/WT# is sampled with the first RDY# or BRDY#.
PWT                  The combination of WB/WT# and PWT determines if the Write-Back Enhanced IntelDX2™ processor will treat the line as WB.
PCD, CACHE#, KEN#    The state of WB/WT# does not matter if PCD, CACHE# or KEN# define the line to be non-cacheable.
W/R#                 WB/WT# is significant only on read fill cycles.
RESET                WB/WT# is sampled on the falling edge of RESET to define the cache configuration.

9.2.18.7 Pseudo-Lock Output (PLOCK#)

In the Enhanced Bus mode, PLOCK# is always driven inactive. In this mode, a 64-bit data read (caused by an FP operand access or a segment descriptor read) is treated as a multiple cycle read request, which may be a burst or a non-burst access based on whether BRDY# or RDY# is returned by the system. Because only write-back cycles (caused by snoop write-back or replacement write-back) are burstable, a 64-bit write will be driven out as two non-burst bus cycles. BLAST# is asserted during both writes. Refer to section 10.2, "Bus Functional Description," for details on Pseudo-Locked bus cycles.

9.2.19 INTELDX4 PROCESSOR VOLTAGE DETECT SENSE OUTPUT (VOLDET)

A voltage detect sense pin (VOLDET) has been added to the IntelDX4 processor PGA package. This pin allows external system logic to distinguish between a 5V Intel486 DX or IntelDX2 processor and the 3.3V IntelDX4 processor. The pin passively indicates to external logic whether the installed PGA processor requires 5V (in the case of the Intel486 DX or IntelDX2 processor) or 3.3V (in the case of the IntelDX4 processor). Pin S4 has been defined as the VOLDET pin because this pin is defined as an INC pin on the Intel486 DX and IntelDX2 processor. This pin is only provided in the PGA package.

To utilize this feature, a weak, external pull-up resistor should be connected to the VOLDET pin. This pin samples high (logic 1) if the installed processor is a 5V Intel486 DX or IntelDX2 processor. This pin samples low (logic 0) if an IntelDX4 processor is installed.
Upon sampling the logic level of this pin, external logic can then enable the proper Vcc level to the processor. In power sensitive applications, an active element is preferred for the pull-up device because it could be disabled after sampling, thereby eliminating the resulting DC current path when the installed processor is the IntelDX4 processor.

Figure 9-4 shows a logical representation of the Voltage Detect sense mechanism. This pin can remain not connected for those system designs that do not wish to utilize this voltage detect feature.

[Figure 9-4. Voltage Detect (VOLDET) Sense Pin: the VOLDET sense pin is pulled up to +5V through a resistor (or weak active pullup) and sampled by external logic that drives the processor Vcc enable.]

9.2.20 BOUNDARY SCAN TEST SIGNALS

The following boundary scan test signals are available on all Intel486 processors except the Intel486 SX processor in PGA packages.

Test Clock (TCK)

TCK is an input to the Intel486 processor and provides the clocking function required by the JTAG boundary scan feature. TCK is used to clock state information and data into and out of the component. State select information and data are clocked into the component on the rising edge of TCK on TMS and TDI, respectively. Data is clocked out of the part on the falling edge of TCK on TDO.

In addition to using TCK as a free running clock, it may be stopped in a low, 0, state, indefinitely as described in IEEE 1149.1. While TCK is stopped in the low state, the boundary scan latches retain their state.

When boundary scan is not used, TCK should be tied high or left as a NC. (This is important during power up to avoid the possibility of glitches on TCK which could prematurely initiate boundary scan operations.) TCK is supplied with an internal pull-up resistor.

TCK is a clock signal and is used as a reference for sampling other JTAG signals. On the rising edge of TCK, TMS and TDI are sampled. On the falling edge of TCK, TDO is driven.
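The TCK edge rules above can be sketched in a few lines of Python. This is only an illustration: the function name and the representation of the clock as a list of sampled 0/1 levels are assumptions made for the example, not part of the JTAG interface.

```python
def tck_edge_events(tck_levels):
    """List the JTAG action tied to each TCK edge in a sampled waveform.

    tck_levels is a sequence of 0/1 samples of the TCK pin
    (a hypothetical representation chosen for this sketch).
    """
    events = []
    for previous, current in zip(tck_levels, tck_levels[1:]):
        if previous == 0 and current == 1:
            # Rising edge: TMS and TDI are sampled.
            events.append("sample TMS/TDI")
        elif previous == 1 and current == 0:
            # Falling edge: TDO is driven.
            events.append("drive TDO")
        # No edge: nothing happens, so a TCK stopped low retains state.
    return events
```

A waveform that stays low, such as [0, 0, 0], produces no events, which matches the statement that TCK may be stopped in the low state indefinitely.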
Test Mode Select (TMS)

TMS is decoded by the JTAG TAP (Test Access Port) to select the operation of the test logic, as described in section 11.5.4, "Test Access Port Controller."

To guarantee deterministic behavior of the TAP controller, TMS is provided with an internal pull-up resistor. If boundary scan is not used, TMS may be tied high or left unconnected. TMS is sampled on the rising edge of TCK. TMS is used to select the internal TAP states required to load boundary scan instructions to data on TDI. For proper initialization of the JTAG logic, TMS should be driven high, "1," for at least four TCK cycles following the rising edge of RESET.

Test Data Input (TDI)

TDI is the serial input used to shift JTAG instructions and data into the component. The shifting of instructions and data occurs during the SHIFT-IR and SHIFT-DR TAP controller states, respectively. These states are selected using the TMS signal as described in section 11.5.4, "Test Access Port Controller."

An internal pull-up resistor is provided on TDI to ensure a known logic state if an open circuit occurs on the TDI path. Note that when "1" is continuously shifted into the instruction register, the BYPASS instruction is selected. TDI is sampled on the rising edge of TCK, during the SHIFT-IR and the SHIFT-DR states. During all other TAP controller states, TDI is a "don't care." TDI is only sampled when TMS and TCK have been used to select the SHIFT-IR or SHIFT-DR states in the TAP controller. For proper initialization of JTAG logic, TDI should be driven high, "1," for at least four TCK cycles following the rising edge of RESET.

Test Data Output (TDO)

TDO is the serial output used to shift JTAG instructions and data out of the component. The shifting of instructions and data occurs during the SHIFT-IR and SHIFT-DR TAP controller states, respectively. These states are selected using the TMS signal as described in section 11.5.4, "Test Access Port Controller."
When not in SHIFT-IR or SHIFT-DR state, TDO is driven to a high impedance state to allow connecting TDO of different devices in parallel. TDO is driven on the falling edge of TCK during the SHIFT-IR and SHIFT-DR TAP controller states. At all other times TDO is driven to the high impedance state. TDO is only driven when TMS and TCK have been used to select the SHIFT-IR or SHIFT-DR states in the TAP controller.

9.3 Interrupt and Non-Maskable Interrupt Interface

The Intel486 processor provides four asynchronous interrupt inputs: INTR (interrupt request), NMI (non-maskable interrupt input), SMI# (system management interrupt) and STPCLK# (stop clock interrupt). This section describes the hardware interface between the instruction execution unit and the pins. For a description of the algorithmic response to interrupts refer to section 4.7.6, "Interrupts." For interrupt timings refer to section 10.2.10, "Interrupt Acknowledge."

9.3.1 INTERRUPT LOGIC

The Intel486 processor contains a two-clock synchronizer on the interrupt line. An interrupt request will reach the internal instruction execution unit two clocks after the INTR pin is asserted, if proper setup is provided to the first stage of the synchronizer.

There is no special logic in the interrupt path other than the synchronizer. The INTR signal is level sensitive and must remain active for the instruction execution unit to recognize it. The interrupt will not be serviced by the Intel486 processor if the INTR signal does not remain active.

The instruction execution unit will look at the state of the synchronized interrupt signal at specific clocks during the execution of instructions (if interrupts are enabled). These specific clocks are at instruction boundaries, or iteration boundaries in the case of string move instructions. Interrupts will only be accepted at these boundaries.
An interrupt must be presented to the Intel486 processor INTR pin three clocks before the end of an instruction for the interrupt to be acknowledged. Presenting the interrupt 3 clocks before the end of an instruction allows the interrupt to pass through the two-clock synchronizer, leaving one clock to prevent the initiation of the next sequential instruction and to begin interrupt service. If the interrupt is not received in time to prevent the next instruction, it will be accepted at the end of the next instruction, assuming INTR is still held active.

The longest latency between when an interrupt request is presented on the INTR pin and when the interrupt service begins is: longest instruction used + the two clocks for synchronization + one clock required to vector into the interrupt service microcode.

9.3.2 NMI LOGIC

The NMI pin has a synchronizer like that used on the INTR line. Other than the synchronizer, the NMI logic is different from that of the maskable interrupt.

NMI is edge triggered as opposed to the level triggered INTR signal. The rising edge of the NMI signal is used to generate the interrupt request. The NMI input need not remain active until the interrupt is actually serviced. The NMI pin only needs to remain active for a single clock if the required setup and hold times are met. NMI will operate properly if it is held active for an arbitrary number of clocks.

The NMI input must be held inactive for at least four clocks after it is asserted to reset the edge triggered logic. A subsequent NMI may not be generated if the NMI is not held inactive for at least four clocks after being asserted.

The NMI input is internally masked whenever the NMI routine is entered. The NMI input will remain masked until an IRET (return from interrupt) instruction is executed. Masking the NMI signal prevents recursive NMI calls.
If another NMI occurs while the NMI is masked off, the pending NMI will be executed after the current NMI is done. Only one NMI can be pending while NMI is masked.

9.3.3 SMI# LOGIC

SMI# is edge triggered like NMI, but the interrupt request is generated on the falling edge. SMI# is an asynchronous signal, but setup and hold times, t20 and t21, must be met in order to guarantee recognition on a specific clock. The SMI# input need not remain active until the interrupt is actually serviced. The SMI# input only needs to remain active for a single clock if the required setup and hold times are met. SMI# will also work correctly if it is held active for an arbitrary number of clocks.

The SMI# input must be held inactive for at least four clocks after it is asserted to reset the edge triggered logic. A subsequent SMI# might not be recognized if the SMI# input is not held inactive for at least four clocks after being asserted.

SMI#, like NMI, is not affected by the IF bit in the EFLAGS register and is recognized on an instruction boundary. An SMI# will not break locked bus cycles. SMI# has a higher priority than NMI and is not masked during an NMI.

After the SMI# interrupt is recognized, the SMI# signal will be masked internally until the RSM instruction is executed and the interrupt service routine is complete. Masking the SMI# prevents recursive SMI# calls. The SMI# input must be de-asserted for at least 4 clocks to reset the edge triggered logic. If another SMI# occurs while the SMI# is masked, the pending SMI# will be recognized and executed on the next instruction boundary after the current SMI# completes. This instruction boundary occurs before execution of the next instruction in the interrupted application code, resulting in back to back SMM handlers. Only one SMI# can be pending while SMI# is masked.
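The masking and single-pending rules for SMI# can be illustrated with a small Python model. This is a simplified sketch under stated assumptions: the class and method names are hypothetical, the model works one clock sample at a time, and it ignores instruction boundaries, setup and hold times, and the four-clock inactive requirement.

```python
class SmiLatch:
    """Toy model of falling-edge SMI# recognition with one pending slot."""

    def __init__(self):
        self.prev_level = 1      # SMI# is active LOW, so it idles HIGH
        self.in_handler = False  # True between recognition and RSM
        self.pending = False     # at most one SMI# is remembered
        self.serviced = 0        # number of SMM handler entries

    def clock(self, level):
        """Feed one sampled SMI# level (0 or 1) per clock."""
        falling_edge = self.prev_level == 1 and level == 0
        self.prev_level = level
        if not falling_edge:
            return
        if self.in_handler:
            # SMI# is masked: remember a single pending request.
            self.pending = True
        else:
            self._enter_handler()

    def rsm(self):
        """Model the RSM instruction ending the current handler."""
        self.in_handler = False
        if self.pending:
            # The pending SMI# runs back to back with the finished one.
            self.pending = False
            self._enter_handler()

    def _enter_handler(self):
        self.in_handler = True
        self.serviced += 1
```

In this model, two falling edges that arrive while a handler is running collapse into one pending request, so a single further handler entry follows the RSM.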
The SMI# signal is synchronized internally and should be asserted at least three (3) CLK periods prior to asserting the RDY# signal in order to guarantee recognition on a specific instruction boundary. This is important for servicing an I/O trap with an SMI# handler.

9.3.4 STPCLK# LOGIC

STPCLK# is level triggered and active LOW. STPCLK# is an asynchronous signal, but must remain active until the processor issues the Stop Grant bus cycle. STPCLK# may be de-asserted at any time after the processor has issued the Stop Grant bus cycle. When the processor enters the Stop Grant state, the internal pull-up resistors of STPCLK#, CLKMUL (for the IntelDX4 processor), and UP# are disabled so that the processor power consumption is reduced. The STPCLK# input must be driven high (not floated) in order to exit the Stop Grant state. STPCLK# must be de-asserted for a minimum of 5 clocks after RDY# or BRDY# is returned active for the Stop Grant bus cycle before being asserted again.

When the processor recognizes a STPCLK# interrupt, the processor will stop execution on the next instruction boundary (unless superseded by a higher priority interrupt), stop the pre-fetch unit, empty all internal pipelines and the write buffers, generate a Stop Grant bus cycle, and then stop the internal clock. At this point the processor is in the Stop Grant state.

The processor cannot respond to a STPCLK# request from an HLDA state because it cannot empty the write buffers and, therefore, cannot generate a Stop Grant cycle.

The rising edge of STPCLK# will tell the processor that it can return to program execution at the instruction following the interrupted instruction.

Unlike the normal interrupts, INTR and NMI, the STPCLK# interrupt does not initiate acknowledge cycles or interrupt table reads. Among external interrupts, the STPCLK# order of priority is shown in section 4.7.6.
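The entry to and exit from the Stop Grant state described in section 9.3.4 can be summarized as a two-state sketch in Python. The state names, the function name, and the idea of evaluating the pin once per instruction boundary are assumptions made for illustration; the steps inside the transition (stopping the pre-fetch unit, emptying the write buffers, running the Stop Grant bus cycle) are not modeled.

```python
def stpclk_step(state, stpclk_n, hlda=False):
    """Return the next state given the STPCLK# level (0 = asserted).

    state is "RUN" or "STOP_GRANT" (hypothetical labels). hlda is True
    while the processor is in an HLDA state.
    """
    if state == "RUN" and stpclk_n == 0 and not hlda:
        # Level-triggered request accepted; the Stop Grant cycle follows.
        return "STOP_GRANT"
    if state == "STOP_GRANT" and stpclk_n == 1:
        # STPCLK# driven high: execution resumes.
        return "RUN"
    # Otherwise nothing changes, including a request made during HLDA.
    return state
```

Calling stpclk_step("RUN", 0, hlda=True) leaves the state at "RUN", reflecting that a request made from an HLDA state cannot be answered.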
9.4 Write Buffers

The Intel486 processor contains four write buffers to enhance the performance of consecutive writes to memory. The buffers can be filled at a rate of one write per clock until all four buffers are filled.

When all four buffers are empty and the bus is idle, a write request will propagate directly to the external bus, bypassing the write buffers. If the bus is not available at the time the write is generated internally, the write will be placed in the write buffers and propagate to the bus as soon as the bus becomes available. The write is stored in the on-chip cache immediately if the write is a cache hit.

Writes will be driven onto the external bus in the same order in which they are received by the write buffers. Under certain conditions a memory read will go onto the external bus before the memory writes pending in the buffer even though the writes occurred earlier in the program execution.

A memory read will only be reordered in front of all writes in the buffers under the following conditions: all writes pending in the buffers are cache hits and the read is a cache miss. Under these conditions the Intel486 processor will not read from an external memory location that needs to be updated by one of the pending writes.

Reordering of a read with the writes pending in the buffers can only occur once before all the buffers are emptied. Allowing only one read to be reordered maintains cache consistency. Consider the following example: The processor writes to location X. Location X is in the internal cache, so it is updated there immediately. However, the bus is busy so the write out to main memory is buffered (see Figure 9-5). At this point, any reads to location X would be cache hits and the most up-to-date data would be read.

[Figure 9-5. Reordering of Reads with Write Buffers]

The next instruction causes a read to location Y. Location Y is not in the cache (a cache miss).
Because the write in the write buffer is a cache hit, the read is reordered. When location Y is read, it is put into the cache. The possibility exists that location Y will replace location X in the cache. If this is true, location X would no longer be cached (see Figure 9-6).

Figure 9-6. Reordering of Reads with Write Buffers

Cache consistency has been maintained up to this point. If a subsequent read is to location X (now a cache miss) and it was reordered in front of the buffered write to location X, stale data would be read. This is why only one read is allowed to be reordered. Once a read is reordered, all the writes in the write buffer are flagged as cache misses to ensure that no more reads are reordered. Because one of the conditions to reorder a read is that all writes in the write buffer must be cache hits, no more reordering is allowed until all of those flagged writes propagate to the bus. Similarly, if an invalidation cycle is run, all entries in the write buffer are flagged as cache misses.

For multiple processor systems and/or systems using DMA techniques, such as bus snooping, locked semaphores should be used to maintain cache consistency.

9.4.1 WRITE BUFFERS AND I/O CYCLES

Input/Output (I/O) cycles must be handled in a different manner by the write buffers.

I/O reads are never reordered in front of buffered memory writes. This ensures that the Intel486 processor will update all memory locations before reading status from an I/O device.

The Intel486 processor never buffers single I/O writes. When processing an OUT instruction, internal execution stops until the I/O write actually completes on the external bus. This allows time for the external system to drive an invalidate into the Intel486 processor or to mask interrupts before the processor progresses to the instruction following OUT. REP OUTS instructions will be buffered.

A read cycle must be explicitly generated to a noncacheable location in memory to guarantee that a read bus cycle is performed. This read will not be allowed to proceed to the bus until after the I/O write has completed because I/O writes are not buffered. The I/O device will have time to recover to accept another write during the read cycle.

9.4.2 WRITE BUFFERS IMPLICATIONS ON LOCKED BUS CYCLES

Locked bus cycles are used for read-modify-write accesses to memory. During a read-modify-write access a memory-based variable is read, modified and then written back to the same memory location. It is important that no other bus cycles, generated by other bus masters or by the Intel486 processor itself, be allowed on the external bus between the read and write portion of the locked sequence.

During a locked read cycle, the Intel486 processor will always access external memory; it will never look for the location in the on-chip cache. For write cycles, however, data is written in the internal cache (if a cache hit) and in the external memory. All data pending in the Intel486 processor's write buffers will be written to memory before a locked cycle is allowed to proceed to the external bus.

The Intel486 processor will assert the LOCK# pin after the write buffers are emptied during a locked bus cycle. With the LOCK# pin asserted, the processor will read the data, operate on the data and place the results in a write buffer. The contents of the write buffer will then be written to external memory. LOCK# will become inactive after the write part of the locked cycle.

9.5 Reset and Initialization

The Intel486 processor has a built-in self test (BIST) that can be run during reset. BIST is invoked if the AHOLD pin is asserted for 1 clock before and 1 clock after RESET is de-asserted. RESET must be active for 15 clocks with or without BIST being enabled. To ensure proper results, neither FLUSH# nor SRESET can be asserted while BIST is executing. Refer to section 11.0, "Processor Testability," for information on Intel486 processor testability.

The Intel486 processor registers have the values shown in Table 9-8 after RESET is performed. The EAX register contains information on the success or failure of the BIST if the self test is executed. The DX register always contains a component identifier at the conclusion of RESET. The upper byte of DX (DH) will contain 04 and the lower byte (DL) will contain the revision identifier. (See Table 9-9, Intel486 Processor Revision ID.)

RESET forces the Intel486 processor to terminate all execution and local bus activity. No instruction or bus activity will occur as long as RESET is active. All entries in the cache are invalidated by RESET.

Table 9-8. Register Values after Reset

Register   Initial Value (BIST)      Initial Value (No BIST)
EAX        Zero (Pass)               Undefined
ECX        Undefined                 Undefined
EDX        0400 + Revision ID        0400 + Revision ID
EBX        Undefined                 Undefined
ESP        Undefined                 Undefined
EBP        Undefined                 Undefined
ESI        Undefined                 Undefined
EDI        Undefined                 Undefined
EFLAGS     00000002h                 00000002h
EIP        0FFF0h                    0FFF0h
ES         0000h                     0000h
CS         F000h*                    F000h*
SS         0000h                     0000h
DS         0000h                     0000h
FS         0000h                     0000h
GS         0000h                     0000h
IDTR       Base = 0, Limit = 3FFh    Base = 0, Limit = 3FFh
CR0        60000010h                 60000010h
DR7        00000000h                 00000000h

9.5.1 FLOATING POINT REGISTER VALUES

In addition to the register values listed above, Intel486 DX, IntelDX2, and IntelDX4 processors have the floating point register values shown in Table 9-10.

The floating point registers are initialized as if the FINIT/FNINIT (initialize processor) instruction was executed if the BIST was performed. If the BIST is not executed, the floating point registers are unchanged.

The Intel486 processor will start executing instructions at location FFFFFFF0H after RESET. When the first Inter Segment Jump or Call is executed, address lines A20-A31 will drop LOW for CS-relative memory cycles, and the Intel486 processor will only execute instructions in the lower one Mbyte of physical memory. This allows the system designer to use a ROM at the top of physical memory to initialize the system and take care of RESETs.

Table 9-10. Floating Point Values after Reset

Register   Initial Value (BIST)   Initial Value (No BIST)
CW         037Fh                  Unchanged
SW         0000h                  Unchanged
TW         FFFFh                  Unchanged
FIP        00000000h              Unchanged
FEA        00000000h              Unchanged
FCS        0000h                  Unchanged
FDS        0000h                  Unchanged
FOP        000h                   Unchanged
FSTACK     Undefined              Unchanged

9.5.2 PIN STATE DURING RESET

The Intel486 processor recognizes and can respond to HOLD, AHOLD, and BOFF# requests regardless of the state of RESET. Thus, even though the processor is in reset, it can still float its bus in response to any of these requests.

While in reset, the Intel486 processor bus is in the state shown in Figure 9-7 if the HOLD, AHOLD and BOFF# requests are inactive. The figure shows the bus state for the Intel486 processor. Note that the address (A31-A2, BE3#-BE0#) and cycle definition (M/IO#, D/C#, W/R#) pins are undefined from the time reset is asserted up to the start of the first bus cycle. All undefined pins (except FERR#) assume known values at the beginning of the first bus cycle. The first bus cycle is always a code fetch to address FFFFFFF0H.
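As a small worked illustration of the reset values, the following Python sketch splits the DX value into its component and revision bytes and reproduces the first code fetch address from the reset values of CS and EIP. It is a model of the description in this section, not of the processor's internal address generation. The function names and the `upper_lines_high` parameter are hypothetical, and the revision-to-product mapping is an assumption read from Table 9-9.

```python
# High nibble of DL -> product, as read from Table 9-9 (assumed mapping).
REVISION_FAMILIES = {
    0x0: "Intel486 DX",
    0x1: "Intel486 DX",
    0x2: "Intel486 SX",
    0x3: "IntelDX2",
    0x5: "IntelSX2",
    0x7: "Write-Back Enhanced IntelDX2",
    0x8: "IntelDX4",
}


def decode_dx(dx):
    """Split the DX reset value into (component ID, revision ID, product)."""
    dh = (dx >> 8) & 0xFF   # component ID, 04 for Intel486 processors
    dl = dx & 0xFF          # revision ID
    return dh, dl, REVISION_FAMILIES.get(dl >> 4, "unknown")


def fetch_address(cs=0xF000, eip=0xFFF0, upper_lines_high=True):
    """Address of a CS-relative code fetch in the reset environment.

    upper_lines_high=True models the period before the first inter-segment
    jump or call, when A31-A20 are driven high; False models the period
    after it, when those lines drop low.
    """
    low_20_bits = ((cs << 4) + eip) & 0xFFFFF
    upper = 0xFFF00000 if upper_lines_high else 0x00000000
    return upper | low_20_bits


reset_vector = fetch_address()                        # 0xFFFFFFF0
after_far_jump = fetch_address(upper_lines_high=False)  # below 1 Mbyte
```

With the reset values CS = F000h and EIP = 0FFF0h, the low 20 address bits are FFFF0h, so the first fetch lands at FFFFFFF0H while the upper address lines are high.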
Table 9-9. Intel486 Processor Revision ID

Product                                   Component ID (DH)   Revision ID (DL)
Intel486 SX Processor                     04                  2x
IntelSX2 Processor                        04                  5x
Intel486 DX Processor                     04                  0x, 1x
IntelDX2 Processor                        04                  3x
Write-Back Enhanced IntelDX2 Processor    04                  7x
IntelDX4 Processor                        04                  8x

Figure 9-7 timing diagram (signals shown: CLK, RESET, AHOLD, FLUSH# (sync and async), A20M# (sync and async), ADS#, BREQ, A31-A2, M/IO#, D/C#, W/R#, BLAST#, PLOCK#, BE3#-BE0#, PWT, PCD, PCHK#, LOCK#, D31-D0, HLDA, SMIACT#, WB/WT#, CACHE#, HITM#)

Notes:
1. RESET is an asynchronous input. t20 must be met only to guarantee recognition on a specific clock edge.
2a. When A20M# is driven synchronously, it must be driven high (inactive) for the CLK edge prior to the falling edge of RESET to ensure proper operation. A20M# setup and hold times must be met.
2b. When A20M# is driven asynchronously, it should be driven low (active) for two CLKs prior to and two CLKs after the falling edge of RESET to ensure proper operation.
3a. When FLUSH# is driven synchronously, it must be driven low (active) for the CLK edge prior to the falling edge of RESET to invoke the 3-state Output Test Mode. All outputs are guaranteed 3-stated within 10 CLKs of RESET being de-asserted. FLUSH# setup and hold times must be met.
3b. When FLUSH# is driven asynchronously, it must be driven low (active) for two CLKs prior to and two CLKs after the falling edge of RESET to invoke the 3-state Output Test Mode. All outputs are guaranteed 3-stated within 10 CLKs of RESET being de-asserted.
4. AHOLD should be driven high (active) for the CLK edge prior to the falling edge of RESET to invoke the Built-In Self-Test (BIST). AHOLD setup and hold times must be met.
5. HOLD is recognized normally during RESET. On power-up HLDA is indeterminate until RESET is recognized by the processor.
6. 15 CLKs RESET pulse width for warm resets.
Power-up resets require RESET to be asserted for at least 1 ms after VCC and CLK are stable.
7. WB/WT# should be driven high for at least one CLK before the falling edge of RESET and at least one CLK after the falling edge of RESET to enable the Enhanced Bus Mode. The Standard Bus Mode will be enabled if WB/WT# is sampled low or left floating at the falling edge of RESET.
8. The system may sample HITM# to detect the presence of the Enhanced Bus Mode. If HITM# is HIGH for one CLK after RESET is inactive, the Enhanced Bus Mode is present.

Figure 9-7. Pin States during RESET

9.5.2.1 Controlling the CLK Signal in the Processor during Power On

The power on requirements of the Intel486 processor with regards to allowable CLK input during the power on sequence have never been specified. Clocking the processor before VCC has reached its normal operating level can cause unpredictable results on Intel486 processors. While Intel will maintain original clock and power specifications (none), this section reflects what Intel considers to be a good clock design.

Intel strongly recommends that system designers ensure that a clock signal is not presented to the Intel486 processor until VCC has stabilized at its normal operating level. This design recommendation can easily be met by gating the clock signal with a POWERGOOD signal. The POWERGOOD signal should reflect the status of VCC at the Intel486 processor (which may be different from the power supply status in designs that provide power to the processor through the use of a voltage regulator or converter).

Most clock synthesizers and some clock oscillators contain on-board gating logic. If external gating logic is implemented, it should be done on the original clock signal output from the clock oscillator/synthesizer. Gating the clock to the processor independently of the clock to the rest of the motherboard will cause clock skew, which may violate processor or chipset timing requirements.
If the clock signal to the motherboard is enabled with a POWERGOOD signal, it is also important to verify that the motherboard logic does not require a clock input prior to this POWERGOOD signal. Some chipsets also gate the clock to the processor only after a POWERGOOD signal, which inherently meets the requirements of this design note.

Designs should implement this design note, so as to maintain maximum flexibility with all Intel486 processor steppings.

9.5.2.2 FERR# Pin State During Reset for Intel486 DX, IntelDX2, and IntelDX4 Processors

FERR# reflects the state of the ES (error summary status) bit in the floating point unit status word. The ES bit is initialized whenever the floating point unit state is initialized. The floating point unit's status word register can be initialized by BIST or by executing the FINIT/FNINIT instruction. Thus, after reset and before executing the first FINIT or FNINIT instruction, the values of FERR# and the numeric status word register bits 0-7 depend on whether or not BIST is performed. Table 9-11 shows the state of the FERR# signal after reset and before the execution of the FINIT/FNINIT instruction.

Table 9-11. FERR# Pin State after Reset and before FP Instructions

BIST Performed   FERR# Pin                  FPU Status Word Register Bits 0-7
YES              Inactive (High)            Inactive (Low)
NO               Undefined (Low or High)    Undefined (Low or High)

After the first FINIT or FNINIT instruction, the FERR# pin and the FPU status word register bits (0-7) will be inactive irrespective of the Built-In Self-Test (BIST).

9.5.2.3 Power Down Mode (Upgrade Processor Support)

The Power Down Mode on the Intel486 processor, when initiated by the upgrade processor, reduces the power consumption of the Intel486 processor (see Table 17-3, DC Specifications), as well as forces all of its output signals to be tri-stated. The UP# pin on the Intel486 processor is used for enabling the Power Down Mode.
Once the UP# pin is driven active by the upgrade processor upon power-up, the Intel486 processor's bus is floated immediately. The Intel486 processor enters the Power Down Mode when the UP# pin is sampled asserted in the clock before the falling edge of RESET. The UP# pin has no effect on the power down status, except during this edge. The Intel486 processor then remains in the Power Down Mode until the next time the RESET signal is activated. For warm resets, with the upgrade processor in the system, the Intel486 processor will remain tri-stated and re-enter the Power Down Mode once RESET is de-asserted. Similarly for power-up resets, if the upgrade processor is not taken out of the system, the Intel486 processor will tri-state its outputs upon sensing the UP# pin active and enter the Power Down Mode after the falling edge of RESET.

9.6 Clock Control

The Intel486 processor provides an interrupt mechanism (STPCLK#) that allows system hardware to control the power consumption of the processor by stopping the internal clock (output of the PLL) to the processor core in a controlled manner. This low-power state is called the Stop Grant state. In addition, the STPCLK# interrupt allows the system to change the input frequency within the specified range or completely stop the CLK input frequency (input to the PLL). If the CLK input is completely stopped, the processor enters into the Stop Clock state, the lowest power state.

There are two targets for the low-power mode supply current:
• ~20-100 mA in the Stop Grant state (fast wake-up, frequency- and voltage-dependent), and
• ~100-1000 µA in the full Stop Clock state (slow wake-up, voltage-dependent).

See sections 9.6.4.2 and 9.6.4.3 for a detailed description of the Stop Grant and Stop Clock states.

9.6.1 STOP GRANT BUS CYCLE

A special Stop Grant bus cycle will be driven to the bus after the processor recognizes the STPCLK# interrupt. The definition of this bus cycle is the same as the HALT cycle definition for the standard Intel486 processor, with the exception that the Stop Grant bus cycle drives the value 0000 0010H on the address pins. The system hardware must acknowledge this cycle by returning RDY# or BRDY#. The processor will not enter the Stop Grant state until either RDY# or BRDY# has been returned.

The Stop Grant bus cycle is defined as follows:

M/IO# = 0, D/C# = 0, W/R# = 1, Address Bus = 0000 0010H (A4 = 1), BE3#-BE0# = 1011, Data bus = undefined

The latency between a STPCLK# request and the Stop Grant bus cycle is dependent on the current instruction, the amount of data in the processor write buffers, and the system memory performance. (See Figure 9-8.)

9.6.2 PIN STATE DURING STOP GRANT

During the Stop Grant state, most output and input/output signals of the processor will maintain their previous condition (the level they held when entering the Stop Grant state). The data and data parity signals will be tri-stated. In response to HOLD being driven active during the Stop Grant state (when the CLK input is running), the processor will generate HLDA and tri-state all output and input/output signals that are tri-stated during the HOLD/HLDA state. After HOLD is de-asserted all signals will return to their prior state before the HOLD/HLDA sequence.

In order to achieve the lowest possible power consumption during the Stop Grant state, the system designer must ensure the input signals with pull-up resistors are not driven LOW and the input signals with pull-down resistors are not driven HIGH. (See Table 3-11 in the Quick Pin Reference section for signals with internal pull-up and pull-down resistors.)

All inputs, except the data bus pins, must be driven to the power supply rails to ensure the lowest possible current consumption during Stop Grant or Stop Clock modes. For compatibility with future processors, data pins should be driven low to achieve the lowest possible power consumption.
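The Stop Grant bus cycle definition in section 9.6.1 lends itself to a short decoding sketch, for example as it might appear in a simulation of system logic that watches for the cycle. The Python below is a minimal illustration under stated assumptions: the function names are hypothetical, pin levels are passed as 0 or 1, and the four byte enables are packed into one value with BE3# in bit 3.

```python
STOP_GRANT_ADDRESS = 0x00000010   # only A4 is high
STOP_GRANT_BYTE_ENABLES = 0b1011  # BE3#..BE0# levels, so BE2# is the active one


def is_stop_grant_cycle(m_io, d_c, w_r, address, byte_enables):
    """Return True if the sampled pin levels match the Stop Grant bus cycle.

    m_io, d_c, w_r: levels of M/IO#, D/C# and W/R# (0 or 1).
    address: value on the address pins; A1-A0 are not pins and are masked.
    byte_enables: BE3#..BE0# packed as bits 3..0 (1 = pin high).
    """
    return (
        (m_io, d_c, w_r) == (0, 0, 1)
        and (address & 0xFFFFFFFC) == STOP_GRANT_ADDRESS
        and byte_enables == STOP_GRANT_BYTE_ENABLES
    )


def stop_grant_state_entered(cycle_seen, rdy_n, brdy_n):
    """The state is entered only once RDY# or BRDY# (active low) is returned."""
    return cycle_seen and (rdy_n == 0 or brdy_n == 0)


seen = is_stop_grant_cycle(m_io=0, d_c=0, w_r=1,
                           address=0x00000010, byte_enables=0b1011)
entered = stop_grant_state_entered(seen, rdy_n=0, brdy_n=1)
```

Until one of the ready inputs is returned, `stop_grant_state_entered` stays false, which corresponds to the requirement that system hardware acknowledge the cycle.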
Pull-down resistors/bus keepers are needed to minimize leakage current. If HOLD is asserted during the Stop Grant state, all pins that are normally floated during HLDA will still be floated by the processor. The floated pins should be driven to a low level. (See Table 9-12.)

Figure 9-8. Stop Clock Protocol (timing diagram: CLK, STPCLK#, ADDR, RDY#)

Table 9-12. Pin State during Stop Grant Bus State

Signal                 Type   State
A3-A2                  O      Previous state
A31-A4                 I/O    Previous state
D31-D0                 I/O    Floated
BE3#-BE0#              O      Previous state
DP3-DP0                I/O    Floated
W/R#, D/C#, M/IO#      O      Previous state
ADS#                   O      Inactive
LOCK#, PLOCK#          O      Inactive
BREQ                   O      Previous state
HLDA                   O      As per HOLD
BLAST#                 O      Previous state
FERR#                  O      Previous state
PCD, PWT               O      Previous state
PCHK#                  O      Previous state
SMIACT#                O      Previous state

9.6.3 WRITE-BACK ENHANCED INTELDX2 PIN STATE DURING STOP GRANT SPECIFICS

During the Stop Grant state, most output signals of the processor will maintain their previous condition, which is the level they held when entering the Stop Grant state. The data bus and data parity signals also maintain their previous state. In response to HOLD being driven active during the Stop Grant state when the CLK input is running, the Write-Back Enhanced IntelDX2 processor will generate HLDA and tri-state all output and input/output signals that are tri-stated during the HOLD/HLDA state. After HOLD is de-asserted all signals will return to the state they were in prior to the HOLD/HLDA sequence.

All inputs should be driven to the power supply rails to ensure the lowest possible current consumption during the Stop Grant or Stop Clock states. (See Table 9-13.)

The Write-Back Enhanced IntelDX2 processor has bus keeper features. The data bus and data parity pins have bus keepers that maintain the previous state while in the Stop Grant state. External resistors are no longer required, which prevents excess current during the Stop Grant state.
(If external resistors are present, they should be strong enough to "flip" the bus hold circuitry and eliminate potential DC paths. Alternately, "weak" resistors may also be added to prevent excessive current flow.) See section 17.3.3, "External Resistors Recommended to Minimize Leakage Currents," for external resistor values.

In order to obtain the lowest possible power consumption during the Stop Grant state, system designers must ensure that the input signals with pull-up resistors are not driven LOW, and the input signals with pull-down resistors are not driven HIGH. (See Table 3-11 for signals with internal pull-up and pull-down resistors.)

Table 9-13. Write-Back Enhanced IntelDX2 Pin State during Stop Grant Bus Cycle

Signal                 Type   State
A3-A2                  O      Previous state
A31-A4                 I/O    Previous state
D31-D0                 I/O    Previous state
BE3#-BE0#              O      Previous state
DP3-DP0                I/O    Previous state
W/R#, D/C#, M/IO#      O      Previous state
ADS#                   O      Inactive (high)
LOCK#, PLOCK#          O      Inactive (high)
BREQ                   O      Previous state
HLDA                   O      As per HOLD
BLAST#                 O      Previous state
FERR#                  O      Previous state
PCHK#                  O      Previous state
PWT, PCD               O      Previous state
CACHE#                 O      Inactive(1) (high)
HITM#                  O      Inactive(1) (high)
SMIACT#                O      Previous state

NOTES:
1. For the case of snoop cycles (via EADS#) during Stop Grant state, both HITM# and CACHE# may go active depending on the snoop hit in the internal cache.

During Stop Grant state, AHOLD, HOLD, BOFF# and EADS# are serviced normally.

9.6.4 CLOCK CONTROL STATE DIAGRAM

The following state descriptions and diagram show the state transitions during a Stop Clock cycle for the Intel486 processor. (Refer to Figure 9-9 for a Stop Clock state diagram.) Refer to section 9.6.5 for Write-Back Enhanced IntelDX2 processor Clock State specifics.

9.6.4.1 Normal State

This is the normal operating state of the processor.
9.6.4.2 Stop Grant State

The Stop Grant state provides a fast wake-up state that can be entered by simply asserting the external STPCLK# interrupt pin. Once the Stop Grant bus cycle has been placed on the bus, and either RDY# or BRDY# is returned, the processor is in this state (depending on the CLK input frequency). The processor returns to the normal execution state 10-20 clock periods after STPCLK# has been de-asserted.

While in the Stop Grant state, the pull-up resistors on STPCLK#, CLKMUL (for the IntelDX4 processor) and UP# are disabled internally. The system must continue to drive these inputs to the state they were in immediately before the processor entered the Stop Grant state. For minimum processor power consumption, all other input pins should be driven to their inactive level while the processor is in the Stop Grant state.

A RESET or SRESET will bring the processor from the Stop Grant state to the Normal state. The processor will recognize the inputs required for cache invalidations (HOLD, AHOLD, BOFF# and EADS#) as explained later in this section. The processor will not recognize any other inputs while in the Stop Grant state. Input signals to the processor will not be recognized until 1 CLK after STPCLK# is de-asserted (see Figure 9-10).

While in the Stop Grant state, the processor will not recognize transitions on the interrupt signals (SMI#, NMI, and INTR). Driving an active edge on either SMI# or NMI will not guarantee recognition and service of the interrupt request following exit from the Stop Grant state. However, if one of the interrupt signals (SMI#, NMI, or INTR) is driven active while the processor is in the Stop Grant state, and held active for at least one CLK after STPCLK# is de-asserted, the corresponding interrupt will be serviced. The Intel486 processor requires INTR to be held active until the processor issues an interrupt acknowledge cycle in order to guarantee recognition. (See Figure 9-10.)
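The interrupt recognition rule above can be written as a small predicate. The Python sketch below is only an illustration: the function name, the use of integer CLK indices as a time base, and the `inta_clk` parameter are assumptions made for the example and do not come from the datasheet.

```python
def recognition_guaranteed(signal, last_active_clk, stpclk_deassert_clk,
                           inta_clk=None):
    """Is an interrupt that was driven active during Stop Grant serviced?

    signal: "SMI#", "NMI" or "INTR".
    last_active_clk: index of the last CLK in which the signal is still active.
    stpclk_deassert_clk: index of the CLK in which STPCLK# is de-asserted.
    inta_clk: index of the CLK in which the interrupt acknowledge cycle is
        issued (needed for INTR only).
    """
    if signal not in ("SMI#", "NMI", "INTR"):
        raise ValueError("unknown interrupt signal: %r" % (signal,))
    # The signal must stay active for at least one CLK after STPCLK# goes
    # inactive; an edge that comes and goes inside Stop Grant is not enough.
    held_past_exit = last_active_clk >= stpclk_deassert_clk + 1
    if signal == "INTR":
        # INTR must additionally be held until interrupt acknowledge.
        if inta_clk is None:
            return False
        return held_past_exit and last_active_clk >= inta_clk
    return held_past_exit


# NMI pulsed and released while still in Stop Grant: not guaranteed.
pulse_inside = recognition_guaranteed("NMI", last_active_clk=8,
                                      stpclk_deassert_clk=10)
# NMI held one CLK past the de-assertion of STPCLK#: guaranteed.
held_nmi = recognition_guaranteed("NMI", last_active_clk=11,
                                  stpclk_deassert_clk=10)
```

The separate INTR branch reflects the additional requirement that INTR stay active until the interrupt acknowledge cycle.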
When the processor is in the Stop Grant state, the system is allowed to stop or change the CLK input. When the CLK input to the processor is stopped (or changed), the Intel486 processor requires the CLK input to be held at a constant frequency for a minimum of 1 ms before de-asserting STPCLK#. This 1-ms time period is necessary so that the PLL can stabilize, and it must be met before the processor will return to the Stop Grant state.

Figure 9-9. Intel486 Processor Family Stop Clock State Machine (state diagram: 1 Normal State, 2 Stop Grant State, 3 Stop Clock State, 4 Auto HALT Power Down State, 5 Stop Clock Snoop State)
* The system can change the input frequency within the specified range or completely stop the CLK input frequency (input to PLL).

The processor will generate a Stop Grant bus cycle only when entering that state from the Normal or the Auto HALT Power Down state. When the processor enters the Stop Grant state from the Stop Clock state or the Stop Clock Snoop state, the processor will not generate a Stop Grant bus cycle.

9.6.4.3 Stop Clock State

Stop Clock state is entered from the Stop Grant state by stopping the CLK input (either logic high or logic low). None of the processor input signals should change state while the CLK input is stopped. Any transition on an input signal (with the exception of INTR, NMI and SMI#) before the processor has returned to the Stop Grant state will result in unpredictable behavior.
If INTR is driven active while the CLK input is stopped, and held active until the processor issues an interrupt acknowledge bus cycle, it will be serviced in the normal manner. The system design must ensure the processor is in the correct state prior to asserting cache invalidation or interrupt signals to the processor.

Figure 9-10. Recognition of Inputs when Exiting Stop Grant State (timing diagram: CLK, STPCLK#, NMI, SMI#; point A marks the earliest time at which NMI or SMI# will be recognized)

The processor will return to the Stop Grant state after the CLK input has been running at a constant frequency for a period of time equal to the PLL startup latency (see section 9.6.4.2). The CLK input can be restarted to any frequency between the minimum and maximum frequency listed in the AC timing specifications.

9.6.4.4 Auto HALT Power Down State

The execution of a HALT instruction will also cause the processor to automatically enter the Auto HALT Power Down state. The processor will issue a normal HALT bus cycle before entering this state. The processor will transition to the Normal state on the occurrence of INTR, NMI, SMI#, RESET, or SRESET.

The system can generate a STPCLK# while the processor is in the Auto HALT Power Down state. The processor will generate a Stop Grant bus cycle when it enters the Stop Grant state from the HALT state.

When the system de-asserts the STPCLK# interrupt, the processor will return execution to the HALT state. The processor will generate a new HALT bus cycle when it re-enters the HALT state from the Stop Grant state.

9.6.4.5 Stop Clock Snoop State (Cache Invalidations)

When the processor is in the Stop Grant state or the Auto HALT Power Down state, the processor will recognize HOLD, AHOLD, BOFF# and EADS# for cache invalidation. When the system asserts HOLD, AHOLD, or BOFF#, the processor will float the bus accordingly. When the system then asserts EADS#, the processor will transparently enter the Stop Clock Snoop state and will power up for 1 full core clock in order to perform the required cache snoop cycle. It will then re-freeze the clock to the processor core and return to the previous state. The processor does not generate a bus cycle when it returns to the previous state.

A FLUSH# event during the Stop Grant state or the Auto HALT Power Down state will be latched and acted upon by asserting the internal FLUSH# signal for one clock upon re-entering the Normal state.

9.6.4.6 Auto Idle Power Down State

When the chip is known to be truly idle and waiting for a RDY# or BRDY# from a memory or I/O bus cycle read, the Intel486 processor will reduce its core clock rate to be equal to the external CLK frequency without affecting performance. When any RDY# or BRDY# is asserted, the part will return to clocking the core at the specified multiplier of the external CLK frequency. This functionality is transparent to software and external hardware.

9.6.5 WRITE-BACK ENHANCED INTELDX2 PROCESSOR CLOCK CONTROL STATE DIAGRAM

Figure 9-11 (state diagram) shows the state transitions during Stop Clock for the Write-Back Enhanced IntelDX2 processor.

NOTE: The Stop Clock State Machine in the Standard bus configuration is identical to that of other Intel486 processors. (See section 9.6.4, "Clock Control State Diagram".)

9.6.5.1 Normal State

This is the normal operating state of the processor. When the processor is executing program/instruction and the STPCLK# pin is not asserted, the processor is said to be in its normal state.
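The Stop Clock state machine of section 9.6.4 (Figure 9-9), which also applies to the Standard bus configuration, can be sketched as a small transition function. The Python below is a simplified model for illustration only, not Intel's implementation: the class name, state strings, and event names (for example `START_CLK_PLL_LOCKED`) are hypothetical, the Stop Clock Snoop state is treated as a transient visit that is only counted, and timing such as the PLL startup latency is not modeled.

```python
class ClockControlModel:
    """Simplified model of the Intel486 Stop Clock state machine (section 9.6.4)."""

    def __init__(self):
        self.state = "NORMAL"
        self.resume_state = "NORMAL"  # state to return to when STPCLK# goes inactive
        self.snoops = 0               # visits to the Stop Clock Snoop state

    def event(self, name):
        """Apply one event and return the bus cycle generated, or None."""
        state = self.state
        if state == "NORMAL":
            if name == "HALT":
                self.state = "AUTO_HALT"
                return "HALT"
            if name == "STPCLK_ASSERT":
                self.state, self.resume_state = "STOP_GRANT", "NORMAL"
                return "STOP_GRANT"
        elif state == "AUTO_HALT":
            if name in ("INTR", "NMI", "SMI", "RESET", "SRESET"):
                self.state = "NORMAL"
                return None
            if name == "STPCLK_ASSERT":
                self.state, self.resume_state = "STOP_GRANT", "AUTO_HALT"
                return "STOP_GRANT"
            if name == "EADS":
                self.snoops += 1      # snoop, then back to the same state
                return None
        elif state == "STOP_GRANT":
            if name in ("RESET", "SRESET"):
                self.state = "NORMAL"
                return None
            if name == "STPCLK_DEASSERT":
                self.state = self.resume_state
                # A new HALT bus cycle is generated when re-entering HALT.
                return "HALT" if self.state == "AUTO_HALT" else None
            if name == "STOP_CLK":
                self.state = "STOP_CLOCK"
                return None
            if name == "EADS":
                self.snoops += 1
                return None
        elif state == "STOP_CLOCK":
            if name == "START_CLK_PLL_LOCKED":
                self.state = "STOP_GRANT"  # no Stop Grant bus cycle here
                return None
        raise ValueError("event %r is not modeled in state %s" % (name, state))


cpu = ClockControlModel()
cycles = [cpu.event(e) for e in
          ("HALT", "STPCLK_ASSERT", "EADS", "STOP_CLK",
           "START_CLK_PLL_LOCKED", "STPCLK_DEASSERT")]
```

Events that the text describes as unpredictable or unrecognized in a given state, such as an input transition while CLK is stopped, are rejected with an exception rather than guessed at.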
Figure 9-11. Write-Back Enhanced IntelDX2 Processor Stop Clock State Machine (Enhanced Bus Configuration) (state diagram: 1 Normal State, 2 Stop Grant State, 3 Stop Clock State, 4 Auto HALT Power Down State, 5 Stop Clock Snoop State, 6 Auto HALT Power Down Flush State)

9.6.5.2 Stop Grant State

For minimum processor power consumption, all other input pins should be driven to their inactive level while the processor is in the Stop Grant state, except the data bus, data parity, WB/WT# and INV pins. WB/WT# should be driven low and INV should be driven high.

In both the Standard mode and Enhanced mode states, the following conditions exist:
• A RESET, SRESET or de-assertion of STPCLK# will bring the processor from the Stop Grant state to the Normal state.
• While in the Stop Grant state, the processor will not recognize transitions on the interrupt signals (SMI#, NMI, and INTR). This means SMI#, NMI, and INTR are not Stop Break events. The external logic should de-assert STPCLK# before issuing interrupts, or if an interrupt is asserted it should be kept asserted for at least 1 clock after STPCLK# is removed. (Note that the Write-Back Enhanced IntelDX2 processor requires that INTR be held active until the processor issues an interrupt acknowledge cycle in order to guarantee recognition.)
• FLUSH# is not a Stop Break event. But if FLUSH# is asserted during the Stop Grant state, it is latched by the Write-Back Enhanced IntelDX2 processor and serviced later when STPCLK# is de-asserted.
• The processor will latch and respond to the inputs BOFF#, EADS#, AHOLD, and HOLD. The processor will not recognize any other inputs while in the Stop Grant state except FLUSH#. Other input signals to the processor will not be recognized until the CLK following the CLK in which STPCLK# is de-asserted. (See Figure 9-11.)
• The processor will generate a Stop Grant bus cycle only when entering that state from the Normal or the Auto HALT Power Down state. The Stop Grant bus cycle is not generated when the processor enters the Stop Grant state from the Stop Clock state or the Stop Clock Snoop state.
• The processor will not enter the Stop Grant state until all the pending writes are completed, all pending interrupts are serviced and the processor is idle.

9.6.5.3 Stop Clock State

Stop Clock state is the lowest power consumption mode in the processor, because it allows removal of the external clock. It also has the longest latency for returning to normal state. The Stop Clock state is entered from the Stop Grant state by stopping the CLK input. In the Stop Clock state, total processor supply current drops to 100 µA, which is approximately 200-250 times lower than the Stop Grant state. None of the processor input signals should change state while the CLK input is stopped. Any transition on an input signal before the processor has returned to the Stop Grant state will result in unpredictable behavior. If INTR is driven active, it must be held active until the processor issues an interrupt acknowledge cycle.

In the Stop Clock state, the processor is dormant. It does not respond to any transitions on any of the input pins including snoops, flushes and interrupts. It is recommended that this mode only be entered if the processor cache is coherent with main memory and the processor is not processing any interrupts. If this mode is entered with a dirty cache, no alternate master cycles can be allowed while the processor is in the Stop Clock state.
The processor will return to the Stop Grant state after the CLK input has been running at a constant frequency for a period of time equal to the PLL startup latency. The CLK input can be restarted to any frequency between the minimum and maximum frequency listed in the AC timing specifications.

In Enhanced Bus Mode

If the processor is taken into the Stop Clock state with a dirty cache, alternate bus master cycles are not allowed while the processor remains in the Stop Clock state. In order to take the processor into the Stop Clock state with a clean cache, the cache must be flushed. During the time the cache is being flushed, the system must block interrupts to the processor. With all interrupts other than STPCLK# blocked, the processor does not write into the cache during the time from the completion of the flush to the time it enters the Stop Grant state. This is necessary for the cache to be coherent. To ensure this, the system should drive KEN# inactive from the time the flush starts until the Stop Grant cycle is issued. The system can then put the processor in the Stop Clock state by stopping the CLK input.

If the processor is already in the Stop Grant state and entering the Stop Clock state is desired, the system must de-assert STPCLK# before flushing the cache in order to ensure the cache coherency. The 5-clock de-assertion specification for STPCLK# must also be met before the above sequence can occur.

9.6.5.4 Auto HALT Power Down State

Upon execution of a HALT instruction, the processor will automatically enter a low power state, called the Auto HALT Power Down state. The processor will issue a normal HALT bus cycle when entering this state. Because interrupts are HALT break events, the processor will transition to the Normal state on the occurrence of INTR, NMI, SMI# or RESET (SRESET is also a HALT break event).
If there is a FLUSH# while the processor is in this state, the FLUSH# will be serviced by transitioning to the Stop Clock Flush state. After the FLUSH# is completed, the processor returns to the Auto HALT Power Down state.

The system can generate a STPCLK# while the processor is in the Auto HALT Power Down state. The processor will then generate a Stop Grant bus cycle and enter the Stop Grant state from the Auto HALT Power Down state. When the system de-asserts the STPCLK# interrupt, the processor will return to the Auto HALT Power Down state. The processor will not generate a new HALT bus cycle when it re-enters the Auto HALT Power Down state from the Stop Grant state.

9.6.6 STOP CLOCK SNOOP STATE (CACHE INVALIDATIONS)

When the processor is in the Stop Grant state or the Auto HALT Power Down state, the processor will recognize HOLD, AHOLD, BOFF#, and EADS# for cache invalidation. When the system asserts HOLD, AHOLD, or BOFF#, the processor will float the bus accordingly. When the system asserts EADS#, the processor will transparently enter the Stop Clock Snoop state and will power up in order to perform the required cache snoop cycle and write-back cycles. It will then refreeze the CLK to the processor core and return to the previous state (i.e., either the Stop Grant state or the Auto HALT Power Down state). The processor does not generate a bus cycle when it returns to the previous state.

9.6.6.1 Auto HALT Power Down Flush State (Cache Flush) for the Write-Back Enhanced IntelDX2

If the processor is in either Standard or Enhanced mode and a FLUSH# event occurs during the Auto HALT Power Down state, the processor will transition to the Auto HALT Power Down Flush state. If the on-chip cache is configured as a write-back cache, the CLK to the processor core is turned on until all the dirty lines are written back, the cache is invalidated, and the two flush acknowledge cycles are completed.
If the on-chip cache is configured as a write-through cache, the CLK to the processor core is turned on until the cache is invalidated. The processor then refreezes the CLK and returns to the previous state (i.e., the Auto HALT Power Down state). The Auto HALT Power Down Flush state is entered only from the Auto HALT Power Down state and not from the Stop Grant state.

9.6.7 SUPPLY CURRENT MODEL FOR STOP CLOCK MODES AND TRANSITIONS

Figures 9-12 and 9-13 illustrate the effect of different Stop Clock state transitions on the supply current of the Intel486 processor.

Figure 9-12. Supply Current Model for Stop Clock Modes and Transitions for the Intel486 Processor
(ICC versus CLK frequency for the Normal, Stop Grant and Stop Clock states. A = STPCLK# asserted, CLK not changed; A' = STPCLK# de-asserted, CLK not changed, short (10 CLK) latency.)

Figure 9-13. Supply Current Model for Stop Clock Modes and Transitions for the IntelDX4 Processor
(ICC versus CLK frequency for the Normal, Stop Grant and Stop Clock states. A = STPCLK# asserted, CLK not changed; A' = STPCLK# de-asserted, CLK not changed, short (10 CLK) latency.)

10.0 BUS OPERATION

All Intel486 processors operate in Standard Bus (write-through) mode. However, when the internal cache of the Write-Back Enhanced IntelDX2 processor is configured in write-back mode, the processor bus operates in the Enhanced Bus mode, which is described in section 10.3. When the internal cache of the Write-Back Enhanced IntelDX2 processor is configured in write-through mode, the processor bus operates in Standard Bus mode, identical to the other Intel486 processors in Standard Bus mode.

10.1 Data Transfer Mechanism

All data transfers occur as a result of one or more bus cycles. Logical data operands of byte, word and doubleword lengths may be transferred without restrictions on physical address alignment. Data may be accessed at any byte boundary, but two or three cycles may be required for unaligned data transfers. (See section 10.1.2, "Dynamic Data Bus Sizing," and section 10.1.5, "Operand Alignment.")

The Intel486 processor address signals are split into two components. High-order address bits are provided by the address lines, A2-A31. The byte enables, BE0#-BE3#, form the low-order address and provide linear selects for the four bytes of the 32-bit address bus.

The byte enable outputs are asserted when their associated data bus bytes are involved with the present bus cycle, as listed in Table 10-1. Byte enable patterns that have a negated byte enable separating two or three asserted byte enables will never occur (see Table 10-5). All other byte enable patterns are possible.

Table 10-1. Byte Enables and Associated Data and Operand Bytes

Byte Enable Signal    Associated Data Bus Signals
BE0#                  D0-D7   (byte 0, least significant)
BE1#                  D8-D15  (byte 1)
BE2#                  D16-D23 (byte 2)
BE3#                  D24-D31 (byte 3, most significant)

Address bits A0 and A1 of the physical operand's base address can be created when necessary. Use of the byte enables to create A0 and A1 is shown in Table 10-2. The byte enables can also be decoded to generate BLE# (byte low enable) and BHE# (byte high enable). These signals are needed to address 16-bit memory systems. (See section 10.1.3, "Interfacing with 8-, 16-, and 32-Bit Memories.")

10.1.1 MEMORY AND I/O SPACES

Bus cycles may access physical memory space or I/O space. Peripheral devices in the system may either be memory-mapped, or I/O-mapped, or both. Physical memory addresses range from 00000000H to FFFFFFFFH (4 gigabytes). I/O addresses range from 00000000H to 0000FFFFH (64 Kbytes) for programmed I/O. (See Figure 10-1.)

Table 10-2. Generating A0-A31 from BE0#-BE3# and A2-A31

Physical Base Address        Intel486 Processor Address Signals
A31 ... A2   A1   A0         A31 ... A2   BE3#   BE2#   BE1#   BE0#
A31 ... A2   0    0          A31 ... A2   X      X      X      Low
A31 ... A2   0    1          A31 ... A2   X      X      Low    High
A31 ... A2   1    0          A31 ... A2   X      Low    High   High
A31 ... A2   1    1          A31 ... A2   Low    High   High   High

Figure 10-1. Physical Memory and I/O Spaces
(Physical memory space: 00000000H to FFFFFFFFH, 4 Gbyte. I/O space: 00000000H to 0000FFFFH, 64 Kbyte accessible programmed I/O space.)

10.1.1.1 Memory and I/O Space Organization

The Intel486 processor datapath to memory and input/output (I/O) spaces can be 32, 16 or 8 bits wide. The byte enable signals, BE0#-BE3#, allow byte granularity when addressing any memory or I/O structure, whether 8, 16 or 32 bits wide.

The Intel486 processor includes bus control pins, BS16# and BS8#, which allow direct connection to 16- and 8-bit memories and I/O devices. Cycles to 32-, 16- and 8-bit devices may occur in any sequence, since the BS8# and BS16# signals are sampled during each bus cycle.

32-bit wide memory and I/O spaces are organized as arrays of physical 4-byte words. Each memory or I/O 4-byte word has four individually addressable bytes at consecutive byte addresses (see Figure 10-2). The lowest-addressed byte is associated with data signals D0-D7; the highest-addressed byte with D24-D31. Physical 4-byte words begin at addresses divisible by four.
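The relationship in Table 10-2 between the byte enables and the low-order address bits can be sketched in a few lines. The following Python fragment is a minimal illustration, not part of the Intel documentation: the function names are hypothetical, and the byte enables are assumed to be packed into a 4-bit integer in which bit i holds the level of BEi# (0 = asserted, 1 = negated).

```python
def asserted_bytes(be):
    """Indices of the asserted (low) byte enables; bit i of `be` is the level of BEi#."""
    return [i for i in range(4) if not (be >> i) & 1]


def is_valid_pattern(be):
    """True if at least one byte enable is asserted and the asserted ones are contiguous."""
    active = asserted_bytes(be)
    return bool(active) and active == list(range(active[0], active[-1] + 1))


def low_address_bits(be):
    """Return (A1, A0) of the operand base address: the index of the lowest asserted byte."""
    if not is_valid_pattern(be):
        raise ValueError("non-occurring byte enable pattern")
    first = asserted_bytes(be)[0]
    return (first >> 1) & 1, first & 1


def physical_base_address(a31_a2, be):
    """Combine the A31-A2 value with the A1/A0 bits derived from the byte enables."""
    a1, a0 = low_address_bits(be)
    return (a31_a2 << 2) | (a1 << 1) | a0
```

With this encoding, `0b0011` (BE3# and BE2# asserted) yields A1 = 1 and A0 = 0, and exactly ten of the sixteen possible patterns are accepted as valid.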
Figure 10-2. Physical Memory and I/O Space Organization

16-bit memories are organized as arrays of physical 2-byte words. Physical 2-byte words begin at addresses divisible by two. The byte enables, BE0#-BE3#, must be decoded to A1, BLE# and BHE# to address 16-bit memories. (See section 10.1.3, "Interfacing with 8-, 16- and 32-Bit Memories.")

To address 8-bit memories, the two low-order address bits, A0 and A1, must be decoded from BE0#-BE3#. The same logic can be used for 8- and 16-bit memories, because the decoding logic for BLE# and A0 is the same. (See section 10.1.3, "Interfacing with 8-, 16- and 32-Bit Memories.")

10.1.2 DYNAMIC DATA BUS SIZING

Dynamic data bus sizing is a feature allowing processor connection to 32-, 16- or 8-bit buses for memory or I/O. The Intel486 processor may connect to all three bus sizes. Transfers to or from 32-, 16- or 8-bit devices are supported by dynamically determining the bus width during each bus cycle. Address decoding circuitry may assert BS16# for 16-bit devices, or BS8# for 8-bit devices, during each bus cycle. BS8# and BS16# must be negated when addressing 32-bit devices. An 8-bit bus width is selected if both BS16# and BS8# are asserted.

BS16# and BS8# force the Intel486 processor to run additional bus cycles to complete requests larger than 16 or 8 bits. A 32-bit transfer will be converted into two 16-bit transfers (or 3 transfers if the data is misaligned) when BS16# is asserted. Asserting BS8# will convert a 32-bit transfer into four 8-bit transfers.

Extra cycles forced by BS16# or BS8# should be viewed as independent bus cycles. BS16# or BS8# must be driven active during each of the extra cycles unless the addressed device has the ability to change the number of bytes it can return between cycles.

The Intel486 processor will drive the byte enables appropriately during extra cycles forced by BS8# and BS16#.
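How the byte enables advance across the extra cycles can be sketched as follows. This Python fragment is an illustration under stated assumptions rather than Intel's logic: the function names are hypothetical, bit i of `be` holds the level of BEi# (0 = asserted), and the rule used (an 8-bit cycle satisfies the lowest requested byte, a 16-bit cycle satisfies the requested bytes in the lowest 16-bit half) is inferred so as to agree with the values listed in Table 10-3.

```python
def next_byte_enables(be, bus_bits):
    """Byte enables for the next forced cycle, or None if the request is satisfied.

    be: 4-bit int, bit i = level of BEi# (0 = asserted).
    bus_bits: 8 when BS8# is asserted, 16 when only BS16# is asserted.
    """
    active = [i for i in range(4) if not (be >> i) & 1]
    if not active:
        raise ValueError("no byte enable asserted")
    if bus_bits == 8:
        done = active[:1]                       # one byte per cycle, lowest first
    elif bus_bits == 16:
        half = active[0] // 2                   # 16-bit half holding the lowest byte
        done = [i for i in active if i // 2 == half]
    else:
        raise ValueError("bus_bits must be 8 or 16")
    remaining = [i for i in active if i not in done]
    if not remaining:
        return None
    nxt = 0b1111
    for i in remaining:
        nxt &= ~(1 << i)                        # assert (drive low) each remaining byte
    return nxt & 0b1111


def cycles_needed(be, bus_bits):
    """Number of bus cycles used when every cycle is answered with the same bus size."""
    count = 1
    nxt = next_byte_enables(be, bus_bits)
    while nxt is not None:
        count += 1
        nxt = next_byte_enables(nxt, bus_bits)
    return count
```

For an aligned 32-bit request (`0b0000`), the sketch gives four cycles with BS8# and two cycles with BS16#, matching the conversion described above.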
A2-A31 will not change if accesses are to a 32-bit aligned area. Table 10-3 shows the set of byte enables that will be generated on the next cycle for each of the valid possibilities of the byte enables on the current cycle.

The dynamic bus sizing feature of the Intel486 processor is significantly different from that of the Intel386 processor. Unlike the Intel386 processor, the Intel486 processor requires that data bytes be driven on the addressed data pins. The simplest example of this function is a 32-bit aligned, BS16# read. When the Intel486 processor reads the two high-order bytes, they must be driven on the data bus pins D16-D31. The Intel486 processor expects the two low-order bytes on D0-D15. The Intel386 processor expects both the high- and low-order bytes on D0-D15. The Intel386 processor always reads or writes data on the lower 16 bits of the data bus when BS16# is asserted.

The external system must contain buffers to enable the Intel486 processor to read and write data on the appropriate data bus pins. Table 10-4 shows the data bus lines to which the Intel486 processor expects data to be returned for each valid combination of byte enables and bus sizing options.

Table 10-3. Next Byte Enable Values for BSn# Cycles

Current                    Next with BS8#             Next with BS16#
BE3# BE2# BE1# BE0#        BE3# BE2# BE1# BE0#        BE3# BE2# BE1# BE0#
1    1    1    0           n    n    n    n           n    n    n    n
1    1    0    0           1    1    0    1           n    n    n    n
1    0    0    0           1    0    0    1           1    0    1    1
0    0    0    0           0    0    0    1           0    0    1    1
1    1    0    1           n    n    n    n           n    n    n    n
1    0    0    1           1    0    1    1           1    0    1    1
0    0    0    1           0    0    1    1           0    0    1    1
1    0    1    1           n    n    n    n           n    n    n    n
0    0    1    1           0    1    1    1           n    n    n    n
0    1    1    1           n    n    n    n           n    n    n    n

NOTE: "n" means that another bus cycle will not be required to satisfy the request.

Table 10-4. Data Pins Read with Different Bus Sizes

BE3# BE2# BE1# BE0#        w/o BS8#/BS16#    w BS8#      w BS16#
1    1    1    0           D7-D0             D7-D0       D7-D0
1    1    0    0           D15-D0            D7-D0       D15-D0
1    0    0    0           D23-D0            D7-D0       D15-D0
0    0    0    0           D31-D0            D7-D0       D15-D0
1    1    0    1           D15-D8            D15-D8      D15-D8
1    0    0    1           D23-D8            D15-D8      D15-D8
0    0    0    1           D31-D8            D15-D8      D15-D8
1    0    1    1           D23-D16           D23-D16     D23-D16
0    0    1    1           D31-D16           D23-D16     D31-D16
0    1    1    1           D31-D24           D31-D24     D31-D24

Valid data will only be driven onto data bus pins corresponding to active byte enables during write cycles. Other pins in the data bus will be driven but they will not contain valid data. Unlike the Intel386 processor, the Intel486 processor will not duplicate write data onto parts of the data bus for which the corresponding byte enable is negated.

10.1.3 INTERFACING WITH 8-, 16- AND 32-BIT MEMORIES

In 32-bit physical memories, such as the one shown in Figure 10-3, each 4-byte word begins at a byte address that is a multiple of four. A2-A31 are used as a 4-byte word select. BE0#-BE3# select individual bytes within the 4-byte word. BS8# and BS16# are negated for all bus cycles involving the 32-bit array.

16- and 8-bit memories require external byte swapping logic for routing data to the appropriate data lines and logic for generating BHE#, BLE# and A1. In systems where mixed memory widths are used, extra address decoding logic is necessary to assert BS16# or BS8#.

Figure 10-3. Intel486 Processor with 32-Bit Memory

Figure 10-4 shows the Intel486 processor address bus interface to 32-, 16- and 8-bit memories. To address 16-bit memories the byte enables must be decoded to produce A1, BHE# and BLE# (A0). For 8-bit wide memories the byte enables must be decoded to produce A0 and A1. The same byte select logic can be used in 16- and 8-bit systems, because BLE# is exactly the same as A0. (See Table 10-5.)
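The decoding of the byte enables into A1, BHE# and BLE# (A0) can be expressed compactly. The Python sketch below is one possible formulation, offered as an illustration rather than as the logic of Figure 10-5: the function name is hypothetical, bit i of `be` holds the level of BEi# (0 = low, 1 = high), and the result for non-occurring patterns is simply whatever the expressions produce, since those inputs are "don't care" conditions.

```python
def decode_16bit(be):
    """Return the levels (A1, BHE#, BLE#) for a 16-bit device; 0 = low, 1 = high.

    be: 4-bit int, bit i = level of BEi#.
    """
    be0, be1, be2, be3 = [(be >> i) & 1 for i in range(4)]
    # A1 is high only when neither byte of the low 16-bit word is enabled.
    a1 = be0 & be1
    if a1:
        ble, bhe = be2, be3     # the access starts in the upper word (D16-D31)
    else:
        ble, bhe = be0, be1     # the access starts in the lower word (D0-D15)
    return a1, bhe, ble
```

Because BLE# is the same signal as A0, the pair (A1, BLE#) returned here can also serve as the two low-order address bits for an 8-bit device.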
BE0#-BE3# can be decoded as shown in Table 10-5 to generate A1, BHE# and BLE#. The byte select logic necessary to generate BHE# and BLE# is shown in Figure 10-5.

Figure 10-4. Addressing 16- and 8-Bit Memories

Table 10-5. Generating A1, BHE# and BLE# for Addressing 16-Bit Devices

Intel486 Processor Signals     8-, 16-Bit Bus Signals
BE3#  BE2#  BE1#  BE0#         A1    BHE#   BLE# (A0)    Comments
H*    H*    H*    H*           x     x      x            x-no active bytes
H     H     H     L            L     H      L
H     H     L     H            L     L      H
H     H     L     L            L     L      L
H     L     H     H            H     H      L
H*    L*    H*    L*           x     x      x            x-not contiguous bytes
H     L     L     H            L     L      H
H     L     L     L            L     L      L
L     H     H     H            H     L      H
L*    H*    H*    L*           x     x      x            x-not contiguous bytes
L*    H*    L*    H*           x     x      x            x-not contiguous bytes
L*    H*    L*    L*           x     x      x            x-not contiguous bytes
L     L     H     H            H     L      L
L*    L*    H*    L*           x     x      x            x-not contiguous bytes
L     L     L     H            L     L      H
L     L     L     L            L     L      L

BLE# asserted when D0-D7 of 16-bit bus is active.
BHE# asserted when D8-D15 of 16-bit bus is active.
A1 low for all even words; A1 high for all odd words.

Key:
x = don't care
H = high voltage level
L = low voltage level
* = a non-occurring pattern of Byte Enables; either none are asserted or the pattern has Byte Enables asserted for non-contiguous bytes

Combinations of BE0#-BE3# that never occur are those in which two or three asserted byte enables are separated by one or more negated byte enables. These combinations are "don't care" conditions in the decoder. A decoder can use the non-occurring BE0#-BE3# combinations to its best advantage.

Figure 10-6 shows an Intel486 processor data bus interface to 16- and 8-bit wide memories. External byte swapping logic is needed on the data lines so that data is supplied to and received from the Intel486 processor on the correct data pins (see Table 10-4).

Figure 10-5.
Logic to Generate A1, BHE# and BLE# for 16-Bit Buses

Figure 10-6. Data Bus Interface to 16- and 8-Bit Memories

10.1.4 DYNAMIC BUS SIZING DURING CACHE LINE FILLS

BS8# and BS16# can be driven during cache line fills. The Intel486 processor will generate enough 8- or 16-bit cycles to fill the cache line. This can be up to sixteen 8-bit cycles.

The external system should assume that all byte enables are active for the first cycle of a cache line fill. The Intel486 processor will generate proper byte enables for subsequent cycles in the line fill. Table 10-6 shows the appropriate A0 (BLE#), A1 and BHE# for the various combinations of the Intel486 processor byte enables on both the first and subsequent cycles of the cache line fill. The "*" marks all combinations of byte enables that will be generated by the Intel486 processor during a cache line fill.

Table 10-6. Generating A0, A1 and BHE# from the Intel486 Processor Byte Enables

                           First Cache Fill Cycle     Any Other Cycle
BE3#  BE2#  BE1#  BE0#     A0    A1    BHE#           A0    A1    BHE#
1     1     1     0        0     0     0              0     0     1
1     1     0     0        0     0     0              0     0     0
1     0     0     0        0     0     0              0     0     0
*0    0     0     0        0     0     0              0     0     0
1     1     0     1        0     0     0              1     0     0
1     0     0     1        0     0     0              1     0     0
*0    0     0     1        0     0     0              1     0     0
1     0     1     1        0     0     0              0     1     1
*0    0     1     1        0     0     0              0     1     0
*0    1     1     1        0     0     0              1     1     0

10.1.5 OPERAND ALIGNMENT

Physical 4-byte words begin at addresses that are multiples of four. It is possible to transfer a logical operand that spans more than one physical 4-byte word of memory or I/O at the expense of extra cycles. Examples are 4-byte operands beginning at addresses that are not evenly divisible by 4, or 2-byte words split between two physical 4-byte words. These are referred to as unaligned transfers.

Operand alignment and data bus size dictate when multiple bus cycles are required. Table 10-7 describes the transfer cycles generated for all combinations of logical operand lengths, alignment, and data bus sizing. When multiple cycles are required to transfer a multibyte logical operand, the highest-order bytes are transferred first. For example, when the processor does a 4-byte unaligned read beginning at location x11 in the 4-byte aligned space, the three high-order bytes are read in the first bus cycle. The low byte is read in a subsequent bus cycle.

Table 10-7. Transfer Bus Cycles for Bytes, Words and Dwords

Byte-Length   Physical Byte     Transfer Cycles    Transfer Cycles       Transfer Cycles
of Logical    Address in        over 32-Bit Bus    over 16-Bit Bus       over 8-Bit Bus
Operand       Memory (Low                          († = BS16#            (‡ = BS8#
              Order Bits)                          asserted)             asserted)
1             xx                b                  b                     b
2             00                w                  w                     lb‡, hb‡
2             01                w                  lb†, hb†              lb‡, hb‡
2             10                w                  w                     lb‡, hb‡
2             11                hb, lb             hb, lb                hb, lb
4             00                d                  lw†, hw†              lb‡, mlb‡, mhb‡, hb‡
4             01                hb, l3             hb, lb†, mw†          hb, lb‡, mlb‡, mhb‡
4             10                hw, lw             hw, lw                mhb‡, hb‡, lb‡, mlb‡
4             11                h3, lb             mw†, hb†, lb          mlb‡, mhb‡, hb‡, lb

KEY:
b = byte transfer
w = 2-byte transfer
3 = 3-byte transfer
d = 4-byte transfer
h = high-order portion
l = low-order portion
m = mid-order portion
4-byte operand: lb, mlb, mhb, hb, where lb is the byte with the lowest address and hb is the byte with the highest address.

The function of unaligned transfers with dynamic bus sizing is not obvious. When the external system asserts BS16# or BS8#, forcing extra cycles, low-order bytes or words are transferred first (opposite to the example above). When the Intel486 processor requests a 4-byte read and the external system asserts BS16#, the lower 2 bytes are read first, followed by the upper 2 bytes.

In the unaligned transfer described above, the processor requested three bytes on the first cycle. If the external system asserted BS16# during this 3-byte transfer, the lower word is transferred first, followed by the upper byte.
In the final cycle the lower byte of the 4-byte operand is transferred as in the 32-bit example above.

10.2 Bus Functional Description

The Intel486 processor supports a wide variety of bus transfers to meet the needs of high performance systems. Bus transfers can be single cycle or multiple cycle, burst or non-burst, cacheable or non-cacheable, 8-, 16- or 32-bit, and pseudo-locked. To support multiprocessing systems there are cache invalidation cycles and locked cycles.

This section begins with basic non-cacheable non-burst single cycle transfers. It moves on to multiple cycle transfers and introduces the burst mode. Cacheability is introduced in section 10.2.3, "Cacheable Cycles." The remaining sections describe locked, pseudo-locked, invalidate, bus hold and interrupt cycles.

Bus cycles and data cycles are discussed in this section. A bus cycle is at least two clocks long and begins with ADS# active in the first clock and ready active in the last clock. Data is transferred to or from the Intel486 processor during a data cycle. A bus cycle contains one or more data cycles.

Refer to section 10.2.13, "Bus States," for a description of the bus states shown in the timing diagrams.

10.2.1 NON-CACHEABLE NON-BURST SINGLE CYCLE

10.2.1.1 No Wait States

The fastest non-burst bus cycle that the Intel486 processor supports is two clocks long. These cycles are called 2-2 cycles because reads and writes take two cycles each. The first "2" refers to reads and the second to writes. For example, if a wait state needs to be added to the write, the cycle would be called 2-3.

Basic two clock read and write cycles are shown in Figure 10-7. The Intel486 processor initiates a cycle by asserting the address status signal (ADS#) at the rising edge of the first clock. The ADS# output indicates that a valid bus cycle definition and address is available on the cycle definition lines and address bus.
Figure 10-7. Basic 2-2 Bus Cycle

The non-burst ready input (RDY#) is returned by the external system in the second clock. RDY# indicates that the external system has presented valid data on the data pins in response to a read or the external system has accepted data in response to a write.

The Intel486 processor samples RDY# at the end of the second clock. The cycle is complete if RDY# is active (LOW) when sampled. Note that RDY# is ignored at the end of the first clock of the bus cycle.

The burst last signal (BLAST#) is asserted (LOW) by the Intel486 processor during the second clock of the first cycle in all bus transfers illustrated in Figure 10-7. This indicates that each transfer is complete after a single cycle. The Intel486 processor asserts BLAST# in the last cycle of a bus transfer.

The timing of the parity check output (PCHK#) is shown in Figure 10-7. The Intel486 processor drives the PCHK# output one clock after ready terminates a read cycle. PCHK# indicates the parity status for the data sampled at the end of the previous clock. The PCHK# signal can be used by the external system. The Intel486 processor does nothing in response to the PCHK# output.

10.2.1.2 Inserting Wait States

The external system can insert wait states into the basic 2-2 cycle by driving RDY# inactive at the end of the second clock. RDY# must be driven inactive to insert a wait state. Figure 10-8 illustrates a simple non-burst, non-cacheable signal with one wait state added. Any number of wait states can be added to an Intel486 processor bus cycle by maintaining RDY# inactive.
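The RDY# sampling behavior described in this section can be modeled with a short function. The Python sketch below is illustrative only: the function name and the list-of-samples interface are hypothetical, and RDY# levels are given as 0 (active, LOW) or 1 (inactive) as seen at the end of each clock, starting with the second clock of the bus cycle.

```python
def bus_cycle_clocks(rdy_samples):
    """Length in clocks of a non-burst bus cycle.

    rdy_samples: RDY# levels sampled at the end of the second clock and of each
    following clock (0 = active). RDY# is not sampled in the first clock, in which
    ADS# is asserted, so that clock is always counted.
    """
    clocks = 1                      # first clock: ADS# active, RDY# ignored
    for level in rdy_samples:
        clocks += 1
        if level == 0:              # RDY# sampled active: the cycle is complete
            return clocks
    raise ValueError("RDY# was never returned active")


def wait_states(rdy_samples):
    """Number of wait states added to the basic two-clock cycle."""
    return bus_cycle_clocks(rdy_samples) - 2
```

A cycle in which RDY# is active at the end of the second clock takes two clocks; holding RDY# inactive for one sample, as in a 2-3 write, gives a three-clock cycle with one wait state.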
The burst ready input (BRDY#) must be driven inactive on all clock edges where RDY# is driven inactive for proper operation of these simple non-burst cycles.

10.2.2 MULTIPLE AND BURST CYCLE BUS TRANSFERS

Multiple cycle bus transfers can be caused by internal requests from the Intel486 processor or by the external memory system. An internal request for a 128-bit pre-fetch must take more than one cycle. Internal requests for unaligned data may also require multiple bus cycles. A cache line fill requires multiple cycles to complete.

Internal Requests from Intel486 DX, IntelDX2, and IntelDX4 Processors

An internal request by an Intel486 DX, IntelDX2, or IntelDX4 processor for a 64-bit floating point load must take more than one internal cycle.

The external system can cause a multiple cycle transfer when it can only supply 8 or 16 bits per cycle.

Only multiple cycle transfers caused by internal requests are considered in this section. Cacheable cycles and 8- and 16-bit transfers are covered in section 10.2.3, "Cacheable Cycles," and section 10.2.5, "8- and 16-Bit Cycles."

10.2.2.1 Burst Cycles

The Intel486 processor can accept burst cycles for any bus requests that require more than a single data cycle. During burst cycles, a new data item is strobed into the Intel486 processor every clock rather than every other clock as in non-burst cycles. The fastest burst cycle requires 2 clocks for the first data item, with subsequent data items returned every clock.

The Intel486 processor is capable of bursting a maximum of 32 bits during a write. Burst writes can only occur if BS8# or BS16# is asserted. For example, the Intel486 processor can burst write four 8-bit operands or two 16-bit operands in a single burst cycle. But the Intel486 processor cannot burst multiple 32-bit writes in a single burst cycle.

Burst cycles begin with the Intel486 processor driving out an address and asserting ADS# in the same manner as non-burst cycles. The Intel486 processor indicates that it is willing to perform a burst cycle by holding the burst last signal (BLAST#) inactive in the second clock of the cycle. The external system indicates its willingness to do a burst cycle by returning the burst ready signal (BRDY#) active.

The addresses of the data items in a burst cycle will all fall within the same 16-byte aligned area (corresponding to an internal Intel486 processor cache line). A 16-byte aligned area begins at location XXXXXXX0 and ends at location XXXXXXXF. During a burst cycle, only BE0-3#, A2, and A3 may change. A4-A31, M/IO#, D/C#, and W/R# will remain stable throughout a burst. Given the first address in a burst, external hardware can easily calculate the address of subsequent transfers in advance. An external memory system can be designed to quickly fill the Intel486 processor internal cache lines.

Burst cycles are not limited to cache line fills. Any multiple cycle read request by the Intel486 processor can be converted into a burst cycle. The Intel486 processor will only burst the number of bytes needed to complete a transfer. For example, the Intel486 DX, IntelDX2, Write-Back Enhanced IntelDX2 or IntelDX4 processor will burst eight bytes for a 64-bit floating point non-cacheable read.

The external system converts a multiple cycle request into a burst cycle by returning BRDY# active rather than RDY# (non-burst ready) in the first cycle of a transfer. For cycles that cannot be burst, such as interrupt acknowledge and halt, BRDY# has the same effect as RDY#. BRDY# is ignored if both BRDY# and RDY# are returned in the same clock. Memory areas and peripheral devices that cannot perform bursting must terminate cycles with RDY#.

10.2.2.2 Terminating Multiple and Burst Cycle Transfers

The Intel486 processor drives BLAST# inactive for all but the last cycle in a multiple cycle transfer.
BLAST# is driven inactive in the first cycle to inform the external system that the transfer could take additional cycles. BLAST# is driven active in the last cycle of the transfer, indicating that the next time BRDY# or RDY# is returned the transfer is complete. BLAST# is not valid in the first clock of a bus cycle. It should be sampled only in the second and subsequent clocks when RDY# or BRDY# is returned.

The number of cycles in a transfer is a function of several factors, including the number of bytes the Intel486 processor needs to complete an internal request (1, 2, 4, 8, or 16), the state of the bus size inputs (BS8# and BS16#), the state of the cache enable input (KEN#) and the alignment of the data to be transferred.

When the Intel486 processor initiates a request it knows how many bytes will be transferred and if the data is aligned. The external system must indicate whether the data is cacheable (if the transfer is a read) and the width of the bus by returning the state of the KEN#, BS8# and BS16# inputs one clock before RDY# or BRDY# is returned. The Intel486 processor determines how many cycles a transfer will take based on its internal information and inputs from the external system.

BLAST# is not valid in the first clock of a bus cycle because the Intel486 processor cannot determine the number of cycles a transfer will take until the external system returns KEN#, BS8# and BS16#. BLAST# should only be sampled in the second and subsequent clocks of a cycle when the external system returns RDY# or BRDY#.

The system may terminate a burst cycle by returning RDY# instead of BRDY#. BLAST# will remain de-asserted until the last transfer. However, any transfers required to complete a cache line fill will follow the burst order; e.g., if the burst order was 4, 0, C, 8 and RDY# was returned after 0, the next transfers will be from C and 8.
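The remark that external hardware can calculate burst addresses in advance can be illustrated with a short sketch. The Python fragment below assumes a 32-bit bus and an exclusive-OR rule on address bits A2 and A3; that rule is an assumption made here because it reproduces the 4, 0, C, 8 example above, and the authoritative burst order is the one given in section 10.2.4. The function names are hypothetical.

```python
def burst_addresses(first_addr):
    """Addresses of the four 32-bit transfers of a line fill, in burst order.

    Assumes a 32-bit bus and that later addresses are formed by XOR-ing the
    starting dword offset with 0, 4, 8 and C within the 16-byte aligned line.
    """
    line_base = first_addr & ~0xF          # A4-A31 stay stable during the burst
    start = first_addr & 0xC               # only A2 and A3 change
    return [line_base | (start ^ step) for step in (0x0, 0x4, 0x8, 0xC)]


def remaining_transfers(first_addr, completed):
    """Addresses still to be fetched after `completed` transfers have finished."""
    return burst_addresses(first_addr)[completed:]
```

For a fill that starts at offset 4, the sketch yields offsets 4, 0, C, 8; if the burst is interrupted after the second transfer, the remaining transfers are those at offsets C and 8, as in the example above.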
10.2.2.3 Non-Cacheable, Non-Burst, Multiple Cycle Transfers

Figure 10-9 illustrates a 2 cycle non-burst, non-cacheable multiple cycle read. This transfer is simply a sequence of two single cycle transfers. The Intel486 processor indicates to the external system that this is a multiple cycle transfer by driving BLAST# inactive during the second clock of the first cycle. The external system returns RDY# active, indicating that it will not burst the data. The external system also indicates that the data is not cacheable by returning KEN# inactive one clock before it returns RDY# active. When the Intel486 processor samples RDY# active it ignores BRDY#.

Each cycle in the transfer begins when ADS# is driven active, and the cycle is complete when the external system returns RDY# active.

The Intel486 processor indicates the last cycle of the transfer by driving BLAST# active. The next RDY# returned by the external system terminates the transfer.

10.2.2.4 Non-Cacheable Burst Cycles

The external system converts a multiple cycle request into a burst cycle by returning BRDY# active rather than RDY# in the first cycle of the transfer. This is illustrated in Figure 10-10.

There are several features to note in the burst read. ADS# is only driven active during the first cycle of the transfer. RDY# must be driven inactive when BRDY# is returned active.

BLAST# behaves exactly as it does in the non-burst read. BLAST# is driven inactive in the second clock of the first cycle of the transfer, indicating more cycles to follow. In the last cycle, BLAST# is driven active, telling the external memory system to end the burst after returning the next BRDY#.

Figure 10-9. Non-Cacheable, Non-Burst, Multiple-Cycle Transfers

Figure 10-10. Non-Cacheable Burst Cycle

10.2.3 CACHEABLE CYCLES

Any memory read can become a cache fill operation. The external memory system can allow a read request to fill a cache line by returning KEN# active one clock before RDY# or BRDY# during the first cycle of the transfer on the external bus. Once KEN# is asserted and the remaining three requirements described below are met, the Intel486 processor will fetch an entire cache line regardless of the state of KEN#. KEN# must be returned active in the last cycle of the transfer for the data to be written into the internal cache. The Intel486 processor will only convert memory reads or prefetches into a cache fill.

KEN# is ignored during write or I/O cycles. Memory writes will only be stored in the on-chip cache if there is a cache hit. I/O space is never cached in the internal cache.

To transform a read or a prefetch into a cache line fill the following conditions must be met:

1. The KEN# pin must be asserted one clock prior to RDY# or BRDY# being returned for the first data cycle.
2. The cycle must be of the type that can be internally cached. (Locked reads, I/O reads, and interrupt acknowledge cycles are never cached.)
3. The page table entry must have the page cache disable bit (PCD) set to 0. To cache a page table entry, the page directory must have PCD = 0. To cache reads or prefetches when paging is disabled, or to cache the page directory entry, control register 3 (CR3) must have PCD = 0.
4. The cache disable (CD) bit in control register 0 (CR0) must be clear.

External hardware can determine when the Intel486 processor has transformed a read or prefetch into a cache fill by examining the KEN#, M/IO#, D/C#, W/R#, LOCK#, and PCD pins. These pins convey to the system the outcome of conditions 1-3 in the above list.
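The four conditions for a cache line fill can be combined into a single predicate. The Python sketch below is only a restatement of the list above: the `ReadCycle` type, its field names and the cycle-kind strings are hypothetical and chosen for this example, and the single `pcd` field stands for whichever PCD bit applies to the access (page table entry, page directory entry or CR3).

```python
from dataclasses import dataclass


@dataclass
class ReadCycle:
    """Hypothetical description of one bus request, for illustration only."""
    kind: str                 # "memory_read", "prefetch", "io_read", "interrupt_ack", ...
    locked: bool              # True for a locked read
    ken_before_ready: bool    # KEN# asserted one clock before the first RDY#/BRDY#
    pcd: int                  # applicable page cache disable bit (0 or 1)
    cr0_cd: int               # cache disable bit in CR0 (0 or 1)


def becomes_line_fill(cycle):
    """True if all four conditions for transforming the read into a line fill hold."""
    cacheable_type = cycle.kind in ("memory_read", "prefetch") and not cycle.locked
    return (
        cycle.ken_before_ready      # condition 1
        and cacheable_type          # condition 2
        and cycle.pcd == 0          # condition 3
        and cycle.cr0_cd == 0       # condition 4
    )
```

Any single failing condition, for example a locked read or PCD = 1, leaves the request as an ordinary non-cacheable read.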
In addition, the Intel486 processor drives PCD high whenever the CD bit in CR0 is set, so that external hardware can evaluate condition 4.

Cacheable cycles can be burst or non-burst.

10.2.3.1 Byte Enables during a Cache Line Fill

For the first cycle in the line fill, the state of the byte enables should be ignored. In a non-cacheable memory read, the byte enables indicate the bytes actually required by the memory or code fetch.

The Intel486 processor expects to receive valid data on its entire bus (32 bits) in the first cycle of a cache line fill. Data should be returned with the assumption that all the byte enable pins are driven active. However, if BS8# is asserted only one byte need be returned on data lines D0-D7. Similarly, if BS16# is asserted two bytes should be returned on D0-D15.

The Intel486 processor will generate the addresses and byte enables for all subsequent cycles in the line fill. The order in which data is read during a line fill depends on the address of the first item read. Byte ordering is discussed in section 10.2.4, "Burst Mode Details."

10.2.3.2 Non-Burst Cacheable Cycles

Figure 10-11 shows a non-burst cacheable cycle. The cycle becomes a cache fill when the Intel486 processor samples KEN# active at the end of the first clock. The Intel486 processor drives BLAST# inactive in the second clock in response to KEN#. BLAST# is driven inactive because a cache fill requires 3 additional cycles to complete. BLAST# remains inactive until the last transfer in the cache line fill. KEN# must be returned active in the last cycle of the transfer for the data to be written into the internal cache.

During the first cycle of the cache line fill the external system should treat the byte enables as if they are all active. In subsequent cycles in the burst, the Intel486 processor drives the address lines and byte enables. (See section 10.2.4.2, "Burst and Cache Line Fill Order.")

Note that this cycle would be a single bus cycle if KEN# was not sampled active at the end of the first clock. The subsequent three reads would not have happened since a cache fill was not requested.

The BLAST# output is invalid in the first clock of a cycle. BLAST# may be active during the first clock due to earlier inputs. Ignore BLAST# until the second clock.

Figure 10-11. Non-Burst, Cacheable Cycles

10.2.3.3 Burst Cacheable Cycles

Figure 10-12 illustrates a burst mode cache fill. As in Figure 10-11, the transfer becomes a cache line fill when the external system returns KEN# active at the end of the first clock in the cycle.

The external system informs the Intel486 processor that it will burst the line in by driving BRDY# active at the end of the first cycle in the transfer.

Note that during a burst cycle, ADS# is only driven with the first address.

Figure 10-12. Burst Cacheable Cycle

10.2.3.4 Effect of Changing KEN# during a Cache Line Fill

KEN# can change multiple times as long as it arrives at its final value in the clock before RDY# or BRDY# is returned. This is illustrated in Figure 10-13. Note that the timing of BLAST# follows that of KEN# by one clock. The Intel486 processor samples KEN# every clock and uses the value returned in the clock before ready to determine if a bus cycle would be a cache line fill. Similarly, it uses the value of KEN# in the last cycle before early RDY# to load the line just retrieved from memory into the cache. KEN# is sampled every clock and it must satisfy setup and hold time.
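The sampling rule can be illustrated with a small trace model. This is a sketch under stated assumptions: each clock is represented as a pair of booleans (KEN# asserted, ready asserted), where "ready" stands for either RDY# or BRDY#, and the function name is hypothetical. It returns the KEN# value from the clock immediately before the first ready, which is the value the text says decides whether the cycle becomes a line fill.

```python
from typing import Optional, Sequence, Tuple


def ken_at_decision(trace: Sequence[Tuple[bool, bool]]) -> Optional[bool]:
    """Return KEN# as sampled in the clock before the first ready.

    trace holds one (ken_asserted, ready_asserted) pair per clock,
    starting with the first clock of the bus cycle. Returns None if
    ready never arrives or arrives in the very first sampled clock.
    """
    for clock, (_, ready) in enumerate(trace):
        if ready:
            return trace[clock - 1][0] if clock > 0 else None
    return None


# KEN# may toggle freely; only the clock before ready matters.
glitchy = [(True, False), (False, False), (True, False), (True, True)]
late_drop = [(True, False), (True, False), (False, False), (False, True)]
```

In this model `glitchy` still qualifies as a line fill because KEN# has settled active by the clock before ready, while `late_drop` does not.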
KEN# can also change multiple times before a burst cycle, as long as it arrives at its final value one clock before ready is returned active.

Figure 10-13. Effect of Changing KEN#

10.2.4 BURST MODE DETAILS

10.2.4.1 Adding Wait States to Burst Cycles

Burst cycles need not return data on every clock. The Intel486 processor will only strobe data into the chip when either RDY# or BRDY# is active. Driving BRDY# and RDY# inactive adds a wait state to the transfer. A burst cycle where two clocks are required for every burst item is shown in Figure 10-14.

Figure 10-14. Slow Burst Cycle

10.2.4.2 Burst and Cache Line Fill Order

The burst order used by the Intel486 processor is shown in Table 10-8. This burst order is followed by any burst cycle (cache or not), cache line fill (burst or not) or code prefetch.

The Intel486 processor presents each request for data in an order determined by the first address in the transfer. For example, if the first address was 104 the next three addresses in the burst will be 100, 10C and 108. An example of burst address sequencing is shown in Figure 10-15.

Table 10-8. Burst Order (Both Read and Write Bursts)

First Addr.  Second Addr.  Third Addr.  Fourth Addr.
0            4             8            C
4            0             C            8
8            C             0            4
C            8             4            0

Figure 10-15.
Burst Cycle Showing Order of Addresses

The sequences shown in Table 10-8 accommodate systems with 64-bit buses as well as systems with 32-bit data buses. The sequence applies to all bursts, regardless of whether the purpose of the burst is to fill a cache line, do a 64-bit read, or do a prefetch. If either BS8# or BS16# is returned active, the Intel486 processor completes the transfer of the current 32-bit word before progressing to the next 32-bit word. For example, a BS16# burst to address 4 has the following order: 4-6-0-2-C-E-8-A.

10.2.4.3 Interrupted Burst Cycles

Some memory systems may not be able to respond with burst cycles in the order defined in Table 10-8. To support these systems the Intel486 processor allows a burst cycle to be interrupted at any time. The Intel486 processor will automatically generate another normal bus cycle after being interrupted to complete the data transfer. This is called an interrupted burst cycle. The external system can respond to an interrupted burst cycle with another burst cycle.

The external system can interrupt a burst cycle by returning RDY# instead of BRDY#. RDY# can be returned after any number of data cycles terminated with BRDY#.

An example of an interrupted burst cycle is shown in Figure 10-16. The Intel486 processor immediately drives ADS# active to initiate a new bus cycle after RDY# is returned active. BLAST# is driven inactive one clock after ADS# begins the second bus cycle, indicating that the transfer is not complete.

Figure 10-16. Interrupted Burst Cycle

KEN# need not be returned active in the first data cycle of the second part of the transfer in Figure 10-16.
The cycle had been converted to a cache fill in the first part of the transfer and the Intel486 processor expects the cache fill to be completed. Note that the first half and second half of the transfer in Figure 10-16 are each two cycle burst transfers.

The order in which the Intel486 processor requests operands during an interrupted burst transfer is determined by Table 10-8. Mixing RDY# and BRDY# does not change the order in which operand addresses are requested by the Intel486 processor.

An example of the order in which the Intel486 processor requests operands during a cycle in which the external system mixes RDY# and BRDY# is shown in Figure 10-17. The Intel486 processor initially requests a transfer beginning at location 104. The transfer becomes a cache line fill when the external system returns KEN# active. The first cycle of the cache fill transfers the contents of location 104 and is terminated with RDY#. The Intel486 processor drives out a new request (by asserting ADS#) to address 100. If the external system terminates the second cycle with BRDY#, the Intel486 processor will next request/expect address 10C. The correct order is determined by the first cycle in the transfer, which may not be the first cycle in the burst if the system mixes RDY# with BRDY#.

Figure 10-17. Interrupted Burst Cycle with Unobvious Order of Addresses

10.2.5 8- AND 16-BIT CYCLES

The Intel486 processor supports both 16- and 8-bit external buses through the BS16# and BS8# inputs. BS16# and BS8# allow the external system to specify, on a cycle by cycle basis, whether the addressed component can supply 8, 16 or 32 bits. BS16# and BS8# can be used in burst cycles as well as non-burst cycles. If both BS16# and BS8# are returned active for any bus cycle, the Intel486 processor will respond as if only BS8# were active.

The timing of BS16# and BS8# is the same as that of KEN#. BS16# and BS8# must be driven active before the first RDY# or BRDY# is driven active. Driving BS16# or BS8# active can force the Intel486 processor to run additional cycles to complete what would have been only a single 32-bit cycle. BS8# and BS16# may change the state of BLAST# when they force subsequent cycles from the transfer.

Figure 10-18 shows an example in which BS8# forces the Intel486 processor to run two extra cycles to complete a transfer. The Intel486 processor issues a request for 24 bits of information. The external system drives BS8# active indicating that only eight bits of data can be supplied per cycle. The Intel486 processor issues two extra cycles to complete the transfer.

Figure 10-18. 8-Bit Bus Size Cycle

Extra cycles forced by BS16# and BS8# should be viewed as independent bus cycles. BS16# and BS8# should be driven active for each additional cycle unless the addressed device has the ability to change the number of bytes it can return between cycles. The Intel486 processor will drive BLAST# inactive until the last cycle before the transfer is complete.

BS8# and BS16# operate during burst cycles in exactly the same manner as non-burst cycles. For example, a single non-cacheable read could be transferred by the Intel486 processor as four 8-bit burst data cycles. Similarly, a single 32-bit write could be written as four 8-bit burst data cycles. An example of a burst write is shown in Figure 10-19. Burst writes can only occur if BS8# or BS16# is asserted.
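As an illustration of how bus sizing interacts with the burst order of Table 10-8 (section 10.2.4.2), the sketch below lists the addresses of a four-dword burst. It relies on the observation that each row of Table 10-8 equals the first address XORed with 0, 4, 8 and C; that shortcut, the function name and the `width_bytes` parameter are assumptions of this example, not part of the datasheet. It also assumes a dword-aligned transfer such as a cache line fill.

```python
from typing import List


def burst_addresses(first_addr: int, width_bytes: int = 4) -> List[int]:
    """List the addresses of a four-dword burst in Table 10-8 order.

    width_bytes is the width the external system reports:
    4 (neither pin asserted), 2 (BS16#) or 1 (BS8#).
    """
    if width_bytes not in (1, 2, 4):
        raise ValueError("width_bytes must be 1, 2 or 4")
    base = first_addr & ~0x3           # dword containing the first address
    order: List[int] = []
    for offset in (0x0, 0x4, 0x8, 0xC):
        dword = base ^ offset          # reproduces each row of Table 10-8
        # The current 32-bit word is completed before moving on.
        order.extend(dword + i for i in range(0, 4, width_bytes))
    return order


print([hex(a) for a in burst_addresses(0x104)])
# ['0x104', '0x100', '0x10c', '0x108']
print([hex(a) for a in burst_addresses(0x4, width_bytes=2)])
# ['0x4', '0x6', '0x0', '0x2', '0xc', '0xe', '0x8', '0xa']
```

The two printed sequences match the examples given in the text: 104, 100, 10C, 108 for a 32-bit burst, and 4-6-0-2-C-E-8-A for a BS16# burst to address 4.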
Refer to section 10.1.2, "Dynamic Data Bus Sizing," for the sequencing of addresses while BS8# or BS16# are active.

Figure 10-19. Burst Write as a Result of BS8# or BS16#

10.2.6 LOCKED CYCLES

Locked cycles are generated in software for any instruction that performs a read-modify-write operation. During a read-modify-write operation the Intel486 processor can read and modify a variable in external memory and be assured that the variable is not accessed between the read and write.

Locked cycles are automatically generated during certain bus transfers. The XCHG (exchange) instruction generates a locked cycle when one of its operands is memory based. Locked cycles are generated when a segment or page table entry is updated and during interrupt acknowledge cycles. Locked cycles are also generated when the LOCK instruction prefix is used with selected instructions.

Locked cycles are implemented in hardware with the LOCK# pin. When LOCK# is active, the Intel486 processor is performing a read-modify-write operation and the external bus should not be relinquished until the cycle is complete. Multiple reads or writes can be locked. A locked cycle is shown in Figure 10-20. LOCK# goes active with the address and bus definition pins at the beginning of the first read cycle and remains active until RDY# is returned for the last write cycle. For an unaligned 32-bit read-modify-write operation, LOCK# remains active for the entire duration of the multiple cycle. It will go inactive when RDY# is returned for the last write cycle.

When LOCK# is active, the Intel486 processor will recognize address hold and backoff but will not recognize bus hold. It is left to the external system to properly arbitrate a central bus when the Intel486 processor generates LOCK#.

Figure 10-20. Locked Bus Cycle

10.2.7 PSEUDO-LOCKED CYCLES

Pseudo-locked cycles assure that no other master will be given control of the bus during operand transfers which take more than one bus cycle. For the Intel486 processor, examples include 64-bit descriptor loads and cache line fills.

Pseudo-locked transfers are indicated by the PLOCK# pin. The memory operands must be aligned for correct operation of a pseudo-locked cycle.

PLOCK# need not be examined during burst reads. A 64-bit aligned operand can be retrieved in one burst (note: this is only valid in systems that do not interrupt bursts). PLOCK# can change several times during a cycle, settling to its final value in the clock in which ready is returned.

The system must examine PLOCK# during 64-bit writes since the Intel486 processor cannot burst write more than 32 bits. However, burst can be used within each 32-bit write cycle if BS8# or BS16# is asserted. BLAST# will be de-asserted in response to BS8# or BS16#. A 64-bit write will be driven out as two non-burst bus cycles. BLAST# is asserted during both writes since a burst is not possible. PLOCK# is asserted during the first write to indicate that another write follows. This behavior is shown in Figure 10-21.

During all of the cycles where PLOCK# is asserted, HOLD is not acknowledged until the cycle completes. This results in a large HOLD latency, especially when BS8# or BS16# is asserted. To reduce the HOLD latency during these cycles, windows are available between transfers to allow HOLD to be acknowledged during non-cacheable code prefetches. PLOCK# will be asserted since BLAST# is negated, but it is ignored and HOLD is recognized during the prefetch.

Figure 10-21. Pseudo Lock Timing

10.2.7.1 Floating Point Read and Write Cycles

For Intel486 DX, IntelDX2, Write-Back Enhanced IntelDX2, and IntelDX4 processors, 64-bit floating point read and write cycles are also examples of operand transfers that take more than one bus cycle.

The first cycle of a 64-bit floating point write is the only case in which both PLOCK# and BLAST# are asserted. Normally PLOCK# and BLAST# are the inverse of each other.

10.2.8 INVALIDATE CYCLES

Invalidate cycles are needed to keep the Intel486 processor internal cache contents consistent with external memory. The Intel486 processor contains a mechanism for listening to writes by other devices to external memory. When the Intel486 processor finds a write to a section of external memory contained in its internal cache, the Intel486 processor's internal copy is invalidated.

Invalidations use two pins, address hold request (AHOLD) and valid external address (EADS#). There are two steps in an invalidation cycle. First, the external system asserts the AHOLD input forcing the Intel486 processor to immediately relinquish its address bus. Next, the external system asserts EADS# indicating that a valid address is on the Intel486 processor address bus. Figure 10-22 shows the fastest possible invalidation cycle. The Intel486 processor recognizes AHOLD on one CLK edge and floats the address bus in response. To allow the address bus to float and avoid contention, EADS# and the invalidation address should not be driven until the following CLK edge. The Intel486 processor reads the address over its address lines.
If the Intel486 processor finds this address in its internal cache, the cache entry is invalidated. Note that the Intel486 processor address bus is input/output, unlike the Intel386 processor's bus, which is output only.

The Intel486 processor immediately relinquishes its address bus in the next clock upon assertion of AHOLD. For example, the bus could be 3 wait states into a read cycle. If AHOLD is activated, the Intel486 processor will immediately float its address bus before ready is returned terminating the bus cycle.

When AHOLD is asserted only the address bus is floated; the data bus can remain active. Data can be returned for a previously specified bus cycle during address hold. (See Figure 10-22 and Figure 10-23.)

EADS# is normally asserted when an external master drives an address onto the bus. AHOLD need not be driven for EADS# to generate an internal invalidate. If EADS# alone is asserted while the Intel486 processor is driving the address bus, it is possible that the invalidation address will come from the Intel486 processor itself.

Note that it is also possible to run an invalidation cycle by asserting EADS# when BOFF# is asserted or after HLDA has been returned, following the assertion of HOLD.

Running an invalidate cycle prevents the Intel486 processor cache from satisfying other internal requests, so invalidations should be run only when necessary. The fastest possible invalidate cycle is shown in Figure 10-22, while a more realistic invalidation cycle is shown in Figure 10-23. Both of the examples take one clock of cache access from the Intel486 processor.

Figure 10-22. Fast Internal Cache Invalidation Cycle
Figure 10-23. Typical Internal Cache Invalidation Cycle

10.2.8.1 Rate of Invalidate Cycles

The Intel486 processor can accept one invalidate per clock except in the last clock of a line fill. One invalidate per clock is possible as long as EADS# is negated in ONE or BOTH of the following cases:

1. In the clock RDY# or BRDY# is returned for the last time.
2. In the clock following RDY# or BRDY# being returned for the last time.

This definition allows two system designs. Simple designs can restrict invalidates to one every other clock. The simple design need not track bus activity. Alternatively, systems can request one invalidate per clock provided that the bus is monitored.

10.2.8.2 Running Invalidate Cycles Concurrently with Line Fills

Precautions are necessary to avoid caching stale data in the Intel486 processor cache in a system with a second level cache. An example of a system with a second level cache is shown in Figure 10-24. An external device can be writing to main memory over the system bus while the Intel486 processor is retrieving data from the second level cache. The Intel486 processor will need to invalidate a line in its internal cache if the external device is writing to a main memory address also contained in the Intel486 processor cache.

A potential problem exists if the external device is writing to an address in external memory, and at the same time the Intel486 processor is reading data from the same address in the second level cache. The system must force an invalidation cycle to invalidate the data that the Intel486 processor has requested during the line fill.

If the system asserts EADS# before the first data in the line fill is returned to the Intel486 processor, the system must return data consistent with the new data in the external memory upon resumption of the line fill after the invalidation cycle.
This is illustrated by the asserted EADS# signal labeled 1 in Figure 10-25.

If the system asserts EADS# at the same time or after the first data in the line fill is returned (in the same clock that the first RDY# or BRDY# is returned or any subsequent clock in the line fill) the data will be read into the Intel486 processor input buffers but it will not be stored in the on-chip cache. This is illustrated by the asserted EADS# signal labeled 2 in Figure 10-25. The stale data will be used to satisfy the request that initiated the cache fill cycle.

Figure 10-24. System with Second Level Cache

10.2.9 BUS HOLD

The Intel486 processor provides a bus hold, hold acknowledge protocol using the bus hold request (HOLD) and bus hold acknowledge (HLDA) pins. Asserting the HOLD input indicates that another bus master desires control of the Intel486 processor bus. The Intel486 processor will respond by floating its bus and driving HLDA active when the current bus cycle, or sequence of locked cycles, is complete. An example of a HOLD/HLDA transaction is shown in Figure 10-26. Unlike the Intel386 processor, the Intel486 processor can respond to HOLD by floating its bus and asserting HLDA while RESET is asserted.

Note that HOLD will be recognized during unaligned writes (less than or equal to 32 bits) with BLAST# being active for each write. For a greater than 32-bit or unaligned write, HOLD recognition is prevented by PLOCK# being asserted. However, HOLD is recognized during non-cacheable, non-burstable code prefetches even though PLOCK# is active.

NOTES:
1. Data returned must be consistent if its address equals the invalidation address in this clock.
2.
Data returned will not be cached if its address equals the invalidation address in this clock.

Figure 10-25. Cache Invalidation Cycle Concurrent with Line Fill

For cacheable and non-bursted or bursted cycles, HOLD is acknowledged during backoff only if HOLD and BOFF# are asserted during an active bus cycle (after ADS# is asserted) and before the first RDY# or BRDY# has been returned (see Figure 10-27). The order in which HOLD and BOFF# go active is unimportant (so long as both are active prior to the first RDY#/BRDY# returned by the system). Figure 10-27 shows the case where HOLD is asserted first; HOLD could be asserted simultaneously or after BOFF# and still be acknowledged.

The pins floated during bus hold are: BE0#-BE3#, PCD, PWT, W/R#, D/C#, M/IO#, LOCK#, PLOCK#, ADS#, BLAST#, D0-D31, A2-A31, DP0-DP3.

Figure 10-26. HOLD/HLDA Cycles

Figure 10-27. HOLD Request Acknowledged during BOFF#

10.2.10 INTERRUPT ACKNOWLEDGE

The Intel486 processor generates interrupt acknowledge cycles in response to maskable interrupt requests generated on the interrupt request input (INTR) pin. Interrupt acknowledge cycles have a unique cycle type generated on the cycle type pins.

An example of an interrupt acknowledge transaction is shown in Figure 10-28. Interrupt acknowledge cycles are generated in locked pairs. Data returned during the first cycle is ignored. The interrupt vector is returned during the second cycle on the lower 8 bits of the data bus. The Intel486 processor has 256 possible interrupt vectors.
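A minimal sketch of how a bus model might extract the vector from the locked pair of cycles described above. The function name and the representation of the two returned data values as integers are assumptions made for illustration only.

```python
def interrupt_vector(first_cycle_data: int, second_cycle_data: int) -> int:
    """Return the interrupt vector from a pair of acknowledge cycles.

    Data from the first cycle is ignored; the vector is the low
    8 bits of the data returned during the second cycle.
    """
    del first_cycle_data               # ignored, as the text describes
    return second_cycle_data & 0xFF    # one of 256 possible vectors
```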
The state of A2 distinguishes the first and second interrupt acknowledge cycles. The byte address driven during the first interrupt acknowledge cycle is 4 (A31-A3 low, A2 high, BE3#-BE1# high, and BE0# low). The address driven during the second interrupt acknowledge cycle is 0 (A31-A2 low, BE3#-BE1# high, BE0# low).

Each of the interrupt acknowledge cycles is terminated when the external system returns RDY# or BRDY#. Wait states can be added by withholding RDY# or BRDY#. The Intel486 processor automatically generates four idle clocks between the first and second cycles to allow for 8259A recovery time.

Figure 10-28. Interrupt Acknowledge Cycles

10.2.11 SPECIAL BUS CYCLES

The Intel486 processor provides special bus cycles to indicate that certain instructions have been executed, or certain conditions have occurred internally. The special bus cycles in Table 10-9 are defined when the bus cycle definition pins are in the following state: M/IO# = 0, D/C# = 0 and W/R# = 1. During these cycles the address bus is driven low while the data bus is undefined.

Two of the special cycles indicate halt or shutdown. Another special cycle is generated when the Intel486 processor executes an INVD (invalidate data cache) instruction and could be used to flush an external cache. The Write-Back cycle is generated when the Intel486 processor executes the WBINVD (write-back invalidate data cache) instruction and could be used to synchronize an external write-back cache.

The external hardware must acknowledge these special bus cycles by returning RDY# or BRDY#.
Table 10-9. Special Bus Cycle Encoding

Cycle Name                   M/IO#  D/C#  W/R#  BE3#-BE0#  A4-A2
Write-Back(1)                0      0     1     0111       000
First Flush Ack Cycle(1)     0      0     1     0111       001
Flush(1)                     0      0     1     1101       000
Second Flush Ack Cycle(1)    0      0     1     1101       001
Shutdown                     0      0     1     1110       000
HALT                         0      0     1     1011       000
Stop Grant Ack Cycle(2)      0      0     1     1011       001

NOTES:
1. These cycles are specific to the Write-Back Enhanced IntelDX2 processor. (See section 7.4.1, "Snoop Cycles and Write-Back Invalidation.") The FLUSH# cycle is applicable to all Intel486 processors. See appropriate sections.
2. See section 9.6.1, "Stop Grant Bus Cycle," for details.

10.2.11.1 HALT Indication Cycle

The Intel486 processor halts as a result of executing a HALT instruction. Signaling its entrance into the HALT state, a HALT indication cycle is performed. The HALT indication cycle is identified by the bus definition signals in special bus cycle state and a byte address of 2. BE0# and BE2# are the only signals distinguishing HALT indication from shutdown indication, which drives an address of 0. During the HALT cycle, undefined data is driven on D0-D31. The HALT indication cycle must be acknowledged by RDY# asserted.

A halted Intel486 processor resumes execution when INTR (if interrupts are enabled) or NMI or RESET is asserted.

10.2.11.2 Shutdown Indication Cycle

The Intel486 processor shuts down as a result of a protection fault while attempting to process a double fault. Signaling its entrance into the shutdown state, a shutdown indication cycle is performed. The shutdown indication cycle is identified by the bus definition signals in special bus cycle state and a byte address of 0.

10.2.11.3 Stop Grant Indication Cycle

A special Stop Grant bus cycle will be driven to the bus after the processor recognizes the STPCLK# interrupt. The definition of this bus cycle is the same as the HALT cycle definition for the Intel486 processor, with the exception that the Stop Grant bus cycle drives the value 0000 0010H on the address pins. The system hardware must acknowledge this cycle by returning RDY# or BRDY#. The processor will not enter the Stop Grant state until either RDY# or BRDY# has been returned. (See Figure 10-31.)

The Stop Grant Bus Cycle is defined as follows:

M/IO# = 0, D/C# = 0, W/R# = 1, Address Bus = 0000 0010H (A4 = 1), BE3#-BE0# = 1011, Data bus = undefined.

The latency between a STPCLK# request and the Stop Grant bus cycle is dependent on the current instruction, the amount of data in the processor write buffers, and the system memory performance.

Figure 10-29. Restarted Read Cycle

Figure 10-30. Restarted Write Cycle

Figure 10-31. Stop Grant Bus Cycle

10.2.12 BUS CYCLE RESTART

In a multi-master system another bus master may require the use of the bus to enable the Intel486 processor to complete its current bus request. In this situation the Intel486 processor will need to restart its bus cycle after the other bus master has completed its bus transaction.

A bus cycle may be restarted if the external system asserts the backoff (BOFF#) input. The Intel486 processor samples the BOFF# pin every clock. The Intel486 processor will immediately (in the next clock) float its address, data and status pins when BOFF# is asserted (see Figures 10-29 and 10-34). Any bus cycle in progress when BOFF# is asserted is aborted and any data returned to the processor is ignored. The same pins are floated in response to BOFF# as are floated in response to HOLD. HLDA is not generated in response to BOFF#.
BOFF# has higher priority than RDY# or BRDY#. If either RDY# or BRDY# is returned in the same clock as BOFF#, BOFF# takes effect.

The device asserting BOFF# is free to run any cycles it wants while the Intel486 processor bus is in its high impedance state. If backoff is requested after the Intel486 processor has started a cycle, the new master should wait for memory to return RDY# or BRDY# before assuming control of the bus. Waiting for ready provides a handshake to ensure that the memory system is ready to accept a new cycle. If the bus is idle when BOFF# is asserted, the new master can start its cycle two clocks after issuing BOFF#.

The external memory can view BOFF# in the same manner as BLAST#. Asserting BOFF# tells the external memory system that the current cycle is the last cycle in a transfer.

The bus remains in the high impedance state until BOFF# is negated. Upon negation, the Intel486 processor restarts its bus cycle by driving out the address and status and asserting ADS#. The bus cycle then continues as usual.

Asserting BOFF# during a burst, BS8# or BS16# cycle will force the Intel486 processor to ignore data returned for that cycle only. Data from previous cycles will still be valid. For example, if BOFF# is asserted on the third BRDY# of a burst, the Intel486 processor assumes the data returned with the first and second BRDY# is correct and restarts the burst beginning with the third item. The same rule applies to transfers broken into multiple cycles by BS8# or BS16#.

Asserting BOFF# in the same clock as ADS# will cause the Intel486 processor to float its bus in the next clock and leave ADS# floating low. Because ADS# is floating low, a peripheral may think that a new bus cycle has begun even though the cycle was aborted. There are two possible solutions to this problem. The first is to have all devices recognize this condition and ignore ADS# until ready comes back.
The second approach is to use a "two clock" backoff: in the first clock AHOLD is asserted, and in the second clock BOFF# is asserted. This guarantees that ADS# will not be floating low. This is only necessary in systems where BOFF# may be asserted in the same clock as ADS#.

10.2.13 BUS STATES

A bus state diagram is shown in Figure 10-32. A description of the signals used in the diagram is given in Table 10-10.

Figure 10-32. Bus State Diagram

Transition conditions shown in the diagram:
    (RDY# asserted + (BRDY# · BLAST#) asserted) · (HOLD + AHOLD + no request) · BOFF# negated
    Request pending · (RDY# asserted + (BRDY# · BLAST#) asserted) · HOLD negated · AHOLD negated · BOFF# negated
    Request pending · HOLD negated · AHOLD negated · BOFF# negated
    BOFF# negated
    BOFF# asserted
    AHOLD negated · BOFF# negated · (HOLD negated*)

* HOLD is only factored into this state transition if Tb was entered while a non-cacheable, non-bursted code prefetch was in progress. Otherwise, ignore HOLD.

Table 10-10. Bus State Description

State   Means
Ti      Bus is idle. Address and status signals may be driven to undefined values, or the bus may be floated to a high impedance state.
T1      First clock cycle of a bus cycle. Valid address and status are driven and ADS# is asserted.
T2      Second and subsequent clock cycles of a bus cycle. Data is driven if the cycle is a write, or data is expected if the cycle is a read. RDY# and BRDY# are sampled.
T1b     First clock cycle of a restarted bus cycle. Valid address and status are driven and ADS# is asserted.
Tb      Second and subsequent clock cycles of an aborted bus cycle.

10.2.14 FLOATING POINT ERROR HANDLING FOR THE INTEL486 DX, INTELDX2, AND INTELDX4 PROCESSORS

The Intel486 DX, IntelDX2, and IntelDX4 processors provide two options for reporting floating point errors. The simplest method is to raise interrupt 16 whenever an unmasked floating point error occurs.
This option may be enabled by setting the NE bit in control register 0 (CR0).

The Intel486 DX, IntelDX2, and IntelDX4 processors also provide the option of allowing external hardware to determine how floating point errors are reported. This option is necessary for compatibility with the error reporting scheme used in DOS based systems. The NE bit must be cleared in CR0 to enable user-defined error reporting. User-defined error reporting is the default condition because the NE bit is cleared on reset.

Two pins, floating point error (FERR#) and ignore numeric error (IGNNE#), are provided to direct the actions of hardware if user-defined error reporting is used. The Intel486 DX, IntelDX2, and IntelDX4 processors assert the FERR# output to indicate that a floating point error has occurred. FERR# corresponds to the ERROR# pin on the Intel387™ math coprocessor. However, there is a difference in the behavior of the two. In some cases FERR# is asserted when the next floating point instruction is encountered, and in other cases it is asserted before the next floating point instruction is encountered, depending upon the execution state of the instruction causing the exception.

The following class of floating point exceptions drive FERR# at the time the exception occurs (i.e., before encountering the next floating point instruction).

1. The stack fault, invalid operation, and denormal exceptions on all transcendental instructions, integer arithmetic instructions, FSQRT, FSCALE, FPREM(1), FXTRACT, FBLD, and FBSTP.
2. Any exceptions on store instructions (including integer store instructions).

The following class of floating point exceptions drive FERR# only after encountering the next floating point instruction.

1. Exceptions other than on all transcendental instructions, integer arithmetic instructions, FSQRT, FSCALE, FPREM(1), FXTRACT, FBLD, and FBSTP.
2. Any exception on all basic arithmetic, load, compare, and control instructions (i.e., all other instructions).

For both sets of exceptions above, the Intel387 math coprocessor asserts ERROR# when the error occurs and does not wait for the next floating point instruction to be encountered.

IGNNE# is an input to the Intel486 DX, IntelDX2, and IntelDX4 processors. When the NE bit in CR0 is cleared and IGNNE# is asserted, the Intel486 DX, IntelDX2, and IntelDX4 processors will ignore a user floating point error and continue executing floating point instructions. When IGNNE# is negated, these processors will freeze on floating point instructions which get errors (except for the control instructions FNCLEX, FNINIT, FNSAVE, FNSTENV, FNSTCW, FNSTSW, FNSTSW AX, FNENI, FNDISI and FNSETPM). IGNNE# may be asynchronous to the Intel486 DX, IntelDX2, and IntelDX4 processor clock.

In systems with user-defined error reporting, the FERR# pin is connected to the interrupt controller. When an unmasked floating point error occurs, an interrupt is raised. If IGNNE# is high at the time of this interrupt, the Intel486 DX, IntelDX2, and IntelDX4 processors will freeze (disallowing execution of a subsequent floating point instruction) until the interrupt handler is invoked. By driving the IGNNE# pin low (when clearing the interrupt request), the interrupt handler can allow execution of a floating point instruction, within the interrupt handler, before the error condition is cleared (by FNCLEX, FNINIT, FNSAVE or FNSTENV). If execution of a non-control floating point instruction within the floating point interrupt handler is not needed, the IGNNE# pin can be tied HIGH.
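The interaction of the NE bit and the IGNNE# pin described in this section can be summarized as a small decision function. The following Python sketch is a simplified model for illustration only: the function name `fp_error_action` and the returned strings are hypothetical, pin levels are abstracted to booleans (True means asserted), and the treatment of control instructions when NE is set is an assumption, since the text states it only for the user-defined reporting case.

```python
# Simplified model of floating point error reporting on encountering a
# floating point instruction while an unmasked error is pending.
# Names and return strings are illustrative, not Intel terminology.

# Control instructions that execute even while an error is pending
# (list taken from the text above).
CONTROL_INSTRUCTIONS = frozenset({
    "FNCLEX", "FNINIT", "FNSAVE", "FNSTENV", "FNSTCW",
    "FNSTSW", "FNSTSW AX", "FNENI", "FNDISI", "FNSETPM",
})


def fp_error_action(ne_bit: bool, ignne_asserted: bool, mnemonic: str) -> str:
    """Return the modeled action for `mnemonic` with an unmasked error pending."""
    if mnemonic.upper() in CONTROL_INSTRUCTIONS:
        # Assumption: control instructions run in both reporting modes.
        return "execute"
    if ne_bit:
        # NE set: the error is reported through interrupt 16.
        return "interrupt 16"
    if ignne_asserted:
        # NE clear, IGNNE# asserted: the error is ignored.
        return "execute (error ignored)"
    # NE clear, IGNNE# negated: the processor freezes until the external
    # interrupt raised through FERR# is serviced.
    return "freeze"


if __name__ == "__main__":
    for ne, ignne, insn in [(True, False, "FADD"), (False, True, "FADD"),
                            (False, False, "FADD"), (False, False, "FNINIT")]:
        print(ne, ignne, insn, "->", fp_error_action(ne, ignne, insn))
```

The model deliberately ignores the timing of FERR# itself, which depends on the instruction class as listed above.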
10.2.15 INTEL486 DX, INTELDX2, AND INTELDX4 PROCESSORS FLOATING POINT ERROR HANDLING IN AT-COMPATIBLE SYSTEMS

The Intel486 DX, IntelDX2, and IntelDX4 processors provide special features to allow the implementation of an AT-compatible numerics error reporting scheme. These features DO NOT replace the external circuit. Logic is still required that decodes the OUT F0 instruction and latches the FERR# signal. What follows is a description of the use of these Intel Processor features.

The features provided by the Intel486 DX, IntelDX2, and IntelDX4 processors are the NE bit in the Machine Status Register, the IGNNE# pin, and the FERR# pin.

The NE bit determines the action taken by the Intel486 DX, IntelDX2, and IntelDX4 processors when a numerics error is detected. When set, this bit signals that non-DOS compatible error handling will be implemented. In this mode the Intel486 DX, IntelDX2, and IntelDX4 processors take a software exception (16) if a numerics error is detected.

If the NE bit is reset, the Intel486 DX, IntelDX2, and IntelDX4 processors use the IGNNE# pin to allow an external circuit to control the time at which non-control numerics instructions are allowed to execute. Note that floating point control instructions such as FNINIT and FNSAVE can be executed during a floating point error condition regardless of the state of IGNNE#.

To process a floating point error in the DOS environment the following sequence must take place:

1. The error is detected by the Intel486 DX, IntelDX2, or IntelDX4 processor, which activates the FERR# pin.
2. FERR# is latched so that it can be cleared by the OUT F0 instruction.
3. The latched FERR# signal activates an interrupt at the interrupt controller. This interrupt is usually handled on IRQ13.
4. The Interrupt Service Routine (ISR) handles the error and then clears the interrupt by executing an OUT instruction to port F0.
The address F0 is decoded externally to clear the FERR# latch. The IGNNE# signal is also activated by the decoder output.
5. Usually the ISR then executes an FNINIT instruction or other control instruction before restarting the program. FNINIT clears the FERR# output.

Figure 10-33 illustrates a sample circuit that will perform the function described above. Note that this circuit has not been tested and is included as an example of required error handling logic.

Note that the IGNNE# input allows non-control instructions to be executed prior to the time the FERR# signal is reset by the Intel486 DX, IntelDX2, and IntelDX4 processors. This function is implemented to allow exact compatibility with the AT implementation. Most programs reinitialize the floating point unit before continuing after an error is detected. The floating point unit can be reinitialized using one of the following four instructions: FCLEX, FINIT, FSAVE and FSTENV.

Figure 10-33. DOS-Compatible Numerics Error Circuit

10.3 Enhanced Bus Mode Operation (Write-Back Mode) for the Write-Back Enhanced IntelDX2 Processor

This section describes how the processor bus operation changes for the Enhanced Bus mode when the internal cache is configured in write-back mode.

10.3.1 SUMMARY OF BUS DIFFERENCES

The following is a list of the differences between the Enhanced and Standard Bus modes:

1. Burst write capability is extended to four doubleword burst cycles (for write-back cycles only).
2. Four new signals, INV, WB/WT#, HITM#, and CACHE#, have been added to support the write-back operation of the internal cache. These signals function the same as the equivalent signals on the Pentium™ OverDrive™ Processor pins.
3. The SRESET signal has been modified so that it neither writes back, invalidates, nor disables the cache. Special test modes are also not initiated through SRESET.
4. The FLUSH# signal behaves the same as the WBINVD instruction. Upon assertion, FLUSH# writes back all modified lines, invalidates the cache, and issues two special bus cycles.
5. The PLOCK# signal remains inactive in the Enhanced Bus mode.

10.3.2 BURST CYCLES

Figure 10-34 shows a basic burst read cycle of the Write-Back Enhanced IntelDX2 processor. In the Enhanced Bus mode, both PCD and CACHE# are asserted if the cycle is internally cacheable. The Write-Back Enhanced IntelDX2 processor samples KEN# in the clock before the first BRDY#. If KEN# is returned active by the system, this cycle is transformed into a multiple-transfer cycle. With each data item returned from external memory, the data is "cached" only if KEN# is returned active again in the clock before the last BRDY# signal. Data is sampled only in the clock in which BRDY# is returned. If the data is not sent to the processor every clock, it causes a "slow burst" cycle.

Figure 10-34. Basic Burst Read Cycle

10.3.2.1 Non-Cacheable Burst Operation

If CACHE# is asserted on a read cycle, it indicates that the processor will follow with BLAST# high if KEN# is returned active. However, the converse is not true. The Write-Back Enhanced IntelDX2 processor may elect to read-burst data which are identified as non-cacheable by either CACHE# or KEN#. In this case, BLAST# is also high in the same cycle as the first BRDY# (in clock four). To improve performance, the memory controller should try to complete the cycle as a burst cycle.

The assertion of CACHE# on a write cycle signifies a replacement or snoop write-back cycle. These cycles consist of four doubleword transfers (either burst or non-burst). The signals KEN# and WB/WT# are not sampled during write-back cycles because the processor does not attempt to redefine the cacheability of the line.

10.3.2.2 Burst Cycle Signal Protocol

The signals from ADS# through BLAST#, which are shown in Figure 10-34, have the same function and timing in both Standard and Enhanced Bus modes. Burst cycles can be up to 16 bytes long (four aligned doublewords) and can start with any one of the four doublewords. The sequence of the addresses is determined by the first address and follows the order shown previously in Table 10-8. The burst order for reads is the same as the burst order for writes. (See section 10.2.4.2, "Burst and Cache Line Fills.")

An attempted line fill, which is caused by a read miss, is indicated by CACHE# asserted (low) and W/R# low. For a line fill to occur, the system must assert KEN# twice: one clock prior to the first BRDY# and one clock prior to the last BRDY#. It takes only one assertion of KEN# to mark the line as noncacheable.

A write-back cycle of a cache line, due to replacement or snoop, is indicated by CACHE# asserted (low) and W/R# high. KEN# has no effect during write-back cycles.
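The CACHE#, W/R#, and KEN# rules of the burst cycle signal protocol can be condensed into a small classifier. The following Python sketch is illustrative only: the function name `classify_cycle`, its parameters, and the returned labels are hypothetical, and signal levels are abstracted to booleans (True means the signal is asserted, or for `write` that W/R# is high).

```python
# Illustrative classifier for Enhanced Bus mode cycles, based on the
# CACHE#, W/R#, and KEN# rules stated in the text. All names are hypothetical.

def classify_cycle(cache_asserted: bool,
                   write: bool,
                   ken_before_first_brdy: bool = False,
                   ken_before_last_brdy: bool = False) -> str:
    """Return a descriptive label for one bus cycle."""
    if write:
        # CACHE# on a write marks a replacement or snoop write-back.
        # KEN# has no effect on write-back cycles, so it is not consulted.
        return "write-back" if cache_asserted else "write"
    if not cache_asserted:
        # The processor itself treats the read as non-cacheable.
        return "non-cacheable read"
    # Attempted line fill: KEN# must be asserted at both sample points.
    if ken_before_first_brdy and ken_before_last_brdy:
        return "line fill (cached)"
    return "read (not cached)"


if __name__ == "__main__":
    print(classify_cycle(True, False, True, True))    # both KEN# samples active
    print(classify_cycle(True, False, True, False))   # second KEN# sample inactive
    print(classify_cycle(True, True, True, True))     # KEN# ignored on write-back
```

Keeping the two KEN# samples as separate parameters mirrors the two sample points named in the text, so a memory controller model can decide each one independently.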
CACHE# is valid from the assertion of ADS# through the clock in which the first RDY# or BRDY# is returned. CACHE# is inactive at all other times. PCD behaves the same in Enhanced Bus mode as in Standard Bus mode, except that it is low during write-back cycles.

The Write-Back Enhanced IntelDX2 processor samples WB/WT# once, in the same clock as the first BRDY#. This sampled value of WB/WT# is combined with PWT to bring the line into the internal cache, either as a write-back line or write-through line.

10.3.3 CACHE CONSISTENCY CYCLES

The system performs snooping to maintain cache consistency. Snoop cycles can be performed under AHOLD, BOFF#, or HOLD, as described in Table 10-11.

The snoop cycle begins by checking whether a particular cache line has been "cached" and invalidates the line based on the state of the INV pin. If the Write-Back Enhanced IntelDX2 processor is configured in Enhanced Bus mode, the system must drive INV high to invalidate a particular cache line. The Write-Back Enhanced IntelDX2 processor does not have an output pin to indicate a snoop hit to an S-state line or an E-state line. However, the Write-Back Enhanced IntelDX2 processor will invalidate the line if the system snoop hits an S-state, E-state, or M-state line, provided INV was driven high during snooping. If INV is driven low during a snoop, a modified line will be written back to memory and will remain in the cache as a write-back line; a write-through line also will remain in the cache as a write-through line.

Table 10-11. Snoop Cycles under AHOLD, BOFF#, or HOLD

AHOLD   Floats the address bus. ADS# is asserted under AHOLD only to initiate a snoop write-back cycle. An ongoing burst cycle is completed under AHOLD. For non-burst cycles, a specific non-burst transfer (ADS#-RDY# transfer) is completed under AHOLD and fractured before the next assertion of ADS#. A snoop write-back cycle is reordered ahead of a fractured non-burst cycle, and the non-burst cycle is completed only after the snoop write-back cycle is completed, provided there are no other snoop write-back cycles scheduled.

BOFF#   Overrides AHOLD and takes effect in the next clock. On-going bus cycles will stop in the clock following the assertion of BOFF# and resume when BOFF# is de-asserted. A snoop is the only bus cycle the Write-Back Enhanced IntelDX2 processor responds to under BOFF#. Snoop write-back will be reordered ahead of the backed-off cycle. The snoop write-back cycle begins after BOFF# is de-asserted, followed by the backed-off cycle.

HOLD    HOLD is acknowledged only between bus cycles, except for a non-cacheable, non-bursted code prefetch cycle. In a non-cacheable, non-bursted code prefetch cycle, HOLD is acknowledged after the system returns RDY#. Once HOLD is active, the processor blocks all bus activities until the system releases the bus (by de-asserting HOLD).

After asserting AHOLD or BOFF#, the external bus master driving the snoop cycle must wait for two clocks before driving the snoop address and asserting EADS#. If snooping is done under HOLD, the master performing the snoop must wait for at least one clock cycle before driving the snoop addresses and asserting EADS#. INV should be driven low during read operations to minimize invalidations, and INV should be driven high to invalidate a cache line during write operations. The Write-Back Enhanced IntelDX2 processor asserts HITM# if the cycle hits a modified line in the cache. This output signal becomes valid two clock periods after EADS# is valid on the bus. HITM# remains asserted until the modified line is written back and will remain asserted until the RDY# or BRDY# of the snoop cycle is returned.

Snoop operations could interrupt an ongoing bus operation in both the Standard Bus and Enhanced Bus modes.
The Write-Back Enhanced IntelDX2 processor can accept EADS# in every clock period while in Standard Bus mode. In Enhanced Bus mode, the Write-Back Enhanced IntelDX2 processor can accept EADS# every other clock period or until a snoop hits an M-state line. The Write-Back Enhanced IntelDX2 processor will not accept any further snoop cycles until the previous snoop write-back operation is completed.

All write-back cycles adhere to the burst address sequence of 0-4-8-C. The CACHE#, PWT, and PCD output pins are asserted and the KEN# and WB/WT# input pins are ignored. Write-back cycles can be either bursted or non-bursted. All write-back operations write 16 bytes of data to memory corresponding to the modified line that hit during the snoop.

Note that the Write-Back Enhanced IntelDX2 processor will accept BS8# and BS16# line-fill cycles, but not on replacement or snoop-forced write-back cycles.

10.3.3.1 Snoop Collision with a Current Cache Line Operation

The system can also perform snooping concurrent with a cache access, and the snoop may collide with a current cache bus cycle. Table 10-12 lists some scenarios and the results of a snoop operation colliding with an on-going cache fill or replacement cycle.

Table 10-12. Various Scenarios of a Snoop Write-Back Cycle Colliding with an On-Going Cache Fill or Replacement Cycle

Arbitration Control: AHOLD

  Snoop to the Line That Is Being Filled:
    Read all line fill data into cache line buffer. Update cache only if snoop occurred with INV = "0". No write-back cycle because the line has not been modified yet.

  Snoop to a Different Line from the Line Being Filled:
    Complete fill if the cycle is bursted. Start snoop write-back. If the cycle is non-bursted, the snoop write-back will be reordered ahead of the line fill. After the snoop write-back cycle is completed, continue with line fill.

  Snoop to the Line That Is Being Replaced:
    Complete replacement write-back if the cycle is bursted. Processor does not initiate a snoop write-back, but asserts HITM# until the replacement write-back is completed. If the replacement cycle is non-bursted, the snoop write-back is re-ordered ahead of the replacement write-back cycle. The processor does not continue with the replacement write-back cycle.

  Snoop to a Different Line from the Line Being Replaced:
    Complete replacement write-back if it is a burst cycle. Initiate snoop write-back. If the replacement write-back is a non-burst cycle, the snoop write-back cycle is re-ordered in front of the replacement cycle. After the snoop write-back, the replacement write-back is continued from the interrupt point.

Arbitration Control: BOFF#

  Snoop to the Line That Is Being Filled:
    Stop reading line fill data. Wait for BOFF# to go inactive. Continue read from backed off point. Update cache only if snoop occurred with INV = "0".

  Snoop to a Different Line from the Line Being Filled:
    Stop fill. Wait for BOFF# to go inactive. Do snoop write-back. Continue fill from interrupt point.

  Snoop to the Line That Is Being Replaced:
    Stop replacement write-back. Wait for BOFF# to go inactive. Initiate snoop write-back. Processor does not continue replacement write-back.

  Snoop to a Different Line from the Line Being Replaced:
    Stop replacement write-back. Wait for BOFF# to be de-asserted. Initiate snoop write-back. Continue replacement write-back from point of interrupt.

Arbitration Control: HOLD

  All scenarios:
    HOLD is not acknowledged until the current bus cycle (i.e., the line operation) is completed, except for a non-cacheable, non-bursted code prefetch cycle. Consequently there can be no collision with the snoop cycles using HOLD, except as mentioned earlier. In this case the snoop write-back is reordered ahead of an on-going non-burst, non-cached code prefetch cycle. After the write-back cycle is completed, the code prefetch cycle continues from the point of interrupt.

10.3.3.2 Snoop under AHOLD

Snooping under AHOLD begins by asserting AHOLD to force the Write-Back Enhanced IntelDX2 processor to float its address bus, as shown in Figure 10-35. The ADS# for the write-back cycle is guaranteed to occur no sooner than the second clock following the assertion of HITM# (i.e., there is a dead clock between the assertion of HITM# and the first ADS# of the snoop write-back cycle).

When a line is written back, KEN#, WB/WT#, BS8#, and BS16# are ignored, and PWT and PCD are always low during write-back cycles.

The next ADS# for a new cycle can occur immediately after the last RDY# or BRDY# of the write-back cycle. The Write-Back Enhanced IntelDX2 processor does not guarantee a dead clock between cycles unless the second cycle is a snoop-forced write-back cycle. This allows snoop-forced write-backs to be backed off (BOFF#) when snooping under AHOLD.

HITM# is guaranteed to remain asserted until the RDY# or BRDY# corresponding to the last doubleword of the write-back cycle is returned. HITM# will be de-asserted from the clock edge in which the last BRDY# or RDY# for the snoop write-back cycle is returned. The write-back cycle could be bursted or non-bursted. In either case, 16 bytes of data corresponding to the modified line that has a snoop hit are written back.

Figure 10-35. Snoop Cycle Invalidating a Modified Line

Snoop under AHOLD Overlaying a Line-Fill Cycle

The assertion of AHOLD during a line fill is allowed on the Write-Back Enhanced IntelDX2 processor.
In this case, when a snoop cycle is overlaid by an on-going line-fill cycle, the chipset must generate the burst addresses internally for the line fill to complete, because the address bus will have the valid snoop address. The write-back mode is more complex compared to the write-through mode because of the possibility of a line being written back. Figure 10-36 shows a snoop cycle overlaying a line-fill cycle, when the snooped line is not the same as the line being filled.

In Figure 10-36, the snoop to an M-state line causes a snoop write-back cycle. The Write-Back Enhanced IntelDX2 processor will assert HITM# two clocks after the EADS#, but will delay the snoop write-back cycle until the line fill is completed, because the line fill shown in Figure 10-36 is a burst cycle. In this figure, AHOLD is asserted one clock after ADS#. In the clock after AHOLD is asserted, the Write-Back Enhanced IntelDX2 processor will float the address bus (not the Byte Enables). Hence, the memory controller must determine burst addresses in this period. The chipset must comprehend the special ordering required by all burst sequences of the Write-Back Enhanced IntelDX2 processor. HITM# is guaranteed to remain active until the write-back cycle completes.

If AHOLD continues to be asserted over the forced write-back cycle, the memory controller also must supply the write-back addresses to the memory. The Write-Back Enhanced IntelDX2 processor always runs the write-back with an address sequence of 0-4-8-C.

In general, if the snoop cycle overlays any burst cycle (not necessarily a line-fill cycle) the snoop write-back will be delayed because of the on-going burst cycle. First, the burst cycle goes to completion and only then does the snoop write-back cycle start.
Figure 10-36. Snoop Cycle Overlaying a Line-Fill Cycle

AHOLD Snoop Overlaying a Non-Burst Cycle

When AHOLD overlays a non-burst cycle, snooping is based on the completion of the current non-bursted transfer (ADS#-RDY# transfer). Figure 10-37 shows a snoop cycle under AHOLD overlaying a non-burst line-fill cycle. HITM# is asserted two clocks after EADS#, and the non-burst cycle is fractured after the RDY# for a specific single transfer is returned. The snoop write-back cycle is re-ordered ahead of an on-going non-burst cycle. After the write-back cycle is completed, the fractured non-burst cycle will continue. The snoop write-back ALWAYS precedes the completion of a fractured cycle, regardless of the point at which AHOLD is de-asserted, and AHOLD must be de-asserted before the fractured non-burst cycle can complete.

AHOLD Snoop to the Same Line That Is Being Filled

A system snoop will not cause a write-back cycle to occur if the snoop hits a line while the line is being filled. The processor does not allow a line to be modified until the fill is completed (and a snoop will only produce a write-back cycle for a modified line). Although a snoop to a line that is being filled will not produce a write-back cycle, the snoop still has an effect based on the following rules:

1. The processor always snoops the line being filled.
2. In all cases, the processor uses the operand that triggered the line fill.
3. If the snoop occurs when INV = "1", the processor never updates the cache with the fill data.
4. If the snoop occurs when INV = "0", the processor loads the line into the internal cache.
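The snoop rules for a line that is being filled, together with the general INV behavior described in section 10.3.3, can be sketched as one function. This Python model is a simplification for illustration: the function name `snoop_outcome`, the dictionary keys, and the use of single-letter state names as inputs are assumptions made here, and the model reports only whether the line stays in (or enters) the cache rather than a resulting cache state.

```python
# Illustrative model of a snoop on the Write-Back Enhanced IntelDX2 cache.
# Function name, parameter names, and result keys are hypothetical.

def snoop_outcome(state: str, inv: bool, being_filled: bool = False) -> dict:
    """Model one snoop.

    state        -- 'M', 'E', 'S', or 'I' for the snooped line
                    (ignored when the line is still being filled)
    inv          -- level of INV during the snoop (True = driven high)
    being_filled -- True if the snoop hits the line currently being filled
    """
    if state not in ("M", "E", "S", "I"):
        raise ValueError("state must be one of 'M', 'E', 'S', 'I'")

    if being_filled:
        # A line cannot be modified before the fill completes, so there is
        # no HITM# and no write-back. The fill data is cached only if INV = 0.
        return {"hitm": False, "write_back": False, "line_cached": not inv}

    hit = state != "I"
    modified = state == "M"
    return {
        "hitm": modified,                  # HITM# only for a modified line
        "write_back": modified,            # modified data goes back to memory
        "line_cached": hit and not inv,    # INV = 1 invalidates on any hit
    }


if __name__ == "__main__":
    print(snoop_outcome("M", inv=True))
    print(snoop_outcome("M", inv=False))
    print(snoop_outcome("S", inv=False, being_filled=True))
```

In both paths the operand that triggered a line fill is still delivered to the processor; the model covers only the cache and bus side effects.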
':..-..-....,..,; .. INY .m . . . . .' .n ..... 1 n . . . . . '"'" . . . . . . . . : . . . . . . . . . I I I ,~ ~ .... ......... .. ·· .... ·t· ,.... "1 ...····ufo··· ..........,..... 100AZWOo. Y;::;:i<1WZlmZIZ@ZZk%l1lzzozWzmzolmlm/1]Wm)m).;oJm/l!Jh . ' , . I I t I ' , .m ..:....S""····_T_... _.. ~_·.... T_ .. _···-+T_·_m... T_ .. ··______- HITMII' . : . ! I I t I .. ·.... r"_·~_....... I j . ADSt; ....... !'r'-'--.:.....-'--"---'--~ I I ····,···········T········..· ~__-+~~__+ -__~~:=:~~~~E=:===~ , ~, 10, I RD'IIJ~ CACHO~.:.....-....... , I I I I 'I 4 I I' I I I I I C I I I \._:_/~~~ --'--'---"....,'-+--V .......... I ... WIfU~ ..I. .. : : : ........ L . ........ 1....... . . ...... J.. .... ,... . ....-:--'i-......;.-+----.;--;--;--;...., . : ..........;...........:....·......... ·7 I ...... 1........1........ ·1•••• ·· I .1..... ...... f. ........l ......... .1.... I, .. ......;........ m.imm .. m~ ......... m:.......... ;.......... ;... \\... ...;. . ;._;............;-:_.-i---i---..,,...-"';"'--; I ........ !.. ... 1. ... • 1.. ....... 1.. I .. .. 1 ........ 1 ...... 1... . ...... 1..... 242202-C1 • To Processor •• Write-back from Processor Figure 10-37. Snoop Cycle Overlaying a Non-Burst Cycle 2-216 I Intel486™ PROCESSOR FAMILY cycle is bursted, the replacement cycle goes to completion. Only then is the snoop write-back cycle initiated. Snoop during Replacement Write-Back If the cache contains valid data during a line fill, one of the cache lines may be replaced as determined by the LRU algorithm. If the line being replaced is modified, this line will be written back to maintain cache coherency. When a replacement write-back cycle is in progress, it might be necessary to snoop the line that is being written back. (See Figure 10-38.) 
If the replacement write-back cycle is a non-burst cycle, and if there is a snoop hit to the same line as the line being replaced, it will fracture the replacement write-back cycle after the RDY # for the current non-burst transfer is returned. The snoop writeback cycle will be reordered in front of the fractured replacement write-back cycle and will be completed under HITM #. However, after AHOLD is de-asserted the replacement write-back cycle is not completed. If the replacement write-back cycle is bursted and there is a snoop hit to the same line as the line that is being replaced, the on-going replacement cycle runs to completion. HITM# is asserted until the line is written back and the snoop write-back will not be initiated. In this case, the replacement write-back is converted to the snoop write-back, and HITM# is asserted and de-asserted without a specific ADS # to initiate the write-back cycle. If there is a snoop hit to the line that is different from the one being replaced, the non-burst replacement write-back cycle will be fractured, and the snoop write-back cycle will be reordered ahead of the replacement write-back cycle. After the snoop writeback is completed, the replacement write-back cycle will continue. If there is a snoop hit to a different line from the line being replaced, and if the replacement write-back 3 : 4 , 5 , 6 : 7 : 8 : 9 : 10 : 11 AHOLD ~,' '..m...... :.·... _..... I 'j'" 18: , EADSII , : Ij... .........!I ........... . ........................ 'NV/ZZZZZZZZZZYCo..z7zzRZI)/ZZ)ZZZ7ZZ!7ZZA : .:\"..··.. ; ..........···;..·..·....... 7 ' . : . I 1 I I I . I I 1 HITtMI , ..t' .............( ·· ... ··..T······ 1 Al1-A4~~ ~ Al-A2XT") .. I i·REPt.ACE . -'j ;.X,.........-~--' ; '-Ti.", ; \ i..T .. i0r--"'=ic:--"";X'-~---r_., I ADSII~ ! : BLAST#' ( . . . ·. ~·G,· . · :.· . ·. . I CACHE#~ i i , ......: , r' ·········T············T .. ·········f BRDY#~\ WIR#, 242202-C2 • To Processor Figure 10-38. 
Snoop to the Line That Is Being Replaced I 2-217 Intel486™ PROCESSOR FAMILY 10.3.3.3 Snoop under BOFF# BOFF# is capable of fracturing any transfer, burst or non·burst. The output pins (see Table 3·8 and Table 3·9) of the Write· Back Enhanced IntelDX2 processor will be floated in the clock period following the as· sertion of BOFF #. If the system snoop hits a modi· fied line using BOFF #, the snoop write·back cycle will be reordered ahead of the current cycle. BOFF # must be de·asserted for the processor to perform a snoop write-back cycle and resume the fractured cy· cle. The fractured cycle resumes with a new ADS# and begins with the first uncompleted transfer. Snoops are permitted under BOFF #, but write·back cycles will not be started until BOFF# is de·assert· ed. Consequently, multiple snoop cycles can occur under a continuously asserted BOFF #, but only up to the first asserted HITM#. The system begins snooping by driving EADS # and INV in clock six. The assertion of HITM# in clock eight indicates that the snoop cycle hit a modified line and the cache line will be written back to memo· ry. The assertion of HITM # in clock eight and CACHE# and ADS# in clock ten identifies the beginning of the snoop write-back cycle. ADS # is guaranteed to be asserted no sooner than two clock periods after the assertion of HITM #. Write-back cycles always use the four-doubleword address sequence of 0-4-8-C (burst or non-burst). The snoop write-back cycle begins upon the de-assertion of BOFF # with HJTM # asserted throughout the duration of the snoop write-back cycle. If the snoop cycle hits a line that is different from the line being filled, the cache line fill will resume after the snoop write-back cycle completes, as shown in Figure 10-39. Snoop under BOFF # during Cache Line Fill As shown in Figure 10·39, BOFF # fractured the second transfer of a non·burst cache line·fill cycle. 2 3 4: 5 : 6 7 8 9: 10 . 11 : 12 : 13 : 14 : 15 : 16 : 17 .18 : 19 : 20: 21 : 22 23. ClK !. . v.. 
Figure 10-39. Snoop under BOFF# during a Cache Line-Fill Cycle

An ADS# is always issued when a cycle resumes after being fractured by BOFF#. The address of the fractured data transfer is reissued under this ADS#, and CACHE# is not issued unless the fractured operation resumes from the first transfer (e.g., first doubleword). If the system asserts BOFF# and RDY# simultaneously, as shown in clock four on Figure 10-39, BOFF# dominates and RDY# is ignored. Consequently, the Write-Back Enhanced IntelDX2 processor accepts only up to the x4h doubleword, and the line fill resumes with the x0h doubleword. ADS# initiates the resumption of the line-fill operation in clock period 15. HITM# is de-asserted in the clock period following the clock period in which the last RDY# or BRDY# of the write-back cycle is returned. Hence, HITM# is guaranteed to be de-asserted before the ADS# of the next cycle.
Figure 10-39 also shows the system returning RDY# to indicate a non-burst line-fill cycle. Bursted cache line-fill cycles behave similarly to non-bursted cache line-fill cycles when snooping using BOFF#. If the system snoop hits the same line as the line being filled (burst or non-burst), the Write-Back Enhanced IntelDX2 processor will not assert HITM# and will not issue a snoop write-back cycle, because it did not modify the line, and the line fill resumes upon the de-assertion of BOFF#. However, the line fill will be cached only if INV is driven low during the snoop cycle.

Snoop under BOFF# during Replacement Write-Back

If the system snoop under BOFF# hits the line that is currently being replaced (burst or non-burst), the entire line is written back as a snoop write-back line, and the replacement write-back cycle is not continued. However, if the system snoop hits a different line than the one currently being replaced, the replacement write-back cycle will continue after the snoop write-back cycle has been completed. Figure 10-40 shows a system snoop hit to the same line as the one being replaced (non-burst).
Figure 10-40. Snoop under BOFF# to the Line that is Being Replaced

10.3.3.4 Snoop under HOLD

HOLD can only fracture a non-cacheable, non-bursted code prefetch cycle. For all other cycles, the Write-Back Enhanced IntelDX2 processor will not assert HLDA until the entire current cycle is completed. If the system snoop hits a modified line under HLDA during a non-cacheable, non-burstable code prefetch, the snoop write-back cycle will be reordered ahead of the fractured cycle. The fractured non-cacheable, non-bursted code prefetch resumes with an ADS# and begins with the first uncompleted transfer. Snoops are permitted under HLDA, but write-back cycles will not occur until HOLD is de-asserted. Consequently, multiple snoop cycles are permitted under a continuously asserted HLDA only up to the first asserted HITM#.

Snoop under HOLD during Cache Line Fill

As shown in Figure 10-41, HOLD (asserted in clock two) does not fracture the bursted cache line-fill cycle until the line fill is completed (in clock five). Upon completing the line fill in clock five, the Write-Back Enhanced IntelDX2 processor asserts HLDA and the system begins snooping by driving EADS# and INV in the following clock period. The assertion of HITM# in clock nine indicates that the snoop cycle has hit a modified line and the cache line is written back to memory. The assertion of HITM# in clock nine and CACHE# and ADS# in clock 11 identifies the beginning of the snoop write-back cycle. The snoop write-back cycle begins upon the de-assertion of HOLD, and HITM# is asserted throughout the duration of the snoop write-back cycle.
Figure 10-41. Snoop under HOLD during Line Fill

If HOLD is asserted during a non-cacheable, non-bursted code prefetch cycle, as shown in Figure 10-42, the Write-Back Enhanced IntelDX2 processor will issue HLDA in clock seven (which is the clock period in which the next RDY# is returned). If the system snoop hits a modified line, the snoop write-back cycle will begin after HOLD is released. After the snoop write-back cycle is completed, an ADS# is issued and the code prefetch cycle resumes.
Figure 10-42. Snoop using HOLD during a Non-Cacheable, Non-Burstable Code Prefetch

Snoop under HOLD during Replacement Write-Back

Collision of snoop cycles under HOLD during the replacement write-back cycle can never occur, because HLDA is asserted only after the replacement write-back cycle (bursted or non-bursted) is completed.

10.3.4 LOCKED CYCLES

In both Standard and Enhanced Bus modes, the Write-Back Enhanced IntelDX2 processor architecture supports atomic memory access. A programmer can modify the contents of a memory variable and be assured that the variable will not be accessed by another bus master between the read of the variable and the update of that variable. This function is provided for instructions that contain a LOCK prefix, and also for instructions that implicitly perform locked read-modify-write cycles. In hardware, the LOCK function is implemented through the LOCK# pin, which indicates to the system that the processor is performing this sequence of cycles, and that the processor should be allowed atomic access for the location accessed during the first locked cycle.
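From the programmer's side, the atomic access described above is usually reached through a language's atomic operations rather than by writing the LOCK prefix by hand. The following is a minimal sketch, assuming a C11 compiler targeting an x86 processor; the variable and function names are hypothetical. Compilers for x86 typically implement these operations with a LOCK-prefixed instruction or with XCHG, which is locked implicitly when it has a memory operand, but the exact instruction chosen is up to the compiler.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared variable; the name is illustrative only. */
static _Atomic uint32_t shared_counter = 0;

/* Atomically add `delta` to the shared variable and return the value
 * it held before the addition. No other agent can observe or modify
 * the variable between the read and the update. */
uint32_t counter_fetch_add(uint32_t delta)
{
    return atomic_fetch_add(&shared_counter, delta);
}

/* Atomically store `value` and return the previous contents. */
uint32_t counter_exchange(uint32_t value)
{
    return atomic_exchange(&shared_counter, value);
}
```

Both functions are read-modify-write operations, so on the bus each would appear as a locked read followed by a locked write when the operand is not serviced from the cache.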
A locked operation is a combination of one or more read cycles followed by one or more write cycles with the LOCK# pin asserted. Before a locked read cycle is run, the processor first determines if the corresponding line is in the cache. If the line is present in the cache, and is in an E or S state, it is invalidated. If the line is in the M state, the processor does a write-back and then invalidates the line. A locked cycle to an M, S, or E state line is always forced out to the bus. If the operand is misaligned across cache lines, the processor could potentially run two write-back cycles before starting the first locked read. In this case the sequence of bus cycles is: write back, write back, locked read, locked read, locked write and the final locked write. Note that although a total of six cycles are generated, the LOCK# pin will be active only during the last four cycles, as shown in Figure 10-43.

LOCK# will not be de-asserted if AHOLD is asserted in the middle of a locked cycle. LOCK# will remain asserted even if there is a snoop write-back during a locked cycle. LOCK# will be floated if BOFF# is asserted in the middle of a locked cycle. However, it will be driven LOW again when the cycle restarts after BOFF#. Locked read cycles are never transformed into line fills, even if KEN# is returned active. If there are back-to-back locked cycles, the Write-Back Enhanced IntelDX2 processor does not insert a dead clock between these two cycles. HOLD is recognized if there are two back-to-back locked cycles, and LOCK# will float when HLDA is asserted.

Figure 10-43. Locked Cycles (Back to Back)

10.3.4.1 Snoop/Lock Collision

If there is a snoop cycle overlaying a locked cycle, the snoop write-back cycle will fracture the locked cycle. As shown in Figure 10-44, after the read portion of the locked cycle is completed, the snoop write-back starts under HITM#. After the write-back is completed the locked cycle will continue. But during all this time (including the write-back cycle), the LOCK# signal remains asserted.

Because HOLD is not acknowledged if LOCK# is asserted, snoop-lock collisions are restricted to AHOLD and BOFF# snooping.

Figure 10-44. Snoop Cycle Overlaying a Locked Cycle

10.3.5 FLUSH OPERATION

The Write-Back Enhanced IntelDX2 processor executes a flush operation when the FLUSH# pin is active, and no outstanding bus cycles, such as a line fill or write back, are being processed. In the Enhanced Bus mode, the processor first writes back all the modified lines to external memory. After the write-back is completed, two special cycles are generated, indicating to the external system that the write-back is done. All lines in the internal cache are invalidated after all the write-back cycles are done.
Depending on the number of modified lines in the cache, the flush could take a minimum of 1280 bus clocks (2560 processor clocks) and up to a maximum of 5000+ bus clocks to scan the cache, perform the write backs, invalidate the cache, and run the flush acknowledge cycles. FLUSH# is implemented as an interrupt in the Enhanced Bus mode, and will be recognized only on an instruction boundary. Write-back system designs should look for the flush acknowledge cycles to recognize the end of the flush operation. Figure 10-45 shows the flush operation of the Write-Back Enhanced IntelDX2 processor, when configured in the Enhanced Bus mode.

If the processor is in Standard Bus mode, the processor will not issue special acknowledge cycles in response to the FLUSH# input, although the internal cache is invalidated. The invalidation of the cache in this case takes only two bus clocks.

Figure 10-45. Flush Cycle

10.3.6 PSEUDO LOCKED CYCLES

In Enhanced Bus mode, PLOCK# is always driven inactive for both burst and non-burst cycles. Hence, it is possible for other bus masters to gain control of the bus during operand transfers that take more than one bus cycle. A 64-bit aligned operand can be read in one burst cycle or two non-burst cycles if BS8# and BS16# are not asserted. Figure 10-46 shows a 64-bit floating point operand or Segment Descriptor read cycle, which is burst by the system returning BRDY#.

10.3.6.1 Snoop under AHOLD during Pseudo-Locked Cycles

AHOLD can fracture a 64-bit transfer if it is a non-burst cycle.
If the 64-bit cycle is burst, as shown in Figure 10-46, the entire transfer goes to completion and only then does the snoop write-back cycle start.

Figure 10-46. Snoop under AHOLD Overlaying Pseudo-Locked Cycle

10.3.6.2 Snoop under HOLD during Pseudo-Locked Cycles

As shown in Figure 10-47, HOLD will not fracture the 64-bit burst transfer. The Write-Back Enhanced IntelDX2 processor will not issue HLDA until clock four. After the 64-bit transfer is completed, the Write-Back Enhanced IntelDX2 processor writes back the modified line to memory (if the snoop hits a modified line). If the 64-bit transfer is non-bursted, the Write-Back Enhanced IntelDX2 processor can issue HLDA in between bus cycles for a 64-bit transfer.
Figure 10-47. Snoop under HOLD Overlaying Pseudo-Locked Cycle

10.3.6.3 Snoop under BOFF# Overlaying a Pseudo-Locked Cycle

BOFF# is capable of fracturing any bus operation. As shown in Figure 10-48, BOFF# fractured a current 64-bit read cycle in clock four. If there is a snoop hit under BOFF#, the snoop write-back operation will begin after BOFF# is de-asserted. The 64-bit write cycle resumes after the snoop write-back operation completes.

Figure 10-48.
Snoop under BOFF# Overlaying a Pseudo-Locked Cycle

11.0 TESTABILITY

Testing in the Intel486 processor can be divided into two categories: Built-In Self Test (BIST) and external testing. The BIST tests the non-random logic, control ROM (CROM), translation lookaside buffer (TLB) and on-chip cache memory. External tests can be run on the TLB and the on-chip cache. The Intel486 processor also has a test mode in which all outputs are tri-stated.

11.1 Built-In Self Test (BIST)

The BIST is initiated by holding the AHOLD (address hold) pin HIGH for 1 CLK after RESET goes from HIGH to LOW, as shown in Figure 9.6. No bus cycles will be run by the Intel486 processor until the BIST is concluded. Note that for the Intel486 processor, RESET must be active for 15 clocks with or without BIST enabled for warm resets. SRESET should not be driven active (i.e., high) when entering or during BIST. See Table 11-1 for approximate clocks and maximum completion times for different Intel486 processors.

The result of the BIST is stored in the EAX register. The Intel486 processor has successfully passed the BIST if the contents of the EAX register are zero. If the results in EAX are not zero, then the BIST has detected a flaw in the Intel486 processor. The Intel486 processor performs reset and begins normal operation at the completion of the BIST.

The non-random logic, control ROM, on-chip cache and translation lookaside buffer (TLB) are tested during the BIST.

The cache portion of the BIST verifies that the cache is functional and that it is possible to read and write to the cache. The BIST manipulates test registers TR3, TR4 and TR5 while testing the cache. These test registers are described in section 11.2, "On-Chip Cache Testing."

The cache testing algorithm writes a value to each cache entry, reads the value back, and checks that the correct value was read back.
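The write, read-back and compare loop just described can be modeled in software. The sketch below is only an illustration under stated assumptions: a plain array stands in for the cache entries, and the entry count, test patterns and function names are hypothetical rather than taken from the processor's internal BIST. It follows the BIST convention that a zero result means pass.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative model size; a plain array stands in for cache entries. */
#define MODEL_ENTRIES 512u

/* Write `pattern` to every entry, read it back, and compare.
 * Returns the index of the first mismatching entry, or `count`
 * when every entry read back correctly. */
size_t pattern_test(volatile uint32_t *entries, size_t count, uint32_t pattern)
{
    for (size_t i = 0; i < count; i++) {
        entries[i] = pattern;        /* write the test value */
        if (entries[i] != pattern)   /* read back and check  */
            return i;
    }
    return count;
}

/* Repeat the test with several constants (hypothetical choices).
 * Returns 0 on pass; otherwise bit p is set for each failing pattern p. */
uint32_t run_model_bist(volatile uint32_t *entries, size_t count)
{
    static const uint32_t patterns[] = {
        0x00000000u, 0xFFFFFFFFu, 0xAAAAAAAAu, 0x55555555u
    };
    uint32_t failures = 0;

    for (size_t p = 0; p < sizeof patterns / sizeof patterns[0]; p++) {
        if (pattern_test(entries, count, patterns[p]) != count)
            failures |= 1u << p;
    }
    return failures;
}
```

Alternating-bit patterns such as 0xAAAAAAAA and 0x55555555 are a common choice in memory tests because together they toggle every bit in both directions.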
The algorithm may be repeated more than once for each of the 512 cache entries using different constants. The IntelDX4 processor has 1024 cache entries. All other Intel486 processors have 512 cache entries.

The TLB portion of the BIST verifies that the TLB is functional and that it is possible to read and write to the TLB. The BIST manipulates test registers TR6 and TR7 while testing the TLB. TR6 and TR7 are described in section 11.3.2, "TLB Test Registers TR6 and TR7."

11.2 On-Chip Cache Testing

The on-chip cache testability hooks are designed to be accessible during the BIST and for assembly language testing of the cache.

The Intel486 processor contains a cache fill buffer and a cache read buffer. For testability writes, data must be written to the cache fill buffer before it can be written to a location in the cache. Data must be read from a cache location into the cache read buffer before the processor can access the data. The cache fill and cache read buffers are both 128 bits wide.

11.2.1 CACHE TESTING REGISTERS TR3, TR4 AND TR5

Figure 11-1 shows the three cache testing registers: the Cache Data Test Register (TR3), the Cache Status Test Register (TR4) and the Cache Control Test Register (TR5). External access to these registers is provided through MOV reg,TREG and MOV TREG,reg instructions.

Table 11-1. Maximum BIST Completion Time

Processor Type   Core Clock Freq.   Approximate Clocks   Approximate Time for Completion
Intel486™ SX     25 MHz             1.05 million         42 milliseconds
IntelSX2™        50 MHz             0.6 million          24 milliseconds
Intel486 DX      33 MHz             1.05 million         32 milliseconds
IntelDX2™        50 MHz             0.6 million          24 milliseconds
IntelDX4™        75 MHz             1.6 million          22 milliseconds
Figure 11-1. Cache Test Registers (All Intel486™ Processors Except the IntelDX4™ Processor)
Registers shown: TR3 Cache Data Test Register (Data); TR4 Cache Status Test Register (Tag, LRU Bits used only during reads, Valid Bits used only during reads); TR5 Cache Control Test Register (Set Select, Entry Select, Control).

Figure 11-2. IntelDX4™ Processor Cache Test Registers

Cache Data Test Register: TR3

The cache fill buffer and the cache read buffer can only be accessed through TR3. Data to be written to the cache fill buffer must first be written to TR3. Data read from the cache read buffer must be loaded into TR3.

TR3 is 32 bits wide while the cache fill and read buffers are 128 bits wide. 32 bits of data must be written to TR3 four times to fill the cache fill buffer. 32 bits of data must be read from TR3 four times to empty the cache read buffer. The entry select bits in TR5 determine which 32 bits of data TR3 will access in the buffers.

Cache Status Test Register: TR4

TR4 handles tag, LRU and valid bit information during cache tests. TR4 must be loaded with a tag and a valid bit before a write to the cache. After a read from a cache entry, TR4 contains the tag and valid bit from that entry, and the LRU bits and four valid bits from the accessed set. Note that the IntelDX4 processor has one less bit in the TR4 TAG field. (See Figure 11-1.)

Cache Control Test Register: TR5

TR5 specifies which testability operation will be performed and the set and entry within the set which will be accessed. The set select field determines which set will be accessed. Note that the IntelDX4 processor has an 8-bit set select field and 256 sets. All other Intel486 processors have a 7-bit set select field and 128 sets. (See Figure 11-1.)

The function of the two entry select bits depends on the state of the control bits.
When the fill or read buffers are being accessed, the entry select bits point to the 32-bit location in the buffer being accessed. When a cache location is specified, the entry select bits point to one of the four entries in a set. (Refer to Table 11-2.)

Five testability functions can be performed on the cache. The two control bits in TR5 specify the operation to be executed. The five operations are:

1. Write cache fill buffer
2. Perform a cache testability write
3. Perform a cache testability read
4. Read the cache read buffer
5. Perform a cache flush

Table 11-2 shows the encoding of the two control bits in TR5 for the cache testability functions. Table 11-2 also shows the functionality of the entry and set select bits for each control operation.

The cache tests attempt to use as much of the normal operating circuitry as possible. Therefore, when cache tests are being performed, the cache must be disabled (the CD and NW bits in control register 0 (CR0) must be set to 1 to disable the cache). (See section 7.0, "On-Chip Cache.")

11.2.2 CACHE TESTING REGISTERS FOR THE INTELDX4 PROCESSOR

The cache testing registers for the IntelDX4 processor differ slightly from the other Intel486 processors. TR3 in the IntelDX4 processor is identical to other Intel486 processors. TR4 in the IntelDX4 processor uses bits 31 to 12 for the Tag field, and bit 11 is unused. TR5 uses bits 11 to 4 for the Set Select field. The Test Registers for the IntelDX4 processor are shown in Figure 11-2.

NOTE: Software written for the Intel486 processor for testing the cache using the Test Registers will produce failures due to the changes in the TAG bits and Set Select bits for the IntelDX4 processor. Rewrite the code to take into account the 20 TAG bits and 8 Set Select bits to address the larger cache.

11.2.3 CACHE TESTABILITY WRITE

A testability write to the cache is a two step process. First the cache fill buffer must be loaded with 128 bits of data and TR4 loaded with the tag and valid bit.
Next the contents of the fill buffer are written to a cache location.

Loading the fill buffer is accomplished by first writing to the entry select bits in TR5 and setting the control bits in TR5 to 00. The entry select bits identify one of four 32-bit locations in the cache fill buffer to put 32 bits of data. Following the write to TR5, TR3 is written with 32 bits of data which are immediately placed in the cache fill buffer. Writing to TR3 initiates the write to the cache fill buffer. The cache fill buffer is loaded with 128 bits of data by writing to TR5 and TR3 four times using a different entry select location each time.

Table 11-2. Cache Control Bit Encoding and Effect of Control Bits on Entry Select and Set Select Functionality

Control Bits
Bit 1  Bit 0   Operation                                      Entry Select Bits Function                   Set Select Bits
0      0       Enable: Fill Buffer Write, Read Buffer Read    Select 32-bit location in fill/read buffer   -
0      1       Perform Cache Write                            Select an entry in set                       Select a set to write to
1      0       Perform Cache Read                             Select an entry in set                       Select a set to read from
1      1       Perform Cache Flush                            -                                            -

TR4 must be loaded with the tag and valid bit (bit 10 in TR4) before the contents of the fill buffer are written to a cache location. The IntelDX4 processor has a 20-bit tag in TR4. All other Intel486 processors use a 21-bit tag in TR4.

The contents of the cache fill buffer are written to a cache location by writing TR5 with a control field of 01 along with the set select and entry select fields. The set select and entry select fields indicate the location in the cache to be written. The normal cache LRU update circuitry updates the internal LRU bits for the selected set.

Note that a cache testability write can only be done when the cache is disabled for replaces (the CD bit in control register 0 is set to 1). Care must be taken when directly writing to entries in the cache.
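The TR5 values used in the sequence above can be composed with a small helper. This is a sketch under stated assumptions, not Intel-supplied code: the control field is taken as bits 1-0 and the set select field as starting at bit 4 (7 bits wide, or 8 bits on the IntelDX4 processor), as described in the text, while placing the two entry select bits at bits 3-2 is an assumption inferred from the register figure. The function and enumerator names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Control field encodings from Table 11-2 (TR5 bits 1-0). */
enum tr5_control {
    TR5_CTL_BUFFER = 0x0, /* 00: fill buffer write / read buffer read */
    TR5_CTL_WRITE  = 0x1, /* 01: perform cache write                  */
    TR5_CTL_READ   = 0x2, /* 10: perform cache read                   */
    TR5_CTL_FLUSH  = 0x3  /* 11: perform cache flush                  */
};

/* Pack a TR5 value.
 * set_bits is 7 for most Intel486 processors and 8 for the IntelDX4
 * processor. Entry select at bits 3-2 is an assumed position. */
uint32_t tr5_encode(unsigned set, unsigned entry,
                    enum tr5_control ctl, unsigned set_bits)
{
    assert(set_bits == 7 || set_bits == 8);
    assert(set < (1u << set_bits));
    assert(entry < 4);

    return ((uint32_t)set << 4) | ((uint32_t)entry << 2) | (uint32_t)ctl;
}
```

With this helper, loading the fill buffer corresponds to four pairs of register writes, TR5 with tr5_encode(0, i, TR5_CTL_BUFFER, 7) followed by TR3 with the data for i = 0 to 3, and the final cache write corresponds to TR5 with tr5_encode(set, entry, TR5_CTL_WRITE, 7) after TR4 has been loaded. The moves to the test registers themselves are privileged instructions and are not expressible in portable C.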
If the entry is set to overlap an area of memory that is being used in external memory, that cache entry could inadvertently be used instead of the external memory. This is exactly the type of operation that one would desire if the cache were to be used as a high speed RAM. Also, a memory reference (or any external bus cycle) should not occur in between the move to TR4 and the move to TR5, in order to avoid having the value in TR4 change due to the memory reference.

11.2.4 CACHE TESTABILITY READ

A cache testability read is a two step process. First the contents of the cache location are read into the cache read buffer. Next the data is examined by reading it out of the read buffer.

Reading the contents of a cache location into the cache read buffer is initiated by writing TR5 with the control bits set to 10 and the desired set select and two-bit entry select. The IntelDX4 processor has an eight-bit set select field. All other Intel486 processors have a seven-bit set select field. In response to the write to TR5, TR4 is loaded with the 21-bit tag field and the single valid bit from the cache entry read. TR4 is also loaded with the three LRU bits and four valid bits corresponding to the cache set that was accessed. The cache read buffer is filled with the 128-bit value which was found in the data array at the specified location.

The contents of the read buffer are examined by performing four reads of TR3. Before reading TR3, the entry select bits in TR5 must be loaded to indicate which of the four 32-bit words in the read buffer to transfer into TR3, and the control bits in TR5 must be loaded with 00. The register read of TR3 will initiate the transfer of the 32-bit value from the read buffer to the specified general purpose register.

Note that it is very important that the entire 128-bit quantity from the read buffer and also the information from TR4 be read before any memory references are allowed to occur. If memory operations are allowed to happen, the contents of the read buffer will be corrupted. This is because the testability operations use hardware that is used in normal memory accesses for the Intel486 processor whether the cache is enabled or not.

11.2.5 FLUSH CACHE

The control bits in TR5 must be written with 11 to flush the cache. None of the other bits in TR5 have any meaning when 11 is written to the control bits. Flushing the cache will reset the LRU bits and the valid bits to 0, but will not change the cache tag or data arrays.

When the cache is flushed by writing to TR5, the special bus cycle indicating a cache flush to the external system is not run. (See section 10.2.11, "Special Bus Cycles.") For normal operation, the cache should be flushed with the INVD (Invalidate Data Cache) instruction or the WBINVD (Write-Back and Invalidate Data Cache) instruction.

11.2.6 ADDITIONAL CACHE TESTING FEATURES FOR ENHANCED BUS (WRITE-BACK) MODE

When in Enhanced Bus (write-back) mode, cache testing for the Write-Back Enhanced IntelDX2 processor is a superset of that in Standard Bus (write-through) mode. The additional cache testing features for the Write-Back Enhanced IntelDX2 processor are summarized below:

There are two state bits per cache line (VH and VL) instead of one (V). The assignment of the VH and VL state bits is listed in Table 11-3.

Table 11-3. State Bit Assignments for the Write-Back Enhanced IntelDX2 Processor

State   VH, VL
M       1, 1
E       0, 1
S       1, 0
I       0, 0

When the Write-Back Enhanced IntelDX2 processor is in Standard Bus mode, the VH state assignments are identical to the V state assignments of the IntelDX2 processor, which supports only the S and I states.

TR3 is the same as described above for both Standard and Enhanced Bus modes.

TR4 is the same as described above for the IntelDX2 processor in Standard Bus mode.
However, in Enhanced Bus mode, the cache line state bits of all four lines of the set are no longer available, to avoid a conflicting definition of state bits for the selected entry. The entry's state bits are moved to positions 0 and 1. Bit 10 is reserved for the possible extension of the tag. The changes to TR4 for Enhanced Bus mode are shown in Figure 11-3.

TR5 is the same as it is for the IntelDX2 processor in standard mode. In Enhanced Bus mode, control bit TR5.SLF (bit 13) is added to allow 1,1 of TR5.CTL (bits 0 and 1) to perform two different kinds of cache flushes. When SLF = 0, CTL = 1,1 performs a single-clock invalidate of all lines in the cache, which will not write back M-state lines. If SLF = 1 and the addressed line is in the M state, that line will be written back and invalidated. The state of SLF is significant only when CTL = 1,1. The changes to TR5 for Enhanced Bus mode are shown in Figure 11-4.

Figure 11-3. TR4 Definition for Standard and Enhanced Bus Modes for the Write-Back Enhanced IntelDX2 Processor

Figure 11-4. TR5 Definition for Standard and Enhanced Bus Modes for the Write-Back Enhanced IntelDX2 Processor

11.3 Translation Lookaside Buffer (TLB) Testing

The Intel486 processor TLB testability hooks are similar to those in the Intel386 processor. The testability hooks have been enhanced to provide added test features and to include new features in the Intel486 processor. The TLB testability hooks are designed to be accessible during the BIST and for assembly language testing of the TLB.

11.3.1 TRANSLATION LOOKASIDE BUFFER ORGANIZATION

The Intel486 processor TLB is 4-way set associative and has space for 32 entries. The TLB is logically split into three blocks shown in Figure 11-5.

The data block is physically split into four arrays, each with space for eight entries. An entry in the data block is 22 bits wide, containing a 20-bit physical address and two bits for the page attributes. The page attributes are the PCD (page cache disable) bit and the PWT (page write-through) bit. Refer to section 7.6, "Page Cacheability," for a discussion of the PCD and PWT bits.

The tag block is also split into four arrays, one for each of the data arrays. A tag entry is 21 bits wide, containing a 17-bit linear address and four protection bits. The protection bits are valid (V), user/supervisor (U/S), read/write (R/W) and dirty (D).

The third block contains eight three-bit quantities used in the pseudo least recently used (LRU) replacement algorithm. These bits are called the LRU bits. Unlike the on-chip cache, the TLB will replace a valid line even when there is an invalid line in a set.

Figure 11-5. TLB Organization

11.3.2 TLB TEST REGISTERS TR6 AND TR7

The two TLB test registers are shown in Figure 11-6. TR6 is the command test register and TR7 is the data test register. External access to these registers is provided through MOV reg,TREG and MOV TREG,reg instructions.

Command Test Register: TR6

TR6 contains three bit fields: a 20-bit linear address (bits 12-31), seven bits for the TLB tag protection bits (bits 5-11) and one bit (bit 0) to define the type of operation to be performed on the TLB.

The 20-bit linear address forms the tag information used in the TLB access. The lower three bits of the linear address select which of the eight sets are accessed. The upper 17 bits of the linear address form the tag stored in the tag array.

TR6 contains the tag information and control information used in a TLB test. Loading TR6 with tag and control information initiates a TLB write or lookup test.
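The TR6 field layout described above can be illustrated with a small sketch. This is only an illustration of the bit arithmetic, not processor code: the function names `pack_tr6` and `split_linear_page` are hypothetical, and the seven protection bits are treated as one opaque 7-bit value because their individual positions within bits 5-11 are given by Figure 11-6 rather than by the text.

```python
# Sketch of the TR6 bit layout: linear address in bits 31-12,
# seven tag protection bits in bits 11-5, operation bit in bit 0.
# Names are hypothetical; the protection field is treated as opaque.

TLB_WRITE = 0   # TR6 bit 0 = 0 selects a TLB write
TLB_LOOKUP = 1  # TR6 bit 0 = 1 selects a TLB lookup


def pack_tr6(linear_address: int, protection: int, operation: int) -> int:
    """Build a 32-bit TR6 value from its three fields."""
    if not 0 <= linear_address <= 0xFFFFFFFF:
        raise ValueError("linear address must fit in 32 bits")
    if not 0 <= protection <= 0x7F:
        raise ValueError("protection field is seven bits wide")
    if operation not in (TLB_WRITE, TLB_LOOKUP):
        raise ValueError("operation bit must be 0 or 1")
    page = linear_address >> 12  # the 20-bit linear address field
    return (page << 12) | (protection << 5) | operation


def split_linear_page(linear_address: int) -> tuple:
    """Return (set_index, tag) for the 20-bit linear address field."""
    page = (linear_address >> 12) & 0xFFFFF
    set_index = page & 0x7  # lower three bits select one of eight sets
    tag = page >> 3         # upper 17 bits are stored in the tag array
    return set_index, tag
```

For example, linear address 12345678H falls in set 5 with tag 2468H, which shows why only 17 of the 20 linear address bits need to be stored in the tag array.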
Figure 11-6. TLB Test Registers

The seven TLB tag protection bits are described below.

V:      The valid bit for this TLB entry
D, D#:  The dirty bit for/from the TLB entry
U, U#:  The user/supervisor bit for/from the TLB entry
W, W#:  The read/write bit for/from the TLB entry

Two bits are used to represent the D, U/S and R/W bits in the TLB tag to permit the option of a forced miss or hit during a TLB lookup operation. The forced miss or hit will occur regardless of the state of the actual bit in the TLB. The meaning of these pairs of bits is given in Table 11-4.

The operation bit in TR6 determines if the TLB test operation will be a write or a lookup. The function of the operation bit is given in Table 11-5.

Table 11-4. Meaning of a Pair of TR6 Protection Bits

TR6 Protection  TR6 Protection  Meaning on TLB              Meaning on TLB
Bit (B)         Bit# (B#)       Write Operation             Lookup Operation
0               0               Undefined                   Miss any TLB TAG Bit B
0               1               Write 0 to TLB TAG Bit B    Match TLB TAG Bit B if 0
1               0               Write 1 to TLB TAG Bit B    Match TLB TAG Bit B if 1
1               1               Undefined                   Match any TLB TAG Bit B

Table 11-5. TR6 Operation Bit Encoding

TR6 Bit 0    TLB Operation to Be Performed
0            TLB Write
1            TLB Lookup

Data Test Register: TR7

TR7 contains the information stored or read from the data block during a TLB test operation. Before a TLB test write, TR7 contains the physical address and the page attribute bits to be stored in the entry. After a TLB test lookup hit, TR7 contains the physical address, page attributes, LRU bits and entry location from the access.

TR7 contains a 20-bit physical address (bits 12-31), PCD bit (bit 11), PWT bit (bit 10), and three bits for the LRU bits (bits 7-9). The LRU bits in TR7 are only used during a TLB lookup test. The functionality of TR7 bit 4 differs for TLB writes and lookups. The encoding of bit 4 is defined in Table 11-6 and Table 11-7. Finally, TR7 contains two bits (bits 2-3) to specify a TLB replacement pointer or the location of a TLB hit.

Table 11-6. Encoding of Bit 4 of TR7 on Writes

TR7 Bit 4    Replacement Pointer Used on TLB Write
0            Pseudo-LRU Replacement Pointer
1            Data Test Register Bits 3:2

A replacement pointer is used during a TLB write. The pointer indicates which of the four entries in an accessed set is to be written. The replacement pointer can be specified to be the internal LRU bits or bits 2-3 in TR7. The source of the replacement pointer is specified by TR7 bit 4. The encoding of bit 4 during a write is given by Table 11-6.

Note that both testability writes and lookups affect the state of the internal LRU bits regardless of the replacement pointer used. All TLB write operations (testability or normal operation) cause the written entry to become the most recently used. For example, during a testability write with the replacement pointer specified by TR7 bits 2-3, the indicated entry is written and that entry becomes the most recently used as specified by the internal LRU bits.

There are two TLB testing operations: write entries into the TLB, and perform TLB lookups. One major enhancement over TLB testing in the Intel386 processor is that paging need not be disabled while executing testability writes or lookups. Note that any time one TLB set contains the same linear address in more than one of its entries, looking up that linear address will give unpredictable results. Therefore a single linear address should not be written to one TLB set more than once.

Table 11-7. Encoding of Bit 4 of TR7 on Lookups

TR7 Bit 4    Meaning after TLB Lookup Operation
0            TLB Lookup Resulted in a Miss
1            TLB Lookup Resulted in a Hit

11.3.3 TLB WRITE TEST

To perform a TLB write TR7 must be loaded followed by a TR6 load.
The register operations must be performed in this order because the TLB operation is triggered by the write to TR6.

TR7 is loaded with a 20-bit physical address and values for PCD and PWT to be written to the data portion of the TLB. In addition, bit 4 of TR7 must be loaded to indicate whether to use TR7 bits 3-2 or the internal LRU bits as the replacement pointer on the TLB write operation. Note that the LRU bits in TR7 are not used in a write test.

TR6 must be written to initiate the TLB write operation. Bit 0 in TR6 must be reset to zero to indicate a TLB write. The 20-bit linear address and the seven page protection bits must also be written in TR6 to specify the tag portion of the TLB entry. Note that the three least significant bits of the linear address specify which of the eight sets in the data block will be loaded with the physical address data. Thus only 17 of the linear address bits are stored in the tag array.

11.3.4 TLB LOOKUP TEST

To perform a TLB lookup it is only necessary to write the proper tags and control information into TR6. Bit 0 in TR6 must be set to 1 to indicate a TLB lookup. TR6 must be loaded with a 20-bit linear address and the seven protection bits. To force misses and matches of the individual protection bits on TLB lookups, set the seven protection bits as specified in Table 11-4.

A TLB lookup operation is initiated by the write to TR6. TR7 will indicate the result of the lookup operation following the write to TR6. The hit/miss indication can be found in TR7 bit 4 (see Table 11-7).

TR7 will contain the following information if bit 4 indicated that the lookup test resulted in a hit. Bits 2-3 will indicate in which of the four entries of the set the match occurred. The 22 most significant bits in TR7 will contain the physical address and page attributes contained in the entry. Bits 9-7 will contain the LRU bits associated with the accessed set. The LRU bits are reported as they were before being updated for the current lookup.
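As an illustration of the hit information just described, the following sketch unpacks a TR7 value read back after a lookup. It assumes only the bit positions stated in this section (physical address in bits 31-12, PCD in bit 11, PWT in bit 10, LRU bits in bits 9-7, hit indication in bit 4, hit location in bits 3-2); the function name `decode_tr7_lookup` and the dictionary keys are hypothetical.

```python
# Sketch: unpack the TR7 value observed after a TLB lookup test.
# Bit positions follow the text; names are hypothetical.

def decode_tr7_lookup(tr7: int) -> dict:
    """Return the lookup result fields held in a 32-bit TR7 value."""
    hit = bool((tr7 >> 4) & 0x1)  # bit 4: 1 = hit, 0 = miss
    if not hit:
        # Only the hit/miss indication is meaningful on a miss.
        return {"hit": False}
    return {
        "hit": True,
        "physical_page": (tr7 >> 12) & 0xFFFFF,  # bits 31-12
        "pcd": (tr7 >> 11) & 0x1,                # bit 11
        "pwt": (tr7 >> 10) & 0x1,                # bit 10
        "lru": (tr7 >> 7) & 0x7,                 # bits 9-7, pre-update state
        "entry": (tr7 >> 2) & 0x3,               # bits 3-2, hit location
    }
```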
If bit 4 in TR7 indicated that the lookup test resulted in a miss, the remaining bits in TR7 are undefined.

Again it should be noted that a TLB testability lookup operation affects the state of the LRU bits. The LRU bits will be updated if a hit occurred. The entry which was hit will become the most recently used.

11.4 Tri-State Output Test Mode

The Intel486 processor provides the ability to float all its outputs and bidirectional pins, except for the VOLDET pin in the IntelDX4 processor. This includes all pins floated during bus hold as well as pins which are never floated in normal operation of the chip (HLDA, BREQ, FERR# and PCHK#). When the Intel486 processor is in the tri-state output test mode, external testing can be used to test board connections.

The tri-state test mode is invoked if FLUSH# is sampled active at the falling edge of RESET. FLUSH# is an asynchronous signal. When driven, FLUSH# should be asserted for 2 clocks before and 2 clocks after RESET is de-asserted. If FLUSH# is driven synchronously, the tri-state output test mode is initiated by driving FLUSH# so that it is sampled active in the clock prior to RESET going low and ensuring that specified setup and hold times are met. The outputs are guaranteed to tri-state no later than 10 clocks after RESET goes low (see Figure 9.6). The Intel486 processor remains in the tri-state test mode until the next RESET.

11.5 Intel486 Processor Boundary Scan (JTAG)

The Intel486 processor provides additional testability features compatible with the IEEE Standard Test Access Port and Boundary Scan Architecture (IEEE Std. 1149.1). (Note that the Intel486 SX processor in PGA package does not have JTAG capability.) The test logic provided allows for testing to ensure that components function correctly, that interconnections between various components are correct, and that various components interact correctly on the printed circuit board.
The boundary scan test logic consists of a boundary scan register and support logic that are accessed through a test access port (TAP). The TAP provides a simple serial interface that makes it possible to test all signal traces with only a few probes.

The TAP can be controlled via a bus master. The bus master can be either automatic test equipment or a component (PLD) that interfaces to the four-pin test bus.

11.5.1 BOUNDARY SCAN ARCHITECTURE

The boundary scan test logic contains the following elements:

• Test access port (TAP), consisting of input pins TMS, TCK, and TDI; and output pin TDO.
• TAP controller, which interprets the inputs on the test mode select (TMS) line and performs the corresponding operation. The operations performed by the TAP include controlling the instruction and data registers within the component.
• Instruction register (IR), which accepts instruction codes shifted into the test logic on the test data input (TDI) pin. The instruction codes are used to select the specific test operation to be performed or the test data register to be accessed.
• Test data registers: The Intel486 processor contains three test data registers: Bypass register (BPR), Device Identification register (DID), and Boundary Scan register (BSR).

The instruction and test data registers are separate shift-register paths connected in parallel and have a common serial data input and a common serial data output connected to the TAP signals, TDI and TDO, respectively.

11.5.2 DATA REGISTERS

The Intel486 processor contains the two required test data registers: the bypass register and the boundary scan register. In addition, it has a device identification register.

Each test data register is serially connected to TDI and TDO, with TDI connected to the most significant bit and TDO connected to the least significant bit of the test data register.
Data is shifted one stage (bit position within the register) on each rising edge of the test clock (TCK).

In addition the Intel486 processor contains a runbist register to support the RUNBIST boundary scan instruction.

11.5.2.1 Bypass Register

The Bypass Register is a one-bit shift register that provides the minimal length path between TDI and TDO. This path can be selected when no test operation is being performed by the component to allow rapid movement of test data to and from other components on the board. While the bypass register is selected, data is transferred from TDI to TDO without inversion.

11.5.2.2 Boundary Scan Register

The Boundary Scan Register is a single shift register path containing the boundary scan cells that are connected to all input and output pins of the Intel486 processor. Figure 11-7 shows the logical structure of the boundary scan register. While output cells determine the value of the signal driven on the corresponding pin, input cells only capture data; they do not affect the normal operation of the device. Data is transferred without inversion from TDI to TDO through the boundary scan register during scanning. The boundary scan register can be operated by the EXTEST and SAMPLE instructions. The boundary scan register order is described in section 11.5.5, "Boundary Scan Register Bits and Bit Orders."

Figure 11-7. Logical Structure of Boundary Scan Register

11.5.2.3 Device Identification Register

The Device Identification Register contains the manufacturer's identification code, part number code, and version code. Table 11-8 lists the codes corresponding to the Intel486 processor.
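The 32-bit identification code can be split into the fields used as column headings in Table 11-8. The following is a minimal sketch of that decoding, assuming the field widths shown in the table (4-bit version, 1-bit VCC, 6-bit Intel architecture type, 4-bit family, 5-bit model, 11-bit manufacturer ID, and a final bit that is always 1). The function name `decode_idcode` and the dictionary keys are hypothetical, and the version field is shown as 0 in the example only because Table 11-8 leaves it unspecified.

```python
# Sketch: split a 32-bit boundary scan ID into the Table 11-8 fields.
# Field widths come from the table; names are hypothetical.

def decode_idcode(idcode: int) -> dict:
    """Return the device identification fields of a 32-bit ID code."""
    return {
        "version": (idcode >> 28) & 0xF,           # bits 31-28
        "vcc_3v3": bool((idcode >> 27) & 0x1),     # bit 27: 1 = 3.3V, 0 = 5V
        "intel_arch_type": (idcode >> 21) & 0x3F,  # bits 26-21
        "family": (idcode >> 17) & 0xF,            # bits 20-17
        "model": (idcode >> 12) & 0x1F,            # bits 16-12
        "mfg_id": (idcode >> 1) & 0x7FF,           # bits 11-1, Intel = 009H
        "first_bit": idcode & 0x1,                 # bit 0
    }


# Example: the IntelDX4 processor entry, with the version field taken as 0.
fields = decode_idcode(0x08288013)
```

With this input the sketch reports a 3.3V part with architecture type 000001, family 0100, model 01000 and manufacturer ID 009H, matching the IntelDX4 row of the table.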
11.5.2.4 Runbist Register

The Runbist Register is a one-bit register used to report the results of the Intel486 processor BIST when it is initiated by the RUNBIST instruction. This register is loaded with a "1" prior to invoking the BIST and is loaded with "0" upon successful completion.

11.5.3 INSTRUCTION REGISTER

The Instruction Register (IR) allows instructions to be serially shifted into the device. The instruction selects the particular test to be performed, the test data register to be accessed, or both. The instruction register is four (4) bits wide. The most significant bit is connected to TDI and the least significant bit is connected to TDO. There are no parity bits associated with the Instruction register. Upon entering the Capture-IR TAP controller state, the Instruction register is loaded with the default instruction "0001," SAMPLE/PRELOAD. Instructions are shifted into the instruction register on the rising edge of TCK while the TAP controller is in the Shift-IR state.

11.5.3.1 Boundary Scan Instruction Set

The Intel486 processor supports all three mandatory boundary scan instructions (BYPASS, SAMPLE/PRELOAD, and EXTEST) along with two optional instructions (IDCODE and RUNBIST). Table 11-9 lists the Intel486 processor boundary scan instruction codes. The instructions listed as PRIVATE cause TDO to become enabled in the Shift-DR state and cause "0" to be shifted out of TDO on the rising edge of TCK. Execution of the PRIVATE instructions will not cause hazardous operation of the Intel486 processor.

EXTEST
The instruction code is "0000." The EXTEST instruction allows testing of circuitry external to the component package, typically board interconnects. It does so by driving the values loaded into the Intel486 processor's boundary scan register out on the output pins corresponding to each boundary scan cell and capturing the values on Intel486 processor input pins to be loaded into their corresponding boundary scan register locations.
I/O pins are selected as input or output, depending on the value loaded into their control setting locations in the boundary scan register. Values shifted into input latches in the boundary scan register are never used by the internal logic of the Intel486 processor.

NOTE: After using the EXTEST instruction, the Intel486 processor must be reset before normal (non-boundary scan) use.

Table 11-8. Boundary Scan Component Identification Codes

VCC: 1 = 3.3V, 0 = 5V. Arch Type: Intel Architecture Type. MFG ID: Intel = 009H.

Processor Type                                 Version  VCC  Arch Type  Family  Model  MFG ID       1st Bit  Boundary Scan ID (Hex)
Intel486 SX processor (3.3V)                   xxxx*    1    000001     0100    00010  00000001001  1        x8282013H
Intel486 SX processor (3.3V, 2X CLK)           xxxx*    1    000001     0100    00010  00000001001  1        x8282013H
Intel486 SX processor (5V)                     xxxx*    0    000001     0100    00010  00000001001  1        x0282013H
Intel486 SX processor (5V, 2X CLK)             xxxx*    0    000001     0100    00010  00000001001  1        x0282013H
IntelSX2 processor                             xxxx*    0    000001     0100    00101  00000001001  1        x0286013H
Intel486 DX processor (3.3V)                   xxxx*    1    000001     0100    00001  00000001001  1        x8281013H
Intel486 DX processor (3.3V, 2X CLK)           xxxx*    1    000001     0100    00001  00000001001  1        x8281013H
Intel486 DX processor (5V)                     xxxx*    0    000001     0100    00001  00000001001  1        x0281013H
Intel486 DX processor (5V, 2X CLK)             xxxx*    0    000001     0100    00001  00000001001  1        x0281013H
IntelDX2 processor (3.3V)                      xxxx*    1    000001     0100    00101  00000001001  1        x8285013H
IntelDX2 processor (5V)                        xxxx*    0    000001     0100    00101  00000001001  1        x0285013H
Write-Back Enhanced IntelDX2 processor (3.3V)  xxxx*    1    000001     0100    00111  00000001001  1        x8287013H
Write-Back Enhanced IntelDX2 processor (5V)    xxxx*    0    000001     0100    00111  00000001001  1        x0287013H
IntelDX4 processor (3.3V)                      xxxx*    1    000001     0100    01000  00000001001  1        x8288013H

NOTE: *Contact Intel for details

Table 11-9. Boundary Scan Instruction Codes

Instruction Code    Instruction Name
0000                EXTEST
0001                SAMPLE
0010                IDCODE
0011                PRIVATE
0100                PRIVATE
0101                PRIVATE
0110                PRIVATE
0111                PRIVATE
1000                RUNBIST
1001                PRIVATE
1010                PRIVATE
1011                PRIVATE
1100                PRIVATE
1101                PRIVATE
1110                PRIVATE
1111                BYPASS

SAMPLE/PRELOAD
The instruction code is "0001." The SAMPLE/PRELOAD instruction performs two functions. When the TAP controller is in the Capture-DR state, the SAMPLE/PRELOAD instruction allows a "snap-shot" of the normal operation of the component without interfering with that normal operation. The instruction causes boundary scan register cells associated with outputs to sample the value being driven by the Intel486 processor. It causes the cells associated with inputs to sample the value being driven into the Intel486 processor. On both outputs and inputs the sampling occurs on the rising edge of TCK. When the TAP controller is in the Update-DR state, the SAMPLE/PRELOAD instruction preloads data to the device pins to be driven to the board by executing the EXTEST instruction. Data is preloaded to the pins from the boundary scan register on the falling edge of TCK.

IDCODE
The instruction code is "0010." The IDCODE instruction selects the device identification register to be connected to TDI and TDO, allowing the device identification code to be shifted out of the device on TDO. Note that the device identification register is not altered by data being shifted in on TDI.

BYPASS
The instruction code is "1111." The BYPASS instruction selects the bypass register to be connected to TDI and TDO, effectively bypassing the test logic on the Intel486 processor by reducing the shift length of the device to one bit.
Note that an open circuit fault in the board level test data path will cause the bypass register to be selected following an instruction scan cycle due to the pull-up resistor on the TDI input. This has been done to prevent any unwanted interference with the proper operation of the system logic.

RUNBIST
The instruction code is "1000." The RUNBIST instruction selects the one (1) bit runbist register, loads a value of "1" into the runbist register, and connects it to TDO. It also initiates the built-in self test (BIST) feature of the Intel486 processor, which is able to detect approximately 60% of the stuck-at faults on the Intel486 processor. The Intel486 processor ac/dc specifications for VCC and CLK must be met and RESET must have been asserted at least once prior to executing the RUNBIST boundary scan instruction. After loading the RUNBIST instruction code in the instruction register, the TAP controller must be placed in the Run-Test/Idle state. BIST begins on the first rising edge of TCK after entering the Run-Test/Idle state. The TAP controller must remain in the Run-Test/Idle state until BIST is completed. It requires 1.2 million clock (CLK) cycles to complete BIST and report the result to the runbist register. After completing the 1.2 million clock (CLK) cycles, the value in the runbist register should be shifted out on TDO during the Shift-DR state. A value of "0" being shifted out on TDO indicates BIST successfully completed. A value of "1" indicates a failure occurred. After executing the RUNBIST instruction, the Intel486 processor must be reset prior to normal operation.

11.5.4 TEST ACCESS PORT (TAP) CONTROLLER

The TAP controller is a synchronous, finite state machine. It controls the sequence of operations of the test logic. The TAP controller changes state only in response to the following events:

1. a rising edge of TCK
2. power-up.
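The state transitions detailed in the subsections of 11.5.4 can be modeled as a lookup table indexed by the TMS value sampled on each rising edge of TCK. The sketch below is a software model for planning TMS sequences, not a description of the processor's internal logic. The names `TAP_NEXT`, `tap_step` and `tap_run` are hypothetical. The exits from Test-Logic-Reset on TMS low and from Update-DR and Update-IR are not spelled out in this section; they are taken here from IEEE Std. 1149.1, with which the TAP is stated to be compatible.

```python
# Sketch: TAP controller next-state table.
# Each entry maps a state to (next state if TMS = 0, next state if TMS = 1).
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan": ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR": ("Shift-DR", "Exit1-DR"),
    "Shift-DR": ("Shift-DR", "Exit1-DR"),
    "Exit1-DR": ("Pause-DR", "Update-DR"),
    "Pause-DR": ("Pause-DR", "Exit2-DR"),
    "Exit2-DR": ("Shift-DR", "Update-DR"),
    "Update-DR": ("Run-Test/Idle", "Select-DR-Scan"),   # per IEEE 1149.1
    "Select-IR-Scan": ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR": ("Shift-IR", "Exit1-IR"),
    "Shift-IR": ("Shift-IR", "Exit1-IR"),
    "Exit1-IR": ("Pause-IR", "Update-IR"),
    "Pause-IR": ("Pause-IR", "Exit2-IR"),
    "Exit2-IR": ("Shift-IR", "Update-IR"),
    "Update-IR": ("Run-Test/Idle", "Select-DR-Scan"),   # per IEEE 1149.1
}


def tap_step(state: str, tms: int) -> str:
    """Return the state entered on one rising edge of TCK."""
    return TAP_NEXT[state][1 if tms else 0]


def tap_run(state: str, tms_bits) -> str:
    """Apply a sequence of TMS values, one per rising edge of TCK."""
    for tms in tms_bits:
        state = tap_step(state, tms)
    return state
```

Applying `tap_run(state, [1] * 5)` from any of the sixteen states ends in Test-Logic-Reset, which is the property that lets a tester recover a known state without knowing where the controller started.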
The value of the test mode select (TMS) input signal at a rising edge of TCK controls the sequence of the state changes. The state diagram for the TAP controller is shown in Figure 11-8. Test designers must consider the operation of the state machine in order to design the correct sequence of values to drive on TMS.

Figure 11-8. TAP Controller State Diagram

11.5.4.1 Test-Logic-Reset State

In this state, the test logic is disabled so that normal operation of the device can continue unhindered. This is achieved by initializing the instruction register such that the IDCODE instruction is loaded. No matter what the original state of the controller, the controller enters Test-Logic-Reset state when the TMS input is held high (1) for at least five rising edges of TCK. The controller remains in this state while TMS is high. The TAP controller is also forced to enter this state at power-up.

11.5.4.2 Run-Test/Idle State

This is a controller state between scan operations. Once in this state, the controller remains in this state as long as TMS is held low. In devices supporting the RUNBIST instruction, the BIST is performed during this state and the result is reported in the runbist register. For instructions not causing functions to execute during this state, no activity occurs in the test logic. The instruction register and all test data registers retain their previous state. When TMS is high and a rising edge is applied to TCK, the controller moves to the Select-DR-Scan state.

11.5.4.3 Select-DR-Scan State

This is a temporary controller state. The test data register selected by the current instruction retains its previous state. If TMS is held low and a rising edge is applied to TCK when in this state, the controller moves into the Capture-DR state, and a scan sequence for the selected test data register is initiated. If TMS is held high and a rising edge is applied to TCK, the controller moves to the Select-IR-Scan state.

The instruction does not change in this state.

11.5.4.4 Capture-DR State

In this state, the boundary scan register captures input pin data if the current instruction is EXTEST or SAMPLE/PRELOAD. The other test data registers, which do not have parallel input, are not changed.

The instruction does not change in this state.

When the TAP controller is in this state and a rising edge is applied to TCK, the controller enters the Exit1-DR state if TMS is high or the Shift-DR state if TMS is low.

11.5.4.5 Shift-DR State

In this controller state, the test data register connected between TDI and TDO as a result of the current instruction shifts data one stage toward its serial output on each rising edge of TCK.

The instruction does not change in this state.

When the TAP controller is in this state and a rising edge is applied to TCK, the controller enters the Exit1-DR state if TMS is high or remains in the Shift-DR state if TMS is low.

11.5.4.6 Exit1-DR State

This is a temporary state. While in this state, if TMS is held high, a rising edge applied to TCK causes the controller to enter the Update-DR state, which terminates the scanning process. If TMS is held low and a rising edge is applied to TCK, the controller enters the Pause-DR state.

The test data register selected by the current instruction retains its previous value during this state. The instruction does not change in this state.

11.5.4.7 Pause-DR State

The pause state allows the test controller to temporarily halt the shifting of data through the test data register in the serial path between TDI and TDO. An example of using this state could be to allow a tester to reload its pin memory from disk during application of a long test sequence.

The test data register selected by the current instruction retains its previous value during this state. The instruction does not change in this state.

The controller remains in this state as long as TMS is low. When TMS goes high and a rising edge is applied to TCK, the controller moves to the Exit2-DR state.

11.5.4.8 Exit2-DR State

This is a temporary state. While in this state, if TMS is held high, a rising edge applied to TCK causes the controller to enter the Update-DR state, which terminates the scanning process. If TMS is held low and a rising edge is applied to TCK, the controller enters the Shift-DR state.

The test data register selected by the current instruction retains its previous value during this state. The instruction does not change in this state.

11.5.4.9 Update-DR State

The boundary scan register is provided with a latched parallel output to prevent changes at the parallel output while data is shifted in response to the EXTEST and SAMPLE/PRELOAD instructions. When the TAP controller is in this state and the boundary scan register is selected, data is latched onto the parallel output of this register from the shift-register path on the falling edge of TCK. The data held at the latched parallel output does not change other than in this state.

All test data registers selected by the current instruction retain their previous value during this state. The instruction does not change in this state.

11.5.4.10 Select-IR-Scan State

This is a temporary controller state. The test data register selected by the current instruction retains its previous value. If TMS is held low and a rising edge is applied to TCK when in this state, the controller moves into the Capture-IR state, and a scan sequence for the instruction register is initiated. If TMS is held high and a rising edge is applied to TCK, the controller moves to the Test-Logic-Reset state. The instruction does not change in this state.

11.5.4.11 Capture-IR State

In this controller state the shift register contained in the instruction register loads the fixed value "0001" on the rising edge of TCK.

The test data register selected by the current instruction retains its previous value during this state. The instruction does not change in this state. When the controller is in this state and a rising edge is applied to TCK, the controller enters the Exit1-IR state if TMS is held high, or the Shift-IR state if TMS is held low.

11.5.4.12 Shift-IR State

In this state the shift register contained in the instruction register is connected between TDI and TDO and shifts data one stage towards its serial output on each rising edge of TCK.

The test data register selected by the current instruction retains its previous value during this state. The instruction does not change in this state.

When the controller is in this state and a rising edge is applied to TCK, the controller enters the Exit1-IR state if TMS is held high, or remains in the Shift-IR state if TMS is held low.

11.5.4.13 Exit1-IR State

This is a temporary state. While in this state, if TMS is held high, a rising edge applied to TCK causes the controller to enter the Update-IR state, which terminates the scanning process. If TMS is held low and a rising edge is applied to TCK, the controller enters the Pause-IR state.

The test data register selected by the current instruction retains its previous value during this state. The instruction does not change in this state.

11.5.4.14 Pause-IR State

The pause state allows the test controller to temporarily halt the shifting of data through the instruction register.

The test data register selected by the current instruction retains its previous value during this state. The instruction does not change in this state.

The controller remains in this state as long as TMS is low. When TMS goes high and a rising edge is applied to TCK, the controller moves to the Exit2-IR state.

11.5.4.15 Exit2-IR State

This is a temporary state. While in this state, if TMS is held high, a rising edge applied to TCK causes the controller to enter the Update-IR state, which terminates the scanning process. If TMS is held low and a rising edge is applied to TCK, the controller enters the Shift-IR state.

The test data register selected by the current instruction retains its previous value during this state. The instruction does not change in this state.

11.5.4.16 Update-IR State

The instruction shifted into the instruction register is latched onto the parallel output from the shift-register path on the falling edge of TCK. Once the new instruction has been latched, it becomes the current instruction. Test data registers selected by the new current instruction retain the previous value.

11.5.5 BOUNDARY SCAN REGISTER BITS AND BIT ORDERS

The boundary scan register contains a cell for each pin, as well as cells for control of I/O and tri-state pins.

Intel486 DX and IntelDX2 Processor Boundary Scan Register Bits

The following is the bit order of the Intel486 DX and IntelDX2 processor boundary scan register (from left to right and top to bottom. See notes below):

TDO ← A2, A3, A4, A5, UP#, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, A22, A23, A24, A25, A26, A27, A28, A29, A30, A31, DP0, D0, D1, D2, D3, D4, D5, D6, D7, DP1, D8, D9, D10, D11, D12, D13, D14, D15, DP2, D16, D17, D18, D19, D20, D21, D22, D23, DP3, D24, D25, D26, D27, D28, D29, D30, D31, STPCLK#, IGNNE#, FERR#, SMI#, SMIACT#, SRESET, NMI, INTR, FLUSH#, RESET, A20M#, EADS#, PCD, PWT, D/C#, M/IO#, BE3#, BE2#, BE1#, BE0#, BREQ, W/R#, HLDA, CLK, Reserved, AHOLD, HOLD, KEN#, RDY#, BS8#, BS16#, BOFF#, BRDY#, PCHK#, LOCK#, PLOCK#, BLAST#, ADS#, MISCCTL, BUSCTL, ABUSCTL, WRCTL ← TDI

Intel486 SX and IntelSX2 Processor Boundary Scan Register Bits

The following is the bit order of the Intel486 SX and IntelSX2 processor boundary scan register (from left to right and top to bottom. See notes below):

TDO ← A2, A3, A4, A5, UP#, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, A22, A23, A24, A25, A26, A27, A28, A29, A30, A31, DP0, D0, D1, D2, D3, D4, D5, D6, D7, DP1, D8, D9, D10, D11, D12, D13, D14, D15, DP2, D16, D17, D18, D19, D20, D21, D22, D23, DP3, D24, D25, D26, D27, D28, D29, D30, D31, STPCLK#, Reserved, Reserved, SMI#, SMIACT#, SRESET, NMI, INTR, FLUSH#, RESET, A20M#, EADS#, PCD, PWT, D/C#, M/IO#, BE3#, BE2#, BE1#, BE0#, BREQ, W/R#, HLDA, CLK, Reserved, AHOLD, HOLD, KEN#, RDY#, BS8#, BS16#, BOFF#, BRDY#, PCHK#, LOCK#, PLOCK#, BLAST#, ADS#, MISCCTL, BUSCTL, ABUSCTL, WRCTL ← TDI

50-MHz Intel486 DX Processor Boundary Scan Register Bits

The following is the bit order of the 50-MHz Intel486 DX processor boundary scan register (from left to right and top to bottom. See notes below):

TDO ← A2, A3, A4, A5, UP#, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, A22, A23, A24, A25, A26, A27, A28, A29, A30, A31, DP0, D0, D1, D2, D3, D4, D5, D6, D7, DP1, D8, D9, D10, D11, D12, D13, D14, D15, DP2, D16, D17, D18, D19, D20, D21, D22, D23, DP3, D24, D25, D26, D27, D28, D29, D30, D31, IGNNE#, FERR#, NMI, INTR, FLUSH#, RESET, A20M#, EADS#, PCD, PWT, D/C#, M/IO#, BE3#, BE2#, BE1#, BE0#, BREQ, W/R#, HLDA, CLK, Reserved, AHOLD, HOLD, KEN#, RDY#, BS8#, BS16#, BOFF#, BRDY#, PCHK#, LOCK#, PLOCK#, BLAST#, ADS#, MISCCTL, BUSCTL, ABUSCTL, WRCTL ← TDI

Write-Back Enhanced IntelDX2 Processor Boundary Scan Register Bits

The following is the bit order of the Write-Back Enhanced IntelDX2 processor boundary scan register (from left to right and top to bottom. See notes below):

TDO ← A2, A3, A4, A5, UP#, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, A22, A23, A24, A25, A26, A27, A28, A29, A30, A31, DP0, D0, D1, D2, D3, D4, D5, D6, D7, DP1, D8, D9, D10, D11, D12, D13, D14, D15, DP2, D16, D17, D18, D19, D20, D21, D22, D23, DP3, D24, D25, D26, D27, D28, D29, D30, D31, STPCLK#, IGNNE#, INV, CACHE#, FERR#, SMI#, WB/WT#, HITM#, SMIACT#, SRESET, NMI, INTR, FLUSH#, RESET, A20M#, EADS#, PCD, PWT, D/C#, M/IO#, BE3#, BE2#, BE1#, BE0#, BREQ, W/R#, HLDA, CLK, Reserved, AHOLD, HOLD, KEN#, RDY#, BS8#, BS16#, BOFF#, BRDY#, PCHK#, LOCK#, BLAST#, ADS#, MISCCTL, PLOCK#, BUSCTL, ABUSCTL, WRCTL ← TDI

IntelDX4 Processor Boundary Scan Register Bits

The following is the bit order of the IntelDX4 processor boundary scan register (from left to right and top to bottom. See notes below):

TDO ← A2, A3, A4, A5, UP#, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, A22, A23, A24, A25, A26, A27, A28, A29, A30, A31, DP0, D0, D1, D2, D3, D4, D5, D6, D7, DP1, D8, D9, D10, D11, D12, D13, D14, D15, DP2, D16, D17, D18, D19, D20, D21, D22, D23, DP3, D24, D25, D26, D27, D28, D29, D30, D31, STPCLK#, IGNNE#, FERR#, SMI#, SMIACT#, SRESET, NMI, INTR, FLUSH#, RESET, A20M#, EADS#, PCD, PWT, D/C#, M/IO#, BE3#, BE2#, BE1#, BE0#, BREQ, W/R#, HLDA, CLK, AHOLD, HOLD, KEN#, RDY#, CLKMUL, BS8#, BS16#, BOFF#, BRDY#, PCHK#, LOCK#, PLOCK#, BLAST#, ADS#, MISCCTL, BUSCTL, ABUSCTL, WRCTL ← TDI

NOTES:
"Reserved" corresponds to no connect "NC" or "INC" signals on the Intel486 processor.

All the *CTL cells are control cells that are used to select the direction of bidirectional pins or tri-state output pins. If "1" is loaded into the control cell (*CTL), the associated pin(s) are tri-stated or selected as input. The following lists the control cells and their corresponding pins.

1. WRCTL controls the D31-D0 and DP3-DP0 pins.
2. ABUSCTL controls the A31-A2 pins.
3.
BUSCTL controls the ADS #, BLAST # , PLOCK#, LOCK#, WR#, BEO#, BE1 #, BE2#, BE3#, MIO#, DC#, PWT, and PCD pins. 4. MISCCTL controls the PCHK#, HLDA, and BREQ pins. 11.5.6 TAP CONTROLLER INITIALIZATION The TAP controller is automatically initialized when a device is powered up. In addition, the TAP controller can be initialized by applying a high signal level on the TMS input for five TCK periods. 11.5.7 BOUNDARY SCAN DESCRIPTION LANGUAGE (BSDL) FILES See Appendix D for an example of a BSDL file for Intel486 processors. I Intel486™ PROCESSOR FAMILV 12.0 DEBUGGING SUPPORT 12.3 Debug Registers The Intel486 processor provides several features that simplify the debugging process. The three categories of on-chip debugging aids are: 1. Code execution breakpoint opcode (OCCH), 2. Single-step capability provided by the TF bit in the flag register, and 3. Code and data breakpoint capability provided by the Debug Registers DRO-3, DR6, and DR7. The Debug Registers are an advanced debugging feature of the Intel486 processor. They allow data access breakpOints as well as code execution breakpoints. Because the breakpoints are indicated by on-chip registers, an instruction execution breakpoint can be placed in ROM code or in code shared by several tasks, neither of which can be supported by the INT3 breakpoint opcode. 12.1 BreakpOint Instruction A single-byte-opcode breakpoint instruction is available for use by software debuggers. The breakpoint opcode is OCCH, and generates an exception 3 trap when executed. In typical use, a debugger program can "plant" the breakpoint instruction at all desired code execution breakpoints. The single-byte breakpoint opcode is an alias for the two-byte general software interrupt instruction, INT n, where n = 3. The only difference between INT 3 (OCCh) and INT n is that INT 3 is never IOPL-sensitive, while INT n is IOPL-sensitive in Protected Mode and Virtual 8086 Mode. 
12.2 Single-Step Trap If the single-step flag (TF, bit 8) in the EFLAG register is found to be set at the end of an instruction, a single-step exception occurs. The single-step exception is auto vectored to exception number 1. Precisely, exception 1 occurs as a trap after the instruction following the instruction which set TF. In typical practice, a debugger sets the TF bit of a flag register image on the debugger's stack. It then typically transfers control to the user program and loads the flag image with a signal instruction, the IRET instruction. The single-step trap occurs after executing one instruction of the user program. Because exception 1 occurs as a trap (that is, it occurs after the instruction has already executed), the CS:EIP pushed onto the debugger's stack points to the next unexecuted instruction of the program being debugged. An exception 1 handler, merely by ending with an IRET instruction, can therefore efficiently support single-stepping through a user program. I The Intel486 processor contains six Debug Registers, providing the ability to specify up to four distinct breakpoints addresses, breakpoint control options, and read breakpoint status. Initially after reset, breakpoints are in the disabled state. Therefore, no breakpoints will occur unless the debug registers are programmed. Breakpoints set up in the Debug Registers are auto vectored to exception number 1. 12.3.1 LINEAR ADDRESS BREAKPOINT REGISTERS (DRO-DR3) Up to four breakpoint addresses can be specified by writing into Debug Registers DRO-DR3, shown in Figure 12-1. The breakpOint addresses specified are 32-bit linear addresses. Intel486 processor hardware continuously compares the linear breakpoint addresses in DRO-DR3 with the linear addresses generated by executing software (a linear address is the result of computing the effective address and adding the 32-bit segment base address). Note that if paging is not enabled the linear address equals the physical address. 
If paging is enabled, the linear address is translated to a physical 32-bit address by the on-chip paging unit. Regardless of whether paging is enabled or not, however, the breakpoint registers hold linear addresses.

12.3.2 DEBUG CONTROL REGISTER (DR7)

A Debug Control Register, DR7 shown in Figure 12-1, allows several debug control functions such as enabling the breakpoints and setting up other control options for the breakpoints.

[Figure 12-1. Debug Registers: Breakpoint 0-3 Linear Address (DR0-DR3), DR4, DR5, DR6, and DR7, each 32 bits wide. Shaded fields indicate Intel Reserved; do not define. NOTE: See section 4.2.7.]

The fields within the Debug Control Register, DR7, are as follows:

LENi (breakpoint length specification bits)

A 2-bit LEN field exists for each of the four breakpoints. LEN specifies the length of the associated breakpoint field. The choices for data breakpoints are: 1 byte, 2 bytes, and 4 bytes. Instruction execution breakpoints must have a length of 1 (LENi = 00). Encoding of the LENi field is as described in Table 12-1.

Table 12-1. LENi Encoding

LENi Encoding   Breakpoint Field Width   Usage of Least Significant Bits in Breakpoint Address Register i (i = 0-3)
00              1 byte                   All 32 bits used to specify a single-byte breakpoint field.
01              2 bytes                  A1-A31 used to specify a two-byte, word-aligned breakpoint field. A0 in Breakpoint Address Register is not used.
10              Undefined                Do not use this encoding.
11              4 bytes                  A2-A31 used to specify a four-byte, dword-aligned breakpoint field. A0 and A1 in Breakpoint Address Register are not used.

The LENi field controls the size of breakpoint field i by controlling whether all low-order linear address bits in the breakpoint address register are used to detect the breakpoint event. Therefore, all breakpoint fields are aligned; 2-byte breakpoint fields begin on Word boundaries, and 4-byte breakpoint fields begin on Dword boundaries.

Figure 12-2 is an example of various size breakpoint fields. Assume the breakpoint linear address in DR2 is 00000005H. In that situation, Figure 12-2 indicates the region of the breakpoint field for lengths of 1, 2, or 4 bytes.

[Figure 12-2. Size Breakpoint Fields: three panels for DR2 = 00000005H with LEN2 = 00B, LEN2 = 01B, and LEN2 = 11B, each marking breakpoint field 2 within the dwords at 00000000H, 00000004H, and 00000008H.]

RWi (memory access qualifier bits)

A 2-bit RW field exists for each of the four breakpoints. The 2-bit RW field specifies the type of usage which must occur in order to activate the associated breakpoint.

Table 12-2. RW Encoding

RW Encoding   Usage Causing Breakpoint
00            Instruction execution only
01            Data writes only
10            Undefined; do not use this encoding
11            Data reads and writes only

RW encoding 00 is used to set up an instruction execution breakpoint. RW encodings 01 or 11 are used to set up write-only or read/write data breakpoints.

Note that instruction execution breakpoints are taken as faults (i.e., before the instruction executes), but data breakpoints are taken as traps (i.e., after the data transfer takes place).

Using LENi and RWi to Set Data Breakpoint i

A data breakpoint can be set up by writing the linear address into DRi (i = 0-3). For data breakpoints, RWi can = 01 (write-only) or 11 (write/read). LEN can = 00, 01, or 11.

If a data access entirely or partly falls within the data breakpoint field, the data breakpoint condition has occurred, and if the breakpoint is enabled, an exception 1 trap will occur.

Using LENi and RWi to Set Instruction Execution Breakpoint i

An instruction execution breakpoint can be set up by writing the address of the beginning of the instruction (including prefixes if any) into DRi (i = 0-3). RWi must = 00 and LEN must = 00 for instruction execution breakpoints.

If the instruction beginning at the breakpoint address is about to be executed, the instruction execution breakpoint condition has occurred, and if the breakpoint is enabled, an exception 1 fault will occur before the instruction is executed.

Note that an instruction execution breakpoint address must be equal to the beginning byte address of an instruction (including prefixes) in order for the instruction execution breakpoint to occur.

GD (Global Debug Register access detect)

The Debug Registers can only be accessed in Real Mode or at privilege level 0 in Protected Mode. The GD bit, when set, provides extra protection against any Debug Register access even in Real Mode or at privilege level 0 in Protected Mode. This additional protection feature is provided to guarantee that a software debugger can have full control over the Debug Register resources when required. The GD bit, when set, causes an exception 1 fault if an instruction attempts to read or write any Debug Register. The GD bit is then automatically cleared when the exception 1 handler is invoked, allowing the exception 1 handler free access to the debug registers.

GE and LE (Exact data breakpoint match, global and local)

The breakpoint mechanism of the Intel486 processor differs from that of the Intel386 processor. The Intel486 processor always does exact data breakpoint matching, regardless of GE/LE bit settings. Any data breakpoint trap will be reported exactly after completion of the instruction that caused the operand transfer. Exact reporting is provided by forcing the Intel486 processor execution unit to wait for completion of data operand transfers before beginning execution of the next instruction.

When the Intel486 processor performs a task switch, the LE bit is cleared. Thus, the LE bit supports fast task switching out of tasks that have enabled the exact data breakpoint match for their task-local breakpoints. The LE bit is cleared by the Intel486 processor during a task switch, to avoid having exact data breakpoint match enabled in the new task. Note that exact data breakpoint match must be re-enabled under software control.

The Intel486 processor GE bit is unaffected during a task switch. The GE bit supports exact data breakpoint match that is to remain enabled during all tasks executing in the system.

Note that instruction execution breakpoints are always reported exactly.

Gi and Li (breakpoint enable, global and local)

If either Gi or Li is set then the associated breakpoint (as defined by the linear address in DRi, the length in LENi and the usage criteria in RWi) is enabled. If either Gi or Li is set, and the Intel486 processor detects the ith breakpoint condition, then the exception 1 handler is invoked.

When the Intel486 processor performs a task switch to a new Task State Segment (TSS), all Li bits are cleared. Thus, the Li bits support fast task switching out of tasks that use some task-local breakpoint registers. The Li bits are cleared by the Intel486 processor during a task switch, to avoid spurious exceptions in the new task. Note that the breakpoints must be re-enabled under software control.

All Intel486 processor Gi bits are unaffected during a task switch. The Gi bits support breakpoints that are active in all tasks executing in the system.

12.3.3 DEBUG STATUS REGISTER (DR6)

A Debug Status Register, DR6 shown in Figure 12-1, allows the exception 1 handler to easily determine why it was invoked. Note the exception 1 handler can be invoked as a result of one of several events:
1. DR0 Breakpoint fault/trap.
2. DR1 Breakpoint fault/trap.
3. DR2 Breakpoint fault/trap.
4. DR3 Breakpoint fault/trap.
5. Single-step (TF) trap.
6. Task switch trap.
7. Fault due to attempted debug register access when GD = 1.
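Looking back at the DR7 fields of section 12.3.2, the LENi alignment rule and the RWi restrictions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the bit positions (Li at bit 2i, Gi at bit 2i+1, RWi at bit 16+4i, LENi at bit 18+4i) are the conventional x86 DR7 layout rather than something recoverable from Figure 12-1 as reproduced here.

```python
# LENi and RWi encodings copied from Table 12-1 and Table 12-2.
LEN_ENCODING = {1: 0b00, 2: 0b01, 4: 0b11}
RW_EXECUTE, RW_WRITE, RW_READ_WRITE = 0b00, 0b01, 0b11


def breakpoint_field(address, length):
    """Return the aligned byte range that a breakpoint of `length` bytes covers.

    The low-order address bits are ignored for 2-byte and 4-byte fields,
    which is why every breakpoint field ends up aligned.
    """
    if length not in LEN_ENCODING:
        raise ValueError("length must be 1, 2, or 4 bytes")
    start = (address & 0xFFFFFFFF) & ~(length - 1)
    return range(start, start + length)


def dr7_bits(index, rw, length, local=True, global_=False):
    """Build the DR7 bits for breakpoint `index` (0-3).

    Bit positions are an assumption (conventional x86 layout):
    Li = bit 2i, Gi = bit 2i+1, RWi = bits 16+4i, LENi = bits 18+4i.
    """
    if index not in (0, 1, 2, 3):
        raise ValueError("only breakpoints 0-3 exist")
    if rw not in (RW_EXECUTE, RW_WRITE, RW_READ_WRITE):
        raise ValueError("RW encoding 10 is undefined")
    if length not in LEN_ENCODING:
        raise ValueError("length must be 1, 2, or 4 bytes")
    if rw == RW_EXECUTE and length != 1:
        raise ValueError("instruction execution breakpoints require LEN = 00")
    value = (int(local) << (2 * index)) | (int(global_) << (2 * index + 1))
    value |= rw << (16 + 4 * index)
    value |= LEN_ENCODING[length] << (18 + 4 * index)
    return value


# The Figure 12-2 scenario: DR2 = 00000005H with a 4-byte write breakpoint.
print([hex(a) for a in breakpoint_field(0x00000005, 4)])
print(hex(dr7_bits(2, RW_WRITE, 4)))
```

Keeping the alignment computation separate from the register packing mirrors the text: the address written to DRi is unchanged, and only the comparison ignores the low-order bits.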
The Debug Status Register contains single-bit flags for each of the possible events invoking exception 1. Note below that some of these events are faults (exception taken before the instruction is executed), while other events are traps (exception taken after the debug events occurred).

The flags in DR6 are set by the hardware but never cleared by hardware. Exception 1 handler software should clear DR6 before returning to the user program to avoid future confusion in identifying the source of exception 1.

The fields within the Debug Status Register, DR6, are as follows:

Bi (debug fault/trap due to breakpoint 0-3)

Four breakpoint indicator flags, B0-B3, correspond one-to-one with the breakpoint registers in DR0-DR3. A flag Bi is set when the condition described by DRi, LENi, and RWi occurs.

If Gi or Li is set, and if the ith breakpoint is detected, the Intel486 processor will invoke the exception 1 handler. The exception is handled as a fault if an instruction execution breakpoint occurred, or as a trap if a data breakpoint occurred.

IMPORTANT NOTE: A flag Bi is set whenever the hardware detects a match condition on enabled breakpoint i. Whenever a match is detected on at least one enabled breakpoint i, the hardware immediately sets all Bi bits corresponding to breakpoint conditions matching at that instant, whether enabled or not. Therefore, the exception 1 handler may see that multiple Bi bits are set, but only set Bi bits corresponding to enabled breakpoints (Li or Gi set) are true indications of why the exception 1 handler was invoked.

BD (debug fault due to attempted register access when GD bit set)

This bit is set if the exception 1 handler was invoked due to an instruction attempting to read or write to the debug registers when GD bit was set. If such an event occurs, then the GD bit is automatically cleared when the exception 1 handler is invoked, allowing handler access to the debug registers.
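The IMPORTANT NOTE above implies that a handler should trust a Bi flag only when the matching Li or Gi bit is set. A minimal Python sketch of that filtering follows. The function names are hypothetical, and the bit positions (Bi at DR6 bit i, BD at DR6 bit 13, Li and Gi at DR7 bits 2i and 2i+1) are the conventional x86 layout, assumed here because Figure 12-1 is not legible in this copy.

```python
BD_FLAG = 1 << 13  # assumed DR6 position of the BD flag


def true_breakpoint_hits(dr6, dr7):
    """Return the breakpoint numbers whose Bi flag is set and which are enabled.

    Bi flags for disabled breakpoints can also be set by the hardware, so each
    flag is checked against the Li/Gi enable pair before it is reported.
    """
    hits = []
    for i in range(4):
        flagged = (dr6 >> i) & 1            # assumed: Bi is DR6 bit i
        enabled = (dr7 >> (2 * i)) & 0b11   # assumed: Li, Gi are DR7 bits 2i, 2i+1
        if flagged and enabled:
            hits.append(i)
    return hits


def register_access_fault(dr6):
    """True when the handler was entered because of a guarded debug register access."""
    return bool(dr6 & BD_FLAG)


# Hypothetical snapshot: B0 and B2 are both set, but only breakpoint 2 is enabled (L2).
print(true_breakpoint_hits(0b0101, 1 << 4))
```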
BT (debug trap due to task switch)

This bit is set if the exception 1 handler was invoked due to a task switch occurring to a task having an Intel486 processor TSS with the T bit set. Note the task switch into the new task occurs normally, but before the first instruction of the task is executed, the exception 1 handler is invoked. With respect to the task switch operation, the operation is considered to be a trap.

BS (debug trap due to single-step)

This bit is set if the exception 1 handler was invoked due to the TF bit in the flag register being set (for single-stepping).

12.3.4 USE OF RESUME FLAG (RF) IN FLAG REGISTER

The Resume Flag (RF) in the flag word can suppress an instruction execution breakpoint when the exception 1 handler returns to a user program at a user address which is also an instruction execution breakpoint.

13.0 INSTRUCTION SET SUMMARY

This section describes the Intel486 processor instruction set. Detailed information on the CPUID instruction can be found in Appendix B: Feature Determination. Further details of the instruction encoding are then provided in section 13.1, which describes the entire encoding structure and the definition of all fields occurring within the Intel486 processor instructions.

13.1 Instruction Encoding

13.1.1 OVERVIEW

All instruction encodings are subsets of the general instruction format shown in Figure 13-1. Instructions consist of one or two primary opcode bytes, possibly an address specifier consisting of the "mod r/m" byte and "scaled index" byte, a displacement if required, and an immediate data field if required.

Within the primary opcode or opcodes, smaller encoding fields may be defined. These fields vary according to the class of operation. The fields define such information as direction of the operation, size of the displacements, register encoding, or sign extension.
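The address specifier bytes named above each pack three fields into eight bits. As a rough sketch, the split can be written as below. The helper names are hypothetical, and the bit positions (a 2-bit field in bits 7-6 followed by two 3-bit fields in bits 5-3 and 2-0) follow the conventional x86 layout, which is assumed here rather than taken from this excerpt.

```python
def split_modrm(byte):
    """Split a "mod r/m" byte into (mod, middle three bits, r/m)."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


def split_sib(byte):
    """Split an "s-i-b" byte into (ss, index, base)."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


# Hypothetical bytes chosen for illustration: 44H = 01 000 100B, 98H = 10 011 000B.
print(split_modrm(0x44))
print(split_sib(0x98))
```

Both bytes share the same 2-3-3 shape, so one shifting pattern serves for either; only the meaning of the fields differs.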
Almost all instructions referring to an operand in memory have an addressing mode byte following the primary opcode byte(s). This byte, the mod r/m byte, specifies the address mode to be used. Certain encodings of the mod r/m byte indicate a second addressing byte, the scale-index-base byte, follows the mod r/m byte to fully specify the addressing mode.

Addressing modes can include a displacement immediately following the mod r/m byte, or scaled index byte. If a displacement is present, the possible sizes are 8, 16 or 32 bits.

If the instruction specifies an immediate operand, the immediate operand follows any displacement bytes. The immediate operand, if specified, is always the last field of the instruction.

Figure 13-1 illustrates several of the fields that can appear in an instruction, such as the mod field and the r/m field, but the figure does not show all fields. Several smaller fields also appear in certain instructions, sometimes within the opcode bytes themselves. Table 13-1 is a complete list of all fields appearing in the Intel486 processor instruction set. Following Table 13-1 are detailed tables for each field.

[Figure 13-1. General Instruction Format: opcode (one or two bytes; T represents an opcode bit); "mod r/m" byte (mod, TTT, r/m) and "s-i-b" byte (ss, index, base), which together form the register and address mode specifier; address displacement (4, 2, 1 bytes or none); immediate data (4, 2, 1 bytes or none).]

Table 13-1. Fields within Intel486 Processor Instructions

Field Name   Description                                                                           Number of Bits
w            Specifies if Data is Byte or Full Size (Full Size is either 16 or 32 Bits)           1
d            Specifies Direction of Data Operation                                                 1
s            Specifies if an Immediate Data Field Must be Sign-Extended                            1
reg          General Register Specifier                                                            3
mod r/m      Address Mode Specifier (Effective Address can be a General Register)                  2 for mod; 3 for r/m
ss           Scale Factor for Scaled Index Address Mode                                            2
index        General Register to be used as Index Register                                         3
base         General Register to be used as Base Register                                          3
sreg2        Segment Register Specifier for CS, SS, DS, ES                                         2
sreg3        Segment Register Specifier for CS, SS, DS, ES, FS, GS                                 3
tttn         For Conditional Instructions, Specifies a Condition Asserted or a Condition Negated   4

NOTE: Table 13-15 through Table 13-19 show encoding of individual instructions.

13.1.2 32-BIT EXTENSIONS OF THE INSTRUCTION SET

With the Intel486 processor, the 8086/80186/80286 instruction set is extended in two orthogonal directions: 32-bit forms of all 16-bit instructions are added to support the 32-bit data types, and 32-bit addressing modes are made available for all instructions referencing memory. This orthogonal instruction set extension is accomplished by having a Default (D) bit in the code segment descriptor, and by having 2 prefixes to the instruction set.

Whether the instruction defaults to operations of 16 bits or 32 bits depends on the setting of the D bit in the code segment descriptor, which gives the default length (either 32 bits or 16 bits) for both operands and effective addresses when executing that code segment. In the Real Address Mode or Virtual 8086 Mode, no code segment descriptors are used, but a D value of 0 is assumed internally by the Intel486 processor when operating in those modes (for 16-bit default sizes compatible with the 8086/80186/80286).
Two prefixes, the Operand Size Prefix and the Effective Address Size Prefix, allow overriding individually the Default selection of operand size and I effective address size. These prefixes may precede any opcode bytes and affect only the instruction they precede. If necessary, one or both of the prefixes may be placed before the opcode bytes. The presence of the Operand Size Prefix and the Effective Address Prefix will toggle the operand size or the effective address size, respectively, to the value "opposite" from the Default setting. For example, if the default operand size is for 32-bit data operations, then presence of the Operand Size Prefix toggles the instruction to 16-bit data operation. As another example, if the default effective address size is 16 bits, presence of the Effective Address Size prefix toggles the instruction to use 32-bit effective address computations. These 32-bit extensions are available in all Intel486 processor modes, including the Real Address Mode or the Virtual 8086 Mode. In these modes the default is always 16 bits, so prefixes are needed to specify 32-bit operands or addresses. For instructions with more than one prefix, the order of prefixes is unimportant. Unless specified otherwise, instructions with 8-bit and 16-bit operands do not affect the contents of the high-order bits of the extended registers. 2-253 Intel486TM PROCESSOR FAMILV 13.1.3 ENCODING OF INTEGER INSTRUCTION FIELDS Table 13-4. Encoding of reg Field when the w Field Is Present In Instruction Within the instruction are several fields indicating register selection, addressing mode and so on. The exact encodings of these fields are defined immedi· ately ahead. Register Specified by reg Field during 16-Bit Data Operations: 13.1.3.1 Encoding of Operand Length (w) Field For any given instruction performing a data operation, the instruction is executing as a 32·bit operation or a 16·bit operation. 
Within the constraints of the operation size, the w field encodes the operand size as either one byte or the full operation size, as shown in the table below. Table 13-2. Encoding of Operand Length (w) Field wField Operand Size during 16-Bit Data Operations Operand Size during 32-Bit Data Operations 0 8 Bits 8 Bits 1 16 Bits 32 Bits 13.1.3.2 Encoding of the General Register (reg) Field The general register is specified by the reg field, which may appear in the primary opcode bytes, or as the reg field of the "mod rim" byte, or as the rim field of the "mod rim" byte. Table 13-3. Encoding of reg Field when the w Field Is Not Present in Instruction reg Field 000 001 010 011 100 101 110 111 2·254 Register Selected during 16-Bit Data Operations Register Selected during 32-Bit Data Operations AX CX OX BX SP BP SI EAX ECX EDX EBX ESP EBP ESI EDI 01 reg 000 001 010 011 100 101 110 111 Function of w Field (whenw = 0) (when w = 1) AL CL DL BL AH CH DH BH AX CX OX BX SP BP SI 01 Register Specified by reg Field during 32-Blt Data Operations reg 000 001 010 011 100 101 110 111 Function of w Field (when w = 0) (whenw = 1) AL CL DL BL AH CH DH BH EAX ECX EDX EBX ESP EBP ESI EDI 13.1.3.3 Encoding of the Segment Register (sreg) Field The sreg field in certain instructions is a 2-bit field allowing one of the four 80286 segment registers to be specified. The sreg field in other instructions is a 3-bit field, allowing the Intel486 processor FS and GS segment registers to be specified. I Intel486TM PROCESSOR FAMILV Table 13·5. 2·Bit sreg2 Field 2·bit sreg2 Field Segment Register Selected 00 01 10 11 ES CS SS DS Table 13·6. 
3·Bit sreg3 Field 3·bit sreg3 Field Segment Register Selected 000 001 010 011 100 101 110 111 ES CS SS DS FS GS do not use do not use 13.1.3.4 Encoding of Address Mode Except for special instructions, such· as PUSH or POP, where the addressing mode is pre-determined, the addressing mode for the current instruction is specified by addressing bytes following the primary opcode. The primary addressing byte is the "mod rl m" byte, and a second byte of addressing information, the "s-i-b" (scale·index·base) byte, can be specified. The s·i-b byte (scale·index·base byte) is specified when using 32·bit addressing mode and the "mod rl m" byte has rim = 100 and mod = 00,01 or 10. When the sib byte is present, the 32-bit addressing mode is a function of the mod, SS, index, and base fields. The primary addressing byte, the "mod rim" byte, also contains three bits (shown as TTT in Figure 131) sometimes used as an extension of the primary opcode. The three bits, however, may also be used as a register field (reg). When calculating an effective address, either 16-bit addressing or 32·bit addressing is used. 16·bit ad· dressing uses 16·bit address components to calcu· late the effective address while 32-bit addressing uses 32·bit address components to calculate the ef· fective address. When 16-bit addressing is used, the "mod rim" byte is interpreted as a 16·bit addressing mode specifier. When 32·bit addressing is used, the "mod rim" byte is interpreted as a 32·bit addressing mode specifier. Tables 13-7,13-8, and 13·9 define all encodings of all 16-bit addressing modes and 32·bit addressing modes. 2-255 Intel486™ PROCESSOR FAMILY Table 13-7. 
Encoding of 16-Blt Address Mode with "mod rIm" Byte mod rIm Effective Address mod rIm Effective Address 00000 OS:[BX+SIl 10000 OS:[BX+SI+d16] 00001 OS:[BX+OIl 10001 OS:[BX + 01+ d16] 00010 SS:[BP+SIl 10010 SS:[BP+SI+d16] 00011 SS:[BP+OIl 10011 SS:[BP+ 01 + d16] 00100 OS:[SIl 10100 OS:[SI + d16] 00101 OS: [011 10101 OS: [01 + d16] 00110 OS:d16 10110 SS:[BP+d16] OS:[BX+d16] 00111 OS:[BX] 10111 01000 OS: [BX + SI + dS] 11 000 register-see below 01001 OS: [BX + 01 + dS] 11001 register-see below 01 010 SS: [BP + SI + dB] 11010 register-see below 01 011 SS:[BP+OI+dB] 11 011 register-see below 01100 OS:[SI+dB] 11 100 register-see below 01101 OS:[OI+dB] 11 101 register-see below 01110 SS:[BP+dB] 11 110 register-see below 01 111 OS:[BX+dS] 11 111 register-see below Register Specified by rIm during 16-Blt Data Operations , Register Specified by rIm during 32-Bit Data Operations Function of w Field mod rIm (whenw.=O) (whenw= 1) 11000 AL AX 11 001 CL 11010 OL Function of w Field mod rIm (when w=O) (when w=1) 11000 AL EAX CX 11001 CL ECX OX 11010 OL EOX 11 011 BL BX 11 011 BL EBX 11100 AH SP 11 100 AH ESP 11 101 CH BP 11 101 CH EBP 11 110 OH SI 11 110 OH ESI 11 111 BH 01 11 111 BH EOI 2-256 I Intel486TM PROCESSOR FAMILY Table 13-8. 
Encoding of 32-Bit Address Mode with "mod r/m" Byte (No "s-i-b" Byte Present)

mod r/m   Effective Address      mod r/m   Effective Address
00 000    DS:[EAX]               10 000    DS:[EAX+d32]
00 001    DS:[ECX]               10 001    DS:[ECX+d32]
00 010    DS:[EDX]               10 010    DS:[EDX+d32]
00 011    DS:[EBX]               10 011    DS:[EBX+d32]
00 100    s-i-b is present       10 100    s-i-b is present
00 101    DS:d32                 10 101    SS:[EBP+d32]
00 110    DS:[ESI]               10 110    DS:[ESI+d32]
00 111    DS:[EDI]               10 111    DS:[EDI+d32]
01 000    DS:[EAX+d8]            11 000    register-see below
01 001    DS:[ECX+d8]            11 001    register-see below
01 010    DS:[EDX+d8]            11 010    register-see below
01 011    DS:[EBX+d8]            11 011    register-see below
01 100    s-i-b is present       11 100    register-see below
01 101    SS:[EBP+d8]            11 101    register-see below
01 110    DS:[ESI+d8]            11 110    register-see below
01 111    DS:[EDI+d8]            11 111    register-see below

Register Specified by reg or r/m during 16-Bit Data Operations:

            Function of w Field
mod r/m   (when w = 0)   (when w = 1)
11 000    AL             AX
11 001    CL             CX
11 010    DL             DX
11 011    BL             BX
11 100    AH             SP
11 101    CH             BP
11 110    DH             SI
11 111    BH             DI

Register Specified by reg or r/m during 32-Bit Data Operations:

            Function of w Field
mod r/m   (when w = 0)   (when w = 1)
11 000    AL             EAX
11 001    CL             ECX
11 010    DL             EDX
11 011    BL             EBX
11 100    AH             ESP
11 101    CH             EBP
11 110    DH             ESI
11 111    BH             EDI

Table 13-9. Encoding of 32-Bit Address Mode ("mod r/m" Byte and "s-i-b" Byte Present)

mod base   Effective Address
00 000     DS:[EAX + (scaled index)]
00 001     DS:[ECX + (scaled index)]
00 010     DS:[EDX + (scaled index)]
00 011     DS:[EBX + (scaled index)]
00 100     SS:[ESP + (scaled index)]
00 101     DS:[d32 + (scaled index)]
00 110     DS:[ESI + (scaled index)]
00 111     DS:[EDI + (scaled index)]
01 000     DS:[EAX + (scaled index) + d8]
01 001     DS:[ECX + (scaled index) + d8]
01 010     DS:[EDX + (scaled index) + d8]
01 011     DS:[EBX + (scaled index) + d8]
01 100     SS:[ESP + (scaled index) + d8]
01 101     SS:[EBP + (scaled index) + d8]
01 110     DS:[ESI + (scaled index) + d8]
01 111     DS:[EDI + (scaled index) + d8]
10 000     DS:[EAX + (scaled index) + d32]
10 001     DS:[ECX + (scaled index) + d32]
10 010     DS:[EDX + (scaled index) + d32]
10 011     DS:[EBX + (scaled index) + d32]
10 100     SS:[ESP + (scaled index) + d32]
10 101     SS:[EBP + (scaled index) + d32]
10 110     DS:[ESI + (scaled index) + d32]
10 111     DS:[EDI + (scaled index) + d32]

ss   Scale Factor
00   x1
01   x2
10   x4
11   x8

index   Index Register
000     EAX
001     ECX
010     EDX
011     EBX
100     no index reg**
101     EBP
110     ESI
111     EDI

NOTE: Mod field in "mod r/m" byte; ss, index, base fields in "s-i-b" byte.

**IMPORTANT NOTE: When index field is 100, indicating "no index register," then ss field MUST equal 00. If index is 100 and ss does not equal 00, the effective address is undefined.

13.1.3.5 Encoding of Operation Direction (d) Field

In many two-operand instructions the d field is present to indicate which operand is considered the source and which is the destination.

Table 13-10. Encoding of Operation Direction (d) Field

d   Direction of Operation
0   Register/Memory <- Register
    "reg" Field Indicates Source Operand; "mod r/m" or "mod ss index base" Indicates Destination Operand
1   Register <- Register/Memory
    "reg" Field Indicates Destination Operand; "mod r/m" or "mod ss index base" Indicates Source Operand

13.1.3.6 Encoding of Sign-Extend (s) Field

The s field occurs primarily in instructions with immediate data fields. The s field has an effect only if the size of the immediate data is 8 bits and is being placed in a 16-bit or 32-bit destination.

Table 13-11. Encoding of Sign-Extend (s) Field

s   Effect on Immediate Data 8                                  Effect on Immediate Data 16/32
0   None                                                        None
1   Sign-Extend Data 8 to Fill 16-bit or 32-bit Destination     None

13.1.3.7 Encoding of Conditional Test (tttn) Field

For the conditional instructions (conditional jumps and set on condition), tttn is encoded with n indicating to use the condition (n = 0) or its negation (n = 1), and ttt giving the condition to test.

Table 13-12. Encoding of Conditional Test (tttn) Field

Mnemonic   Condition                               tttn
O          Overflow                                0000
NO         No Overflow                             0001
B/NAE      Below/Not Above or Equal                0010
NB/AE      Not Below/Above or Equal                0011
E/Z        Equal/Zero                              0100
NE/NZ      Not Equal/Not Zero                      0101
BE/NA      Below or Equal/Not Above                0110
NBE/A      Not Below or Equal/Above                0111
S          Sign                                    1000
NS         Not Sign                                1001
P/PE       Parity/Parity Even                      1010
NP/PO      Not Parity/Parity Odd                   1011
L/NGE      Less Than/Not Greater or Equal          1100
NL/GE      Not Less Than/Greater or Equal          1101
LE/NG      Less Than or Equal/Not Greater Than     1110
NLE/G      Not Less Than or Equal/Greater Than     1111

13.1.3.8 Encoding of Control or Debug or Test Register (eee) Field

For the loading and storing of the Control, Debug and Test registers.

Table 13-13. Encoding of Control or Debug or Test Register (eee) Field

eee Code   Reg Name

When Interpreted as Control Register Field:
000        CR0
010        CR2
011        CR3

When Interpreted as Debug Register Field:
000        DR0
001        DR1
010        DR2
011        DR3
110        DR6
111        DR7

When Interpreted as Test Register Field:
011        TR3
100        TR4
101        TR5
110        TR6
111        TR7

Do not use any other encoding.

13.1.4 ENCODING OF FLOATING POINT INSTRUCTION FIELDS

Instructions for the FPU assume one of the five forms shown in the following table. In all cases, instructions are at least two bytes long and begin with the bit pattern 11011B.

Table 13-14. Encoding of Floating-Point Instruction Fields

       Instruction                                   Optional
       First Byte              Second Byte           Fields
1   11011   OPA      1     mod   1   OPB    r/m      s-i-b   disp
2   11011   MF       OPA   mod   OPB        r/m      s-i-b   disp
3   11011   d    P   OPA   1 1   OPB        ST(i)
4   11011   0    0   1     1 1   1   OP
5   11011   0    1   1     1 1   1   OP
    15-11   10   9   8     7 6   5   4 3    2 1 0

OP = Instruction opcode, possibly split into two fields OPA and OPB

MF = Memory Format
     00 - 32-bit real
     01 - 32-bit integer
     10 - 64-bit real
     11 - 16-bit integer

P = Pop
     0 - Do not pop stack
     1 - Pop stack after operation

d = Destination
     0 - Destination is ST(0)
     1 - Destination is ST(i)

R XOR d = 0 - Destination (op) Source
R XOR d = 1 - Source (op) Destination

ST(i) = Register stack element i
     000 = Stack top
     001 = Second stack element
     ...
     111 = Eighth stack element

mod (Mode field) and r/m (Register/Memory specifier) have the same interpretation as the corresponding fields of the integer instructions.

s-i-b (Scale Index Base) byte and disp (displacement) are optionally present in instructions that have mod and r/m fields. Their presence depends on the values of mod and r/m, as for integer instructions.

13.2 Clock Count Summary

To calculate elapsed time for an instruction, multiply the instruction clock count, as listed in Table 13-15 through Table 13-19, by the processor core clock period (e.g., 10 ns for a 100-MHz IntelDX4 processor).

13.2.1 INSTRUCTION CLOCK COUNT ASSUMPTIONS

The Intel486 processor instruction core clock count tables give clock counts assuming data and instruction accesses hit in the cache. The combined instruction and data cache hit rate is over 90%.

A cache miss will force the Intel486 processor to run an external bus cycle. The Intel486 processor 32-bit burst bus is defined as r-b-w, where:

r = The number of bus clocks in the first cycle of a burst read or the number of clocks per data cycle in a non-burst read.
b = The number of bus clocks for the second and subsequent cycles in a burst read.
w = The number of bus clocks for a write.

The clock counts in the cache miss penalty column assume a 2-1-2 bus. For slower buses add r-2 clocks to the cache miss penalty for the first dword accessed. Other factors also affect instruction clock counts.
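The arithmetic described in 13.2 and 13.2.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (core_period_ns, elapsed_ns, adjusted_miss_penalty) are hypothetical and do not come from the data book, and the sketch applies the "add r-2 clocks" rule exactly as worded above without converting between bus clocks and core clocks.

```python
def core_period_ns(core_mhz):
    """Core clock period in nanoseconds for a core frequency given in MHz."""
    return 1000.0 / core_mhz


def elapsed_ns(clock_count, core_mhz):
    """Elapsed time = instruction clock count x core clock period."""
    return clock_count * core_period_ns(core_mhz)


def adjusted_miss_penalty(table_penalty, r):
    """Cache miss penalty for the first dword accessed on an r-b-w bus.

    The tables assume a 2-1-2 bus; for a slower bus (r > 2) the text says
    to add r - 2 clocks. Buses with r < 2 are not described in the text,
    so this hypothetical helper rejects them instead of guessing.
    """
    if r < 2:
        raise ValueError("the clock count tables assume r >= 2")
    return table_penalty + (r - 2)


if __name__ == "__main__":
    # A 3-clock instruction on a 100-MHz core (10 ns core clock period).
    print(elapsed_ns(3, 100))
    # A tabulated miss penalty of 2 clocks, re-evaluated for a 3-1-2 bus.
    print(adjusted_miss_penalty(2, 3))
```

The other assumptions listed in this section (misalignment, index registers, TLB misses and so on) would each add further clocks and are deliberately left out of this sketch.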
Instruction Clock Count Assumptions

1. The external bus is available for reads or writes at all times. Otherwise, add bus clocks to reads until the bus is available.
2. Accesses are aligned. Add three core clocks to each misaligned access.
3. Cache fills complete before subsequent accesses to the same line. If a read misses the cache during a cache fill due to a previous read or prefetch, the read must wait for the cache fill to complete. If a read or write accesses a cache line still being filled, it must wait for the fill to complete.
4. If an effective address is calculated, the base register is not the destination register of the preceding instruction. If the base register is the destination register of the preceding instruction, add 1 to the core clock counts shown. Back-to-back PUSH and POP instructions are not affected by this rule.
5. An effective address calculation uses one base register and does not use an index register. However, if the effective address calculation uses an index register, 1 core clock may be added to the clock count shown.
6. The target of a jump is in the cache. If not, add r clocks for accessing the destination instruction of a jump. If the destination instruction is not completely contained in the first dword read, add a maximum of 3b bus clocks. If the destination instruction is not completely contained in the first 16-byte burst, add a maximum of another r + 3b bus clocks.
7. No write buffer delay. w bus clocks are added only in the case in which all write buffers are full.
8. Displacement and immediate are not used together. If displacement and immediate are used together, 1 core clock may be added to the core clock count shown.
9. No invalidate cycles. Add a delay of 1 bus clock for each invalidate cycle if the invalidate cycle contends for the internal cache/external bus when the Intel486 processor needs to use it.
10. Page translation hits in TLB.
A TLB miss will add 13, 21 or 28 bus clocks + 1 possible core clock to the instruction depending on whether the Accessed and/ or Dirty bit in neither, one or both of the page entries needs to be set in memory. This assumes that neither page entry is in the data cache and a page fault does not occur on the address translation. 11. No exceptions are detected during instruction execution. Refer to Interrupt core Clock Counts Table for extra clocks if an interrupt is detected. 12. Instructions that read multiple consecutive data items (Le. task switch, POPA, etc.) and miss the cache are assumed to start the first access on a 16-byte boundary. If not, an extra cache line fill may be necessary which may add up to (r + 3b) bus clocks to the cache miss penalty. 2-261 Intel486TM PROCESSOR FAMILY Table 13-15. Clock Count Summary Instruction Format Cache Hit Penalty if Cache Miss Notes INTEGER OPERATIONS = Move: MOV reg1 to reg2 1000 100w: 11 reg1 reg2 reg2 to reg1 1000 101w: 11 reg1 reg2 1 memory to reg 1000 100w: mod reg rim 1 immediate to reg 1100 011w: 11000 reg: immediate data 1 or 1 1011 W reg: immediate data 1 Immediate to Memory 110001 w : mod 000 rim: displacement immediate 1 Memory to Accumulator 1010 OOOw: full displacement 1 Accumulator to Memory 1010 001w: full displacement 1 MOVSX/MOVZX 2 2 = Move with Sign/Zero Extension reg2 to reg1 00001111: 1011 z11w: 11 reg1 reg2 3 memory to reg 00001111: 1011 z11w:modregr/m 3 reg 11111111: 11110reg 4 or 01010 reg 1 memory 11111111: mod 110r/m 4 immediate 0110 10s0 : immediate data 1 01100000 11 reg 10001111: 11 000 reg 4 1 or 01011 reg 1- 2 memory 10001111 : mod 000 rim 5 2 1 01100001 9 7/15 16/32 2 z instruction 0 MOVZX 1 MOVSX PUSH = Push PUSHA POP = Push All 1 1 = Pop POPA = Pop All XCHG = Exchange reg1 with reg2 1000 011w : 11 reg1 reg2 3 2 Accumulator with reg 10010 reg 3 2 Memory with reg 1000 011w: mod reg rim 5 2 2-262 I Intel486TM PROCESSOR FAMILY Table 13·15. Clock Count Summary (Continued) . 
Penalty Instruction Format Cache Hit If Cache Notes Miss INTEGER OPERATIONS (Continued) NOP = No Operation 10010000 LEA = Load EA to Register 10001101: mod reg rIm 1. no index register 1 with index register 2 Instruction ADD = Add ADC = Add with Carry AND = Logical AND OR ;.. Logical OR SUB = Subtract SBB = Subtract with Borrow XOR = Logical Exclusive OR TIT 000 010 100 001 101 011 110 reg1 to reg2 OOTT TOOw : 11 reg1 reg2 reg2 to reg1 00TTT01w: 11 reg1 reg2 1 memory to register 00TTT01w: mod reg rIm 2 2 register to memory OOTT TOOw : mod reg·r/m 3 6/2 U/L immediate to register 1000 OOsw : 11 TIT reg: immediate register 1 immediate to Accumulator OOTT T1 Ow : immediate data 1 immediate to memory 1000 OOsw: mod TIT rIm: immediate data 3 6/2 U/L 6/2 U/L 6/2 U/L Instruction INC - Increment DEC = Decrement 1 TIT 000 001 '1 reg 1111111w: 11 TIT reg or 01 TIT reg 1 memory 1111111w: mod TIT rIm 3 Instruction NOT - Logical Complement NEG = Negate TIT 010 011 reg 1111 011w: 11 TIT reg 1 memory f111011w:modTITr/m 3 I 2-263 Intel486TM PROCESSOR FAMILY Table 13·15. Clock Count Summary (Continued) Instruction Format Cache Hit Penalty If Cache Miss Notes INTEGER OPERATIONS (Continued) = Compare CMP reg 1 with reg2 0011 100w: 11 reg1reg2 reg2 with reg1 0011101w: 11 reg1 reg2 1 memory with register 0011100w: mod reg rIm 2 2 register with memory 0011101w: mod reg r/m 2 2 immediate with register 1000 OOsw : 11 111 reg: immediate data 1 immediate with acc. 0011 110w: immediate data 1 immediate with memory 1000 OOsw : mod 111 r /m : immediate data 2 TEST 1 = Logical Compare reg 1 and reg2 1000 010w: 11 reg1 reg2 1 memory and register 1000 010w: mod reg rIm 2 immediate and register 1111 011 w : 11 000 reg: immediate data 1 immediate and acc. 1010100w: immediate data 1 immediate and memory 1111 011w: mod 000 rIm : immediate data 2 MUL 2 2 2 = Multiply (unsigned) acc. with register 1111 011 w :11 100 reg Multiplier-Byte Word Dword acc. 
with memory 13/18 13/26 13/42 1111 011 w : mod 100 rIm Multiplier-Byte Word Dword IMUL 13/18 13/26 13/42 1 1 1 MN/MX,3 MN/MX.3 MN/MX,3 = Integer Multiply (unsigned) acc. with register 1111 011w: 11101 reg Multiplier-Byte Word Dword acc. with memory reg1 with reg2 13/18 13/26 13/42 MN/MX,3 MN/MX,3 MN/MX,3 13/18 13/26 13/42 MN/MX,3 MN/MX,3 MN/MX,3· 13/18 13/26 13/42 MN/MX,3 MN/MX,3 MN/MX,3 1111 011w: mod 101 rim Multiplier-Byte Word Dword 00001111: 10101111: 11 reg1 reg2 Multiplier-Byte Word Dword 2-264 MN/MX,3 MN/MX,3 MN/MX,3 I Intel486TM PROCESSOR FAMILV Table 13-15. Clock Count Summary (Continued) Instruction Format Cache Hit Penalty If Cache Miss Notes 13/18 13/26 13/42 1 1 1 MN/MX,3 MN/MX,3 MN/MX,3 INTEGER OPERATIONS (Continued) IMUL = Integer Multiply (unsigned), (Continued) register with memory 00001111 :10101111 : mod reg rim Multiplier-Byte Word Dword reg1 with imm. to reg2 0110 10s1 : 11 reg1 reg2: immediate data Multiplier-Byte Word Dword memo with imm. to reg. 13/18 13/26 13/42 MN/MX,3 MN/MX,3 MN/MX,3 13/18 13/26 13/42 MN/MX,3 MN/MX,3 MN/MX,3 5/5 516 6/12 MN/MX,3 MN/MX,3 MN/MX,3 5/5 516 6/12 MN/MX,3 MN/MX,3 MN/MX,3 5/5 516 6/12 MN/MX,3 MN/MX,3 MN/MX,3 5/5 516 6/12 MN/MX,3 MN/MX,3 MN/MX,3 5/5 516 6/12 MN/MX,3 MN/MX,3 MN/MX,3 5/5 516 6/12 MN/MX,3 MN/MX,3 MN/MX,3 0110 10s1 : mod reg rim: immediate data Multiplier-Byte Word Dword For the IntelDX4TM Processor Only: IMUL = Integer Multiply (signed) ace. with register 1111 011w: 11101 reg Multiplier-Byte Word Dword ace. with memory 1111 011w: mod 101 rim Multiplier-Byte Word Dword reg 1 with reg2 00001111: 10101111: 11 reg1 reg2 Multiplier-Byte Word Dword register with memory 00001111 : 10101111 : mod reg rim Multiplier-Byte Word Dword reg1 with imm. to reg2 0110 10s1 : 11 reg1 reg2: immediate data Multiplier-Byte Word Dword memo with imm. to reg. Multiplier-Byte Word Dword I 0110 10s1 : mod reg rim: immediate data 2-265 Intel486™ PROCESSOR FAMILY Table 13-15. 
Clock Count Summary (Continued) Penalty Instruction Format Cache Hit if Cache Notes Miss INTEGER OPERATIONS (Continued) DIV = Divide (unsigned) acc. by register 1111 011w: 11110reg Divisor-Byte Word Dword acc. by memory 16 24 40 1111 011 w : mod 110 rIm Divisor-Byte Word Dword IDIV = 16 24 40 Integer Divide (signed) acc. by register 1111 011w: 11111 reg Divisor-Byte Word Dword acc. by memory 19 27 43 1111 011w:mod111 rIm Divisor-Byte Word Dword 20 28 44 CBW = Convert Byte to Word 1001 1000 3 CWD = Convert Word to Dword 1001 1001 3 Instruction ROL - Rotate Left ROR = Rotate Right RCL = Rotate Through Carry Left RDR = Rotate Through Carry Right SHL/SAL = Shift Logical! Arithmetic Left SHR = Shift Logical Right SAR = Shift Arithmetic Right TTI 000 001 010 011 100 101 111 Not Through Carry (ROL, ROR, SAR, SHL, and SHR) reg by 1 1101 OOOw: 11 TTT reg memory by 1 1101 OOOw: mod TIT rIm 4 reg byCL 1101 001w: 11 TTT reg 3 3 memorybyCL 1101001w:modTTTr/m 4 reg by immediate count 1100 OOOw: 11 TTT reg: imm. 8-bit data 2 mem by immediate count 1100 OOOw: mod TIT rIm: imm. 8-bit data 4 2-266 6 6 6 I Intel486TM PROCESSOR FAMILY Table 13-15. Clock Count Summary (Continued) Instruction Format Cache Hit Penalty If Cache Miss Notes INTEGER OPERATIONS (Continued) Through Carry (RCL and RCR) reg by 1 1101 OOOw: 11 TTTreg memory by 1 1101 OOOw: mod TTT rIm reg byCL 1101 001w: 11 TTT reg 4 8/30 memorybyCL 1101 001w: mod TTT rIm 9/31 MN/MX,5 reg by immediate count 1.100 OOOw : 11 TTT reg: imm. 8·bit data 8/30 MN/MX,4 mem by immediate count 1100 OOOw: mod TTT rIm: imm. 8-bit data 9/31 MN/MX,5 Instruction SHLD = Shift Left Double SHRD = Shift Right Double 3 MN/MX,4 TTT 100 101 register with immediate 00001111: 10TTT100: 11 reg2 reg1 : imm. 8-bit data 2 memory with immediate 00001111 : 1OTT T100 : mod reg : imm. 
8-bit data rIm 3 register by CL 0000 1111 : 1OTT T1 01 : 11 reg2 reg1 3 memorybyCL 00001111 : 1OTT T101 : mod reg rIm 4 BSWAP = Byte Swap 6 00001111: 11001 reg 6 5 1 XADD = Exchange and Add reg1, reg2 00001111 : 1100 OOOw: 11 reg2 reg1 3 memory, reg 00001111 : 1100 OOOw: mod reg rIm 4 6/2 U/L 2 6 CMPXCHG = Compare and Exchange reg1, reg2 0000 1111 : 1011 OOOw : 11 reg2 reg1 memory, reg 00001111 : 1011 OOOw: mod reg rIm 6 7110 CONTROL TRANSFER (within segment) Note: Times are jump takenlnot taken Jcccc = Jump on cccc 8-bit displacement 0111 tttn : 8-bit disp. 3/1 TINT,23 full displacement 0000 1111 : 1000 ttln : full displacement 3/1 TINT,23 Note: Times are jump takenlnot taken SETecce = Set Byte on cccc (Times are ccce true/false) reg 0000 1111 : 1001 ttln : 11 000 reg 4/3 memory 00001111 : 1001 ttln : mod 0000 rIm 3/4 I 2-267 Intel486™ PROCESSOR FAMILY Table 13-15. Clock Count Summary (Continued) Instruction Format Cache Hit Penalty if Cache Miss Notes CONTROL TRANSFER (within segment) (Continued) Mnemonic cccc Condition Overflow No Overflow Below/Not Above or Equal Not Belowl Above or Equal Equal Zero Not Equal/Not Zero Below or Equal/Not Above Not Below or Equal/ Above Sign Not Sign ParitylParity Even Not ParitylParity Odd Less ThanlNot Greater or Equal Not Less ThanlGreater or Equal Less Than or Equal/Greater Than Not Less Than or Equal/Greater Than 0 NO BINAE NB/AE E/Z NE/NZ BE/NA NBE/A S NS PIPE NP/PO lINGE NLIGE LE/NG NLE/G = ttln 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111 1110 0010 : 8-bit disp. 7/6 L/NL,23 Loop with ZerolEqual 1110 0001 : 8-bit disp. 9/6 LlNL,23 1110 0000 : 8-bit disp. 9/6 LINL,23 1110 0011 : 8-bit disp. 8/5 TINT,23 JECXZ = Jump on ECX Zero 1110 0011 : 8-bit disp. (Address Size Prefix Differentiates JCXZ for JECXZ) 8/5 T/NT,23 LOOP LOOP CX Times LOOPZILOOPE = LOOPNZ/LOOPNE JCXZ JMP = = Loop While Not Zero Jump on CX Zero = Unconditional Jump (within segment) Short 11101011 : 8-bit disp. 
3 7,23 Direct 1110 1001 : full displacement 3 7,23 Register Indirect 1111 111.1 : 11 100 reg 5 Memory Indirect 11111111 : mod 100 rim 5 CALL 7,23 5 7 = Call (within segment) Direct 1110 1000 : full displacement 3 7,23 Register Indirect 11111111: 11 010 reg 5 7,23 Memory Indirect 11111111 : mod 010 reg 5 RET 5 7 = Return from CALL (within segment) Adding Immediate to SP ENTER LEAVE 2-268 11000011 5 5 11000010: 16-bit disp. 5 5 = Enter Procedure Level = 0 Level = 1 Level (L) > 1 1100 1000 : 16-bit disp., 8-bit level = Leave Procedure 11001001 14 17 17+3L 5 8 1 I Intel486TM PROCESSOR FAMILY Table 13-15. Clock Count Summary (Continued) Instruction Format Cache Hit Penalty if Cache Miss 3/9 3/9 0/3 2/5 RV/P,9 Notes MULTIPLE-SEGMENT INSTRUCTIONS MOV = Move reg. to segment reg. 1000 1110 : 11 sreg3 reg memory to segment reg. 10001110: mod sreg3 rIm segment reg. to reg. 1000 1100: 11 sreg3 reg 3 segment reg. to memory 1000 1100: mod sreg3 rIm 3 segment reg. (ES, es, SS, or DS) OOOsreg 211 a 3 segment reg. (FS or GS) 0000 1111 : 10 sreg3001 3 = PUSH POP = RV/P,9 Push Pop segment reg. (ES, es, SS, or DS) OOOsreg 2111 3/0 2/5 RV/P,9 segment reg. 
(FS or GS) 0000 1111 : 10 sreg3001 3/9 2/5 RV/P,9 11000101 : mod reg rIm 6/12 7/10 RV/P,9 7/10 RV/P,9 LOS = LES = Load Pointer to ES 1100 0100 : mod reg rIm 6/12 LFS = Load Pointer to FS 00001111: 1011 0100: mod reg rIm 6/12 7/10 RV/P,9 Load Pointer to OS LGS = Load Pointer to GS 00001111 :1011 0101 : mod reg rIm 6/12 7/10 RV/P,9 LSS = Load Pointer to SS 0000 1111 : 1011 0010: mod reg rIm 6/12 7/10 RV/P,9 18 2 R,7,22 20 35 69 77+4X 37+TS 38+TS 3 6 17 17+n 3 3 P,9 P,9 P,9 P,11,9 P,10,9 P,10,9 CALL = Call Direct intersegment 1001 1010: unsigned full offset, selector to same level thru Gate to same level to inner level, no parameters to inner level, x parameters (d) words toTSS thru Task Gate Indirect intersegment 1111 1111 : mod all rIm to same level thru Gate to same level to inner level, no parameters to inner level, x parameters (d) words' toTSS thru Task Gate RET = 8 R,7 10 13 24 24+n 10 10 P,9 P,9 P,9 P,11,9 P,10,9 P,10,9 13 8 R,7 17 35 9 12 P,9 P,9 14 8 R,7 18 36 9 12 P,9 P,9 Return from CALL intersegment 11001010 to same level to outet lever intersegment adding imm. toSP to same level to outer level I 17 20 35 69 77+4X 37+TS 38+TS 1100 1010: 16-bit disp. 2-269 Intel486TM PROCESSOR FAMILY Table 13-15. Clock Count Summary (Continued) Instruction Format Penalty if Cache Miss Notes 17 2 R,7,22 19 32 42+TS 43+TS 3 6 3 3 P.9 P.9 P,10,9 P,10,9 13 9 R,7,9 18 31 41+TS 42+TS 10 13 10 10 P,9 P,9 P,10,9 P,10,9 Cache Hit MULTIPLE-SEGMENT INSTRUCTIONS (Continued) JMP = Unconditional Jump Direct intersegment 1110 1010 : unsigned full offset. selector to same level thru Call Gate to same level thruTSS thru Task Gate Indirect intersegment 1111 1111 : mod 011 rim to same level thru Call Gate to same level thru TSS thru Task Gate BIT MANIPULATION BT = Test Bit register, immediate 0000 1111 : 1011 1010 : 11 100 reg: imm. 8-bit data 3 memory, immediate 00001111: 10111010: mod 100 rim: imm. 
8-bit data 3 regl, reg2 00001111: 10100011 : 11 reg2 regl 3 memory, reg 00001111 : 10100011 : mod reg rim 8 Instruction BTS = Test Bit and Set BTR = Test Bit and Reset BTC = Test Bit and Compliment 1 2 TIT 101 110 111 register, immediate 00001111 : 1011 1010: 11 TIT reg imm. 8-bit data 6 memory, immediate 00001111 : 1011 1010: mod TIT rim imm. 8-bit data 8 regl, reg2 0000 1111 : tOTT TO 11 : 1 1 reg2 regl 6 memory, reg 00001111 : 10TTTOll : mod reg rim 13 u/L regl, reg2 0000 1111 : 1011 1100: 11 reg2 regl 6/42 MN/MX, 12 memory, reg 00001111 : 1011 1100: mod reg rim 7143 regl, reg2 00001111: 10111101: 11 reg2 regl 6/103 memory, reg 00001111 :10111101 : mod reg rim 7/104 BSF BSR u/L = Scan Bit Forward 2 MN/MX, 15 = Scan Bit Reverse 2-270 MN/MX, 14 1 MN/MX, 15 I Intel486™ PROCESSOR FAMILY Table 13·15. Clock Count Summary (Continued) Instruction Cache Hit Penalty If Cache Miss Notes 8 6 16 1010111w 5 2 Format STRING INSTRUCTIONS CMPS = Compare Byte Word LODS = Load Byte/Word to ALI AX/EAX 1010011w MOVS = Move Byte/Word 1010010w 7 2 SCAS = Scan Byte/Word 1010111w 6 2 STOS = Store Byte/Word from AL/ AX/EX XLAT = Translate String 1010101w 5 11010111 4 16 2 REPEATED STRING INSTRUCTIONS Repeated by Count in CX or ECX (C = Count in CX or ECX) REPE CMPS = Compare String (Find Non-match) 11110011: 1010011w C=O C>O REPNE CMPS = Compare String (Find Match) = Load String = Move String 16, 18 REPNE SCAS = Scan String (Find ALI AX/EAX) 16 16,19 5 7+5c 20 5 7+5c 20 1111 0010: 1010 111 w C=O C>O = Store String 1 11110011: 1010 111w C=O C>O I 5 7+4c 5 13 12+3c REPE SCAS = Scan String (Find Non·ALI AX/EAX) C=O C>O 16,17 11110010: 1010 010w C=O C=1 C>1 REP STOS 5 7+7c 11110010: 1010 110w C=O C>O REP MOVS 16,17 11110010: 1010011w C=O C>O REP LODS 5 7+7c 11110010: 1010 101w 5 7+4c 2-271 Intel486TM PROCESSOR FAMILY Table 13-15. 
Clock Count Summary (Continued) Instruction Format Cache Hit Penalty If Cache Miss Notes FLAG CONTROL CLC = Clear Carry Flag = Set Carry Flag CMC = Complement Carry Flag CLD = Clear Direction Flag STD = Set Direction Flag CLI = Clear Interrupt Enable Flag STI = Set Interrupt Enable Flag 11111000 2 STC 11111001 2 11110101 2 11111011 5 LAHF = Load AH Into Flag 1001 1111 3 SAHF = Store AH into Flag 10011110 2 PUSHF = Push Flags 1001 1100 4/3 RV/P POFF = Pop Flags 10011101 9/6 RV/P 00110111 3 11111100 2 1111 1101 2 11111010 5 DECIMAL ARITHMETIC AAA = ASCII Adjust to Add = ASCII Adjust for Subtract 0011 1111 3 11010100: 0000 1010 15 AAD = ASCII Adjust for Divide 11010101 : 0000 1010 = Decimal Adjust for Add 00100111 DAS = Decimal Adjust for Subtract 14 DAA 2 AAS AAM = ASCII Adjust for Multiply 00101111 2 PROCESSOR CONTROL INSTRUCTIONS HLT = Halt 11110100 4 I MOV = Move To and From Control/Debug/Test Registers CRO from register 0000 1111 : 00100010 : 11 000 reg 17 CR2/CR3 from register 0000 1111 : 00100010 : 11 eee reg 4 Reg from CRO·3 0000 1111 : 00100000: 11 eee reg 4 DRO-3 from register 0000 1111 : 00100011 : 11 eee reg 10 DR6-7 from register 00001111 : 0010 001.1 : 11 eee reg 10 Register from DR6-7 00001111 : 0010 0001 : 11 eee reg 9 Register from DRO-3 0000 1111 : 0010 0001 : 11 eee reg 9 TR3 from register 00001111 : 0010 0110: 11 011 reg 4 TR4-7 from register 00001111 :00100110: 11 eeereg 4 Register from TR3 00001111 :00100100:11 011 reg 3 Register from TR4-7 00001111 : 0010 0100: 11 eee reg 4 2-272 2 I Intel486TM PROCESSOR FAMILV Table 13-15. 
Clock Count Summary (Continued) Penalty Instruction Format Cache Hit if Cache Miss Notes PROCESSOR CONTROL INSTRUCTIONS (Continued) CPUID EAX EAX = CPU Identification = = 00001111 : 10100010 1 0, >1 14 9 CLTS = Clear Task Switched Flag 0000 1111 : 0000 0110 7 INVD = Invalidate Data Cache 00001111 :00001000 4 WBINVD = Write-Back and Invalidate Data Cache 00001111 : 0000 1001 INVLPG 2 5 = Invalidate TLB Entry INVLPG memory 00001111 : 0000 0001 : mod 111 rim H/NH 12/11 PREFIX BYTES Address Size Prefix LOCK = Bus Lock Prefix Operand Size Prefix 01100111 1 11110000 1 01100110 1 Segment Override Prefix CS: 00101110 1 DS: 00111110 1 ES: 00100110 1 FS: 01100100 1 GS: 01100101 1 SS: 00110110 1 PROTECTION CONTROL ARPL = Adjust Requested Privilege Level From register 01100011 : 11 regl reg2 9 From memory 01100011 : mod reg rim 9 From register 00001111 : 0000 0010: 11 regl reg2 11 3 From memory 00001111: 0000 0010: mod reg rim 11 5 00001111 : 0000 0001 : mod 010 rim 12 5 00001111 : 00000001 : mod 011 rim 12 5 Table register from reg. 0000 1111 : 0000 0000 : 11 010 reg 11 3 Table register from memo 00001111 : 00000000: mod 010 rim 11 6 LAR = Load Access Rights LGDT = Load Global Descriptor Table register LIDT = Load Interrupt Descriptor Table register LLDT I = Load Local Descriptor 2-273 Intel486TM PROCESSOR FAMIL V Table 13-15. 
Clock Count Summary (Continued) Penalty Instruction Format Cache Hit if Cache Notes Miss PROTECTION CONTROL (Continued) = Load Machine Status Word LMSW From register 0000 1111 : 0000 0001 : 11 110 reg 13 From memory 00001111 :00000001: mod 110r/m 13 1 From register 0000 1111 : 0000 0011 : 11 reg1 reg2 10 3 From memory 00001111 : 0000 0011 : mod reg rIm 10 6 From register 0000 1111 : 0000 0000 : 11 011 reg 20 From memory 00001111 : 0000 0000: mod 011 rIm 20 LSL = Load Segment Limit LTR = Load Task Register = Store Global Descriptor Table SGDT 00001111 : 0000 0001 : mod 000 rIm 10 = Store Interrupt Descriptor Table SlOT 00001111 : 0000 0001 : mod 001 rIm 2 = Store Local Descriptor Table SLOT To register 0000 1111 : 0000 0000 : 11 000 reg 2 To memory 0000 1111 : 0000 0001 : mod 000 rIm 3 SMSW = Store Machine Status Word To register 00001111: 0000 0001 : 11 000 reg 2 To memory 00001111 :00000001 : mod 100 rIm 3 STR = Store Task Register To register 00001111 : 0000 0000: 11 001 rIm 2 To memory 00001111 : 0000 0000 : mod 001 rIm 3 Register 00001111 :00000000: 11100r/m 11 3 Memory 00001111 : 0000 0000: mod 100 rIm 11 7 To register 00001111 :00000000:11101 rIm 11 3 To memory 00001111 : 0000 0000: mod 101r/m 11 7 VERR VERW = Verify Read Access = Verify Write Access INTERRUPT INSTRUCTIONS INTn = Interrupt Type n 11001101: type INT+ 410 RV/P,21 INT3 = Interrupt Type 3 11001100 INT+O 21 2-274 I Intel486TM PROCESSOR FAMILV Table 13-15. 
Clock Count Summary (Continued) Instruction Format Cache Hit Penalty If Cache Miss Notes INTERRUPT INSTRUCTIONS (Continued) INTO = Interrupt 4 if Overflow Flag Set 11001110 INT+2 3 Taken Not Taken BOUND = Interrupt 5 if Detect Value Out Range 01100010: mod reg rim 7 If in range If out of range IRET = Interrupt Return INT+ 24 7 7 21 21 1100 1111 Real Mode/virtual Mode Protected Mode To same level To outer level To nested task (EFLAGS.NT = 1) RSM 21 21 15 8 20 36 TS+32 11 19 4 9 9 9,10 = Exit System Management Mode 00001111 :10101010 5MBASE Relocation Auto HALT Restart I/O Trap Restart External Interrupt NMI = Non-Maskable Interrupt Page Fault 452 456 465 INT+ 11 21 INT+3 21 INT+ 24 21 INT+8 INT+8 INT+9 INT+9 INT+8 INT+9 21 21 INT+50 INT+51 21 21 INT+50 INT+51 INT+50 INT+50 INT+51 INT+51 21 21 21 21 21 21 VM86 Exceptions CLI STI INTn PUSHF POPF IRET IN Fixed Port Variable Port OUT Fixed Port Variable Port INS OUTS REP INS REP OUTS I 21 21 2·275 Intel486TM PROCESSOR FAMILY Table 13-16. Task Switch Clock Counts Value forTS Method Cache Hit Miss Penalty VM/lntel486 Processor/286 TSS to Intel486 ProcessorTSS 162 55 VM/lntel486 Processor/286 TSS to 286 TSS 144 31 VM/lntel486 Processor/286 TSS to VM TSS 140 37 Table 13-17. Interrupt Clock Counts Value for INT Method Cache Hit Miss Penalty Real Mode 26 2 Protected Mode Interrupt/Trap gate, same level Interrupt/Trap gate, different level Task Gate 44 71 6 17 3 9 9 9,10 17 3 10 37 + TS Notes Virtual Mode Interrupt/Trap gate, different level :Task Gate 82 37 + TS Abbreviations Definition 16/32 16/32 bit modes UlL MN/MX L/NL RV/P unlocked/locked minimum/maximum loop/no loop real and virtual mode/protected mode real mode protected mode taken/not taken hit/no hit R P T/NT HINH NOTES (for Tables 13-17 through 13-19): 1. Assuming that the. operand address and stack address fall in different cache sets. 2. Always locked, no cache hit case. 3. Clocks = 10 + max(log2(lml),n) 4. 
Clocks = {quotient(count/operand length)} * 7 + 9
   = 8 if count ≤ operand length (8/16/32)
5. Clocks = {quotient(count/operand length)} * 7 + 9
   = 9 if count ≤ operand length (8/16/32)
6. Equal/not equal cases (penalty is the same regardless of lock)
7. Assuming that addresses for memory read (for indirection), stack push/pop and branch fall in different cache sets.
8. Penalty for cache miss: add 6 clocks for every 16 bytes copied to new stack frame.
9. Add 11 clocks for each unaccessed descriptor load.
10. Refer to task switch clock counts table for value of TS.
11. Add 4 extra clocks to the cache miss penalty for each 16 bytes.

For notes 12-13: (b = 0-3, non-zero byte number); (i = 0-1, non-zero nibble number); (n = 0-3, non-zero bit number in nibble);
12. Clocks = 8 + 4(b + 1) + 3(i + 1) + 3(n + 1)
   = 6 if second operand = 0
13. Clocks = 9 + 4(b + 1) + 3(i + 1) + 3(n + 1)
   = 7 if second operand = 0

For notes 14-15: (n = bit position 0-31)
14. Clocks = 7 + 3(32 - n)
   = 6 if second operand = 0
15. Clocks = 8 + 3(32 - n)
   = 7 if second operand = 0
16. Assuming that the two string addresses fall in different cache sets.
17. Cache miss penalty: add 6 clocks for every 16 bytes compared. Entire penalty on first compare.
18. Cache miss penalty: add 2 clocks for every 16 bytes of data. Entire penalty on first load.
19. Cache miss penalty: add 4 clocks for every 16 bytes moved. (1 clock for the first operation and 3 for the second)
20. Cache miss penalty: add 4 clocks for every 16 bytes scanned. (2 clocks each for first and second operations)
21. Refer to interrupt clock counts table for value of INT.
22. Clock count includes one clock for using both displacement and immediate.
23. Refer to assumption 6 in the case of a cache miss.
24. Virtual Mode Extensions are disabled.
25. Protected Virtual Interrupts are disabled.

Table 13-18.
I/O Instructions Clock Count Summary Protected Protected Virtual Mode Mode 86 Notes (CPLS:IOPL) (CPL>IOPL) Mode Format Real Mode IN = Input from: Fixed Port Variable Port 1110 01 Ow : port number 1110110w 14 14 9 8 29 28 27 27 OUT = Output to: . Fixed Port Variable Port 1110 011w: port number 1110110w 16 16 11 10 31 30 29 29 INS = Input Byte/Word from OX Port 0110110w 17 10 32 30 OUTS = Output Byte/Word to OX Port 0110111w 17 10 32 30 1 1111 0010 : 0110 11 Ow 16+8c 10+8c 30+8c 29+8c 2 REP OUTS = Output String 11110010:0110111w 17+5c 11 +5c 31 +5c 30+5c 3 Instruction REP INS = Input String NOTES: 1. Two clock cache miss penalty in all cases. 2. c = count in CX or ECX. 3. Cache miss penalty in all modes: Add 2 clocks for every 16 bytes. Entire penalty on second operation. I 2-277 Intel486™ PROCESSOR FAMILY Table 13-19. Floating Point Clock Count Summary Cache Hit Instruction Format Avg(Lower Range •.• Upper Range) Concurrent Execution Penalty if Cache Avg(Lower Miss Range ••• Upper Range) Notes DATA TRANSFER FLO = Real Load to ST(O) 32-bit memory 64-bit memory SO-bit memory ST(i) 11.011.0.01 11.011101 11011011 11011.001 : mod .0.00 rim: s-i-b/disp. :modOOOr/m:s-i-b/disp. : mod 101 rim: s-i-b/disp. : 11000 ST(i) 3 3 6 4 2 3 4 14.5(13-16) 11.5(9-12) 16.8(10-1B) 2 2 3 4 4(2-4) 7.8(2-8) 75(7.0-1.03) 4 7.7(2-8) FILD = Integer Load to ST(O) l6-bit memory 32-bit memory 64-bit memory 11011 111 : mod 000 rim: s-i-b/disp. 11011011 : mod 000 rim: s-i-b/disp. 11011 111 : mod 101 rim: s-i-b/disp. FBLD = BCD Load to ST(O) 11011 111 : mod 1DO rim: s-i"b/disp. FS1 = Store Real from ST(O) 32-bit memory 64-bit memory ST(i) 11.011.011 : mod .010 rim: s-i-b/disp. 11.011 1.01 : mod .010 rim: s-i-b/disp. 11.011 101 : 11001 ST(i) FSTP = Store Real from ST(O) and Pop 11.011 .011 : mod .011 rim: s-i-b/disp. 32-bit memory 64-bit memory SO·bit memory ST(i) 11.011 1.01 : mod .011 rim: s-i-b/disp. 11.011 .011 : mod 111 rim: s-i-b/disp. 
11.0111.01: 11.0.01 ST(i) FIST = Store Integer from ST(O) 11.011 111 : mod .01.0 rim: s-i-b/disp. l6·bit memory 32·bit memory 11011 011 : mod .010 rim: s-i-b/disp. 7 8 3 1 2 7 8 6 3 1 2 33.4(29-34) 32.4(28-34) FISTP = Store Integer from ST(O) and Pop l6-bit memory 32·bit memory 64·bit memory 11.011 111 : mod .011 rim: s-i-b/disp. 11.011011: mod .011 rim: s-i-b/disp. 11.011111: mod 111 rim: s-i-b/disp. FBSTP = Store BCD from S1(0) and Pop 11.011111 : mod 110 rim: s-i-b/disp. FXCH = Exchange S1(0) and S1(1) 11011 .0.01 : 11.001 ST(i) 2-278 33.4(29-34) 33.4(29-34) 33.4(29-34) 175(172-176) 4 I Intel486™ PROCESSOR FAMILY Table 13·19. Floating Point Clock Count Summary (Continued) Cache Hit Instruction Format Avg(Lower Range •.. Upper Range) Concurrent Execution Penalty If Cache Avg(Lower Miss Range ••• Upper Range) Notes COMPARISON INSTRUCTIONS = Compare ST(O) with Real FCOM 32-bit memory 64-bit memory ST(i) 11011 000: mod 010 rim: s-i-b/disp. 11011 100: mod 010 rim: s-i-b/disp. 11011 000 : 11010 ST(i) = Compare ST(O) with Real and Pop 11011 000: mod 011 rim: s-i-b/disp. 64-bit memory 11011100: mod 011 rim: s-i-b/disp. 4 4 4 2 3 1 1 4 4 4 2 3 1 1 1 FCOMP 32-bit memory 11011 000: 11011 ST(i) ST(i) FCOMPP = Compare ST(O) with ST(1) and Pop Twice 11011110: 11011001 16-bit memory 32-bit memory 11011 110: mod 010 rim: s-i-b/disp. 11011 010: mod 010 rim: s-i-b/disp. 16-bit memory 32-bit memory 11011110: mod 011 rim: s-i-b/disp. 11011010: mod 011 rim: s-i-b/disp. 2 2 1 1 18(16-20) 16.5(15-17) 2 2 1 1 = Compare ST(O) with 0.0 11011 011 : 1110 0100 FUCOM = FUCOMP = 1 4 1 4 1 5 1 Unordered compare ST(O) with ST(I) and Pop 11011 101 : 11101 ST(i) FUCOMPP 4 Unordered compare ST(O) with ST(I) 11011 101 : 11100 ST(i) = Unordered compare ST(O) with ST(1) and Pop Twice 11011101: 111011001 = Examine ST(O) 11011 001 : 11100101 I 18(16-20) 16.5(15-17) = Compare ST(O) with Integer FICOMP FXAM 1 = Compare ST(O) with Integer FICOM FTST 5 8 2-279 Intel486TM PROCESSOR FAMIL V Table 13-19. 
Floating Point Clock Count Summary (Continued) Cache Hit Instruction Avg(Lower Range ... Upper Range) Format Concurrent Execution Penalty if Cache Avg (Lower Miss Range ... Upper Range) Notes CONSTANTS FLDZ = Load + 0.0 Into ST(O) 4 11011 001 : 1110 1110 : FLD1 = Load + 1.0 Into ST(O) 11011001: 1110 1000: 4 FLDP1 = Load 11' Into ST(O) 11011001: 1110 1011: 8 2 FLDL2T = Load log2(10) Into ST(O) 11011001: 1110 1001: 8 2 FLDL2E = Load log2(e) Into ST(O) 11011001: 1110 1010: 8 2 FLDLG2 = Load log10(2) Into ST(O) 11011 001 : 1110 1100 : 8 2 FLDLN2 = Load loge(2) Into ST(O) 11011001 :11101101: 8 2 ARITHMETIC FADD = Add Real with ST(O) ST(Ol ~ ST(Ol + 32·bit memory 11011 000: mod 000 ST(Ol ~ ST(Ol + 64-bit memory 11011 100: mod 000 ST(dl ~ ST(Ol rim: s-i-b/disp. 10(8-20) 2 7(5-17) rim: s-i-b/disp. 10(8-20) 3 7(5-17) + STeil 11011 dOO : 11000 ST(i) FADDP = Add real with ST(O) and Pop (ST(i) ~ ST(Ol 11011 110 : 11000 ST(i) : 10(8-20) 7(5-17) 10(8-20) 7(5-17) + ST(I» I Intel486TM PROCESSOR FAMILY Table 13-19. Floating Point Clock Count Summary (Continued) Concurrent Execution . Penalty if Avg(Lower Cache Avg(Lower Miss Range ... Range ..• Upper Upper Range) Range) Cache Hit Instruction Format Notes ARITHMETIC (Continued) FSUB = Subtract Real from ST(O) ST(O) ~ ST(O) - 32-bit memory 11011000: mod 100 rIm: s-i-b/disp. ST(O) ~ ST(O) - 64-bit memory ST(d) ~ ST(O) - ST(i) 11011 100: mod 100 rIm: s-i-b/disp. 11011 dOO : 11001 ST(i) ~ FSUBP = Subtract real from ST(O) and Pop (ST(i) 10(8-20) 2 7(5-17) 10(8-20) 3 7(5-17) 10(8-20) 7(5-17) 10(8-20) 7(5-17) ST(O)-ST(i» 11011110: 11001 ST(i) FSUBR = Subtract Real reversed (Subtract ST(O) from Real) ST(O) ~ 32-bit memory - ST(O) 11011 000: mod 101 ST(O) ~ 64-bit memory - ST(O) ST(d) ~ ST(i) - ST(O) 11011 100: mod 101 rim: s-i-b/disp. 
10(8-20) 2 7(5-17) rim: s-i-b/disp_ 10(8-20) 3 7(5-17) 11011 dOO: 11001 ST(i) FSUBRP = Subtract Real reversed and Pop (ST(i) ~ 10(8-20) 7(5-17) 10(8-20) 7(5-17) ST(i)-ST(O» 11011 110: 11100 ST(i) FMUL = Multiply Real with ST(O) ST(O) ~ ST(O) X 32-bit memory rim: s-i-b/disp. 11 2 8 . 11011100:mod001 r/m:s-i-b/disp. 14 3 11 11011000: mod 001 ST(O) ~ ST(O) X 64-bit memory ST(d) ~ ST(O) X ST(i) 11011 dOO: 11001 ST(i) FMULP = Multiply ST(O) with ST(i) and Pop (ST(i) ~ 16 13 16 13 ST(O) x ST(i» 11011110: 11001 ST(i) FDIV = Divide ST(O) by Real ST(O) ~ ST(O)/ 32-bit memory 11011000: mod 110 rim: s-i-b/disp. 73 2 70 3 11011 100: mod 110 rim: s-i-b/disp. 73 3 70 3 11011 dOO : 11111 ST(i) 73 70 3 73 70 3 ST(O) ~ ST(O)/ 64-bit memory ST(d) ~ ST(O)/ ST(i) FDIVP = Divide ST(O) by ST(i) and Pop (ST(i) ~ 11011110: 11111 ST(i) I ST(O)/ST(I» 2-281 Intel486TM PROCESSOR FAMILY Table 13·19. Floating Point Clock Count Summary (Continued) Cache Hit Instruction Avg(Lower Range ••. Upper Range) Format Concurrent Execution Penalty If Cache Avg(Lower Miss Range ••• Upper Range) Notes ARITHMETIC (Continued) FDIVR = Divide real reversed (Real/ST(O» ST(O) - 32-bit memoryl ST(O) ST(O) - 64-bit memoryl ST(O) ST(d) - ST(i)1 ST(O) 11011 000: mod 111 rIm: s-i-b/disp. 11011100: mod 111 rIm: s-i-b/disp. 11011 dOO: 11110 ST(i) FDIVRP = Divide real reversed and Pop (ST(I) - 73 2 70 3 73 3 70 3 73 70 3 73 70 3 ST(I)/ ST(O» 11011 110: 11110 ST(i) FIADD = Add Integer to ST(O) ST(O) - ST(O) + ST(O) - ST(O) + 32-bit memory 16-bit memory 11011110: mod 000 rIm: s-i-b/disp. 11011010: mod 000 rim: s-i-b/disp. 24(20-35) 2 7(5-17) 22.5(19-32) 2 7(5-17) 24(20-35) 2 7(5-17) 22.5(19-32) 2 7(5-17) 24(20-35) 2 7(5-17) 22.5(19-32) 2 7(5-17) 25(23-27) 2 8 23.5(19-32) 2 8 FISUB = Subtract Integer from ST(O) ST(D) - ST(O) - 16-bit memory ST(D) - ST(D) - 32-bit memory 11011110: mod 100 rIm: s-i-b/disp. 11011 010: mod 100 rim: s-i-b/disp. 
FISUBR = Integer Subtract Reversed ST(D) - 16-bit memory - ST(O) ST(O) - 32-bit memory ~ ST(D) 11011 110: mod 101 rim: s-i-b/disp. 11011010: mod 101 rim: s-i-b/disp. FIMUL = Multiply Integer with ST(O) ST(D) - ST(D) X 16-bit memory ST(D) - ST(D) X 32-bit memory 11011 110: mod 101 rim: s-i-b/disp. 11011010: mod 001 rim: s-i-b/disp. 2-282 I Intel486™ PROCESSOR FAMILY Table 13·19. Floating Point Clock Count Summary (Continued) Cache Hit Instruction Format Avg (Lower Range •.. Upper Range) Concurrent Execution Penalty if Cache Avg (Lower Miss Range •.. Upper Range) Notes ARITHMETIC (Continued) FIDIV = Integer Divide ST(O) +- ST(O)/ 16-bit memory 11011110: mod 110 rim: s-i-b/disp. 87(85-89) 2 70 3 85.5(84-86) 2 70 3 87(85-89) 2 70 3 85.5(84-86) 2 70 3 ST(O) +- ST(O)/ 32-bit memory 11011010: mod 110 rim: s-i-b/disp. FIDVR = Integer Divide Reversed ST(O) +- 16-bit memory/ST(O) 11011110: mod 111 rim: s-i-b/disp. ST(O) +- 32·bit memory/ST(O) 11011010: mod 111 rim: s-i-b/disp. FSQRT = Square Root 11011001: 11111010 FSCALE FXTRACT 2 19(16-20) 4(2-4) 84(70-138) 2(2-8) 94.5(72-167) 5.5(2-18) 29.1 (21-30) 7.4(2-8) = Partial Reminder 11011 001 : 1111 1000 FPREM1 = Partial Reminder (IEEE) •. FRNDINT 11011 001 : 1111 0101 = Round ST(O) to Integer 11011 001 : 1111 1100 = Absolute value of ST(O) 11011 001 : 1110 0001 FCHS 31(30-32) = Extract Components of ST(O) 11011001: 1111 0100 FABS 70 = Scale ST(O) by ST(1) 11011001: 11111101 FPREM 85.5(83-87) 3 = Change Sign of ST(O) 11011 001 : 11100000 6 TRANSCENDENTAL FCOS = Cosine of ST(O) 11011001: 11111111 FPTAN 2 6,7 244(200-273) 70 6,7 289(218-303) 5(2-17) 6 = Partial Tangent of ST(O) 11011 001 : 1111 0010 FPATAN = Partial Arctangent 11011001: 11110011 I 241 (193-279) 2-283 Intel486TM PROCESSOR FAMILY Table 13·19. Floating Point Clock Count Summary (Continued) Cache Hit Instruction Format Avg(Lower Range •.• Upper Range) Concurrent Execution Penalty if Cache Avg(Lower Miss Range •.. 
Upper Range) Notes TRANSCENDENTAL (Continued) FSIN = Sine of ST(O) 241 (193·279) 2 6,7 11011 001 : 1111 1011 291 (243·329) 2 6,7 11011 001 : 1111 0000 242(140-279) 2 6 311 (196-329) 13 6 313(171-326) 13 6 11011 001 : 1111 1110 = FSINCOS = F2XM 1 FYL2X Sine and Cosine of ST(O) 2ST(O) - 1 = ST(1) x log2(ST(0)) 11011 001 : 1111 0001 = FYL2XP1 ST(1) x 1092(ST(0) + 1.0) 11011 001 : 1111 1001 PROCESSOR CONTROL FINIT = Initialize FPU 11011 001 : 1110 0011 FSTSW AX = = = = = 5 4 2 3 5 7 4 67 67 56 56 4 4 4 4 Clear exceptions 11011 011 : 11100010 FSTENV 3 Store control word 11011 001 : mod 111 rim: s-i-b/disp. FCLEX 5 Load control word 11011001 : mod 101 rim: s-i-b/disp. FSTCW 3 Store status word Into memory 11011101: mod 111 rim: s-i-b/disp. FLDCW 4 Store status word into AX 11011 111 : 11100000 FSTSW 17 = Store environment 11011 011 : mod 110 rim: s-i-b/disp. Real and Virtual Modes 16-bit address Real and Virtual Modes 32-bit address Protected Mode 16-bit address Protected Mode 32-bit address FLDENV = Load Environment 11011011: mod 100 rim: s-i-b/disp. Real and Virtual Modes 16-bit address Real and Virtual Modes 32-bit address Protected Mode 16-bit address Protected Mode 32-bit address 2-284 44 44 34 34 2 2 2 2 I Intel486™ PROCESSOR FAMILY Table 13-19. Floating Point Clock Count Summary (Continued) Cache Hit Instruction Format Avg(Lower Range ••. Upper Range) Concurrent Execution Penalty If Cache Avg(Lower Miss Range ••• Upper Range) Notes PROCESSOR CONTROL (Continued) FSAVE = Save State 11011101 : mod 110 rim: s·i·b/disp. 
  Real and Virtual Modes, 16-bit address:  154 clocks (Note 4)
  Real and Virtual Modes, 32-bit address:  154 clocks (Note 4)
  Protected Mode, 16-bit address:          143 clocks (Note 4)
  Protected Mode, 32-bit address:          143 clocks (Note 4)

FRSTOR = Restore State    11011101 : mod 100 r/m : s-i-b/disp.
  Real and Virtual Modes, 16-bit address:  131 clocks; penalty if cache miss 23
  Real and Virtual Modes, 32-bit address:  131 clocks; penalty if cache miss 27
  Protected Mode, 16-bit address:          120 clocks; penalty if cache miss 23
  Protected Mode, 32-bit address:          120 clocks; penalty if cache miss 27

FINCSTP = Increment Stack Pointer           11011 001 : 1111 0111     3 clocks
FDECSTP = Decrement Stack Pointer           11011 001 : 1111 0110     3 clocks
FFREE = Free ST(i)                          11011 101 : 11000 ST(i)   3 clocks
FNOP = No Operations                        11011 101 : 1101 0000     3 clocks
WAIT = Wait until FPU ready (min/max)       10011011                  1/3 clocks

NOTES:
1. If operand is 0 clock counts = 27.
2. If operand is 0 clock counts = 28.
3. If CW.PC indicates 24-bit precision then subtract 38 clocks. If CW.PC indicates 53-bit precision then subtract 11 clocks.
4. If there is a numeric error pending from a previous instruction add 17 clocks.
5. If there is a numeric error pending from a previous instruction add 18 clocks.
6. The INT pin is polled several times while this function is executing to ensure short interrupt latency.
7. If ABS(operand) is greater than π/4 then add n clocks, where n = (operand/(π/4)).

14.0 DIFFERENCES BETWEEN INTEL486 PROCESSORS AND INTEL386 PROCESSORS

The differences between Intel486 processors and Intel386 processors are due to performance enhancements. The differences are listed below.

1. Instruction clock counts have been reduced to achieve higher performance. (See section 13.0, "Instruction Set Summary.")

2. The Intel486 processor bus is significantly faster than the Intel386 processor bus. Differences include a 1X clock, parity support, burst cycles, cacheable cycles, cache invalidate cycles and 8-bit bus support. The Hardware Interface and Bus Operation sections (sections 9.0 and 10.0) of the data sheet should be read carefully to understand the Intel486 processor bus functionality.

3. To support the on-chip cache, bits have been added to control register 0 (CD and NW) (see section 4.2.3.1, "Control Registers"), new pins have been added to the bus (see section 9.0, "Hardware Interface") and new bus cycle types have been added (see section 10.0, "Bus Operation"). The on-chip cache needs to be enabled after reset by clearing the CD and NW bits in CR0.

4. Eight new instructions have been added:
   • Byte Swap (BSWAP)
   • Exchange-and-Add (XADD)
   • Compare and Exchange (CMPXCHG)
   • Invalidate Data Cache (INVD)
   • Write-back and Invalidate Data Cache (WBINVD)
   • Invalidate TLB Entry (INVLPG)
   • Processor Identification (CPUID)
   • Resume (RSM)

5. Two new bits (PCD and PWT) are defined in control register 3, the page table entries and the page directory entries. (See section 6.4.2.5, "Page Directory/Table Entries.")

6. A page protection feature has been added. This feature required a new bit in control register 0 (WP). (See sections 4.2.3.1, "Control Registers," and 6.4.3, "Page Level Protection.")

7. An Alignment Check feature has been added. This feature required a bit in the flags register (AC) (section 4.2.2.3, "Flags Register") and a bit in control register 0 (AM) (section 4.2.3.1, "Control Registers").

8. The replacement algorithm for the translation lookaside buffer has been changed from a random algorithm to a pseudo least recently used algorithm like that used by the on-chip cache. (See section 7.5, "Cache Replacement," for a description of the algorithm.)

9. Three testability registers, TR3, TR4 and TR5, have been added for testing the on-chip cache. TLB testability has been enhanced. (See section 11.0, "Testability.")

10. The prefetch queue has been increased from 16 bytes to 32 bytes. A jump always needs to execute after modifying code to guarantee correct execution of the new instruction.

11. After reset, the ID in the upper byte of the DX register is 04.
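The data-manipulation instructions in item 4 can be illustrated with a small behavioral model. The sketch below is not taken from this data sheet: the function names are hypothetical, only the 32-bit register forms are modeled, and flag effects other than ZF for CMPXCHG are ignored. It assumes the generally documented semantics of BSWAP, XADD and CMPXCHG.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit registers only (assumption of this sketch)


def bswap(value):
    """BSWAP r32: reverse the byte order of a 32-bit value."""
    value &= MASK32
    return int.from_bytes(value.to_bytes(4, "little"), "big")


def xadd(dest, src):
    """XADD dest, src: dest receives dest + src, src receives the old dest.

    Returns (new_dest, new_src).
    """
    return (dest + src) & MASK32, dest & MASK32


def cmpxchg(accumulator, dest, src):
    """CMPXCHG dest, src: compare the accumulator (EAX) with dest.

    If equal, dest receives src and ZF is set; otherwise the accumulator
    receives dest and ZF is cleared.
    Returns (new_accumulator, new_dest, zf).
    """
    if accumulator == dest:
        return accumulator, src, True
    return dest, dest, False


if __name__ == "__main__":
    print(hex(bswap(0x12345678)))   # byte-reversed value
    print(xadd(5, 3))               # (sum, old destination)
    print(cmpxchg(1, 1, 9))         # match: destination updated
    print(cmpxchg(2, 1, 9))         # mismatch: accumulator updated
```

A model like this is useful only for reasoning about results; it says nothing about clock counts or bus locking behavior.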
14.1 Differences between the Intel386 Processor with an Intel387™ Math CoProcessor and Intel486 DX, IntelDX2 and IntelDX4 Processors

In addition to the previously mentioned enhancements, the Intel486 DX, IntelDX2 and IntelDX4 processors offer the following features:

1. The complete Intel387 math coprocessor instruction set and register set have been added. No I/O cycles are performed during Floating Point instructions. The instruction and data pointers are set to 0 after FINIT/FSAVE. Interrupt 9 can no longer occur; interrupt 13 occurs instead.

2. Floating point error reporting modes are supported to guarantee DOS compatibility. These modes require a bit in control register 0 (NE) (see section 4.2.3.1, "Control Registers") and pins (FERR# and IGNNE#). (See sections 9.2.15, "Numeric Error Reporting," and 10.2.14, "Floating Point Error Handling.")

3. In some cases FERR# is asserted when the next floating point instruction is encountered and in other cases it is asserted before the next floating point instruction is encountered, depending upon the execution state of the instruction causing the exception. (See sections 9.2.15, "Numeric Error Reporting," and 10.2.14, "Floating Point Error Handling.") For both of these cases, the Intel387 math coprocessor asserts ERROR# when the error occurs and does not wait for the next floating point instruction to be encountered.

4. The contents of the base registers, including the floating point registers, may be different after reset.

15.0 DIFFERENCES BETWEEN THE PGA, SQFP AND PQFP VERSIONS OF THE INTEL486 SX AND INTEL486 DX PROCESSORS

This section lists the differences between the PGA, SQFP, and PQFP packages of the Intel486 SX and Intel486 DX processors. It also provides a quick pin reference table that is useful for converting a system design from one that uses a PGA package to one that uses an SQFP or PQFP package.
NOTE:
The boundary scan feature is not supported in the Intel486 SX processor in the 168-pin PGA package. See sections 3.0, "Pin Description," and 11.5, "Intel486 Processor Boundary Scan," for pinout differences.

15.1 2X Clock Mode

The Intel486 processors offer 2X clock mode for systems that rely on dynamic frequency scaling for processor power management. This product is not intended for the desktop computer. This 2X clock processor differs from the 1X clock processor in the following ways:

Pin Assignment/Function: The 2X clock product has a CLK2 input, rather than the 1X clock product's CLK input. The CLK2 input must be synchronized to the system phase using the falling edge of RESET. (For reference, the pinout change from the existing Low Power Intel486 DX and SX processors is also shown. The CLKSEL pin is not used on the Intel486 processors, as it is on the existing Low Power Intel486 DX and SX processors.)

Clock Control: The CLK2 input can be changed dynamically. The Stop Clock interrupt is handled in a different manner.

AC Specifications: In general, the AC specifications for the 2X clock device will have slightly longer setups, holds, and maximum valid delays. This is consistent with the existing Low Power Intel486 DX and Intel486 SX processors. See section 15.1.5, "AC Specifications," for 2X clock mode AC specifications.

Upgrades: There are no end user upgrade products planned for the 2X clock mode product. The UP# function is still provided for use by system designers that offer Intel486 SX to Intel486 DX processor upgrade cards.

This section explains the differences between the processor with the 2X clock mode and the processor with the 1X clock mode.

15.1.1 PIN ASSIGNMENTS

The Intel486 processor with the 2X clock option is available in the 208-lead SQFP and 196-lead PQFP packages. The pinout is identical to the Intel486 processor with the 1X clock option with the exception of the name of the clock input.
The 1X clock input is called CLK and the 2X clock input is called CLK2. Table 15-1 shows the changes between the existing products and new products. Table 15-2 is a list of pin descriptions.

Table 15-1. Pinout Differences for the 2X Clock Mode (Low Power) Processors (196-lead PQFP Package)

Pin    Low Power        Intel486 SX    Low Power        Intel486
       Intel486™ SX     Processor      Intel486 DX      Processor
       Processor                       Processor
75     NC               STPCLK#        NC               STPCLK#
77     NC               NC             IGNNE#           IGNNE#
81     NC               NC             FERR#            FERR#
85     NC               SMI#           NC               SMI#
92     NC               SMIACT#        NC               SMIACT#
94     NC               SRESET         NC               SRESET
127    CLKSEL           NC             CLKSEL           NC

15.1.2 QUICK PIN REFERENCE

Table 15-2. Pin Descriptions

Symbol   Type   Name and Function

CLK2     I      CLK2 provides the fundamental timing for the processor. Both of the internal timing phases, phase-1 (ph1) and phase-2 (ph2), are provided by the external CLK2 input. All external timing parameters are specified with respect to the phase-1 rising edge of CLK2.

RESET    I      The RESET input forces the processor to begin execution at a known state. The processor cannot begin execution of instructions until at least 1 ms after VCC and CLK2 have reached their proper AC and DC specifications. However, for soft resets, RESET should remain active for at least 30 CLK2 periods (equal to 15 internal processor CLK periods). The RESET pin should remain active during this time to ensure proper processor operation. RESET is active HIGH. RESET is asynchronous, but must meet setup and hold times t20, t20a and t21 for recognition in any specific clock. For the 2X clock mode, the CLK2 frequency is twice the frequency of the processor.
                RESET sets the SMBASE descriptor to the default address of 30000H. If the system uses SMBASE relocation, then the SRESET pin should be used for soft resets.
                For the 2X clock mode, the falling edge of RESET synchronizes the processor internal clock phase. RESET must be used at power up and any time the phase of the processor clock must be re-synchronized to the system phase.
SRESET   I      The SRESET pin duplicates all the functionality of the RESET pin with the following two exceptions:
                1. The SMBASE register will retain its previous value.
                2. If UP# is asserted, SRESET will not have an effect on the host processor.
                For soft resets, SRESET should remain active for at least 30 CLK2 periods (equal to 15 internal processor CLK periods). SRESET is active HIGH. SRESET is asynchronous but must meet setup and hold times t20 and t21 for recognition in any specific clock.

15.1.3 CLOCK CONTROL

15.1.3.1 Clock Generation

The frequency of CLK2 is twice the internal frequency of the processor. The internal clock consists of two phases, "PH1" and "PH2". Each CLK2 period is a phase of the internal clock. Figure 15-1 illustrates the relationship between the CLK2 input and the internal phases.

All set-up, hold, float-delay and valid delay timings are referenced to the rising edge of phase 1 of CLK2. Thus it is important to synchronize the external circuitry with the phase of the CLK2 input. The internal processor clock phase is determined at the falling edge of the RESET input. RESET must meet the specified setup and hold times to correctly synchronize the internal clock phase. See Figure 15-2.

Figure 15-1. CLK2 Signal and Internal Processor Clock (waveforms: CLK2 period; internal processor clock period; internal CPU CLK at half the frequency of CLK2)

Figure 15-2. CLK2 and Internal Processor CLK vs. SRESET and RESET Timings (waveforms: internal CPU CLK, CLK2, RESET, SRESET; annotations: "15 CLK min" and "This edge has no relationship to internal clock phase")

15.1.3.2 Stop Clock

The processor with the 2X clock option does not rely on an internal Phase Lock Loop to generate the internal phase clocks. Therefore, the frequency of the CLK2 input can be changed dynamically or "on-the-fly."
The 2X clock mode Intel486 processor provides an interrupt mechanism, STPCLK#, that places the processor into a known state. Although the frequency of the CLK2 input can be dynamically changed between 0 MHz and the maximum operating frequency of the processor, operation between 0 MHz and 8 MHz is not tested. Stopping the CLK2 input with the processor in a known state requires use of the STPCLK# mechanism. When the processor recognizes a STPCLK# request, the processor will stop execution on the next instruction boundary, stop the prefetcher, empty all internal pipelines and the write buffers, and then generate a Stop Grant bus cycle. At this point the processor is in the Stop Grant state.

The rising edge of STPCLK# will tell the processor that it can return to program execution at the instruction following the interrupted instruction.

Unlike the normal interrupts, INTR and NMI, the STPCLK# interrupt does not initiate interrupt acknowledge cycles or interrupt vector table reads.

STPCLK# is active LOW and is provided with an internal pull-up resistor. STPCLK# is an asynchronous signal, but must remain active until the processor issues the Stop Grant bus cycle. STPCLK# may be de-asserted at any time after the processor has issued the Stop Grant bus cycle. Note that STPCLK# should NOT be de-asserted before the processor has issued the Stop Grant bus cycle. STPCLK# must be de-asserted for a minimum of 5 clocks after RDY# is returned active for the Stop Grant bus cycle before being asserted again.

15.1.3.3 Clock Control State Diagram

The state diagram in Figure 15-3 shows the Stop Clock state transitions for the 2X clock mode.
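The Stop Clock transitions covered by Figure 15-3 and the state descriptions in this section can be sketched as a small state machine. This is a minimal model under stated assumptions, not Intel code: the class name, event names and the bus cycle log are hypothetical, only the bus cycles named in this section are logged, and the hold/invalidate activity the processor can still perform in the Stop Grant state is not modeled.

```python
from enum import Enum


class State(Enum):
    NORMAL = 1
    STOP_GRANT = 2
    STOP_CLOCK = 3
    HALT = 4


class StopClockModel:
    """Hypothetical model of the 2X clock mode Stop Clock state machine."""

    def __init__(self):
        self.state = State.NORMAL
        self._resume = State.NORMAL   # state to return to when STPCLK# is de-asserted
        self.bus_cycles = []          # Stop Grant / HALT bus cycles named in the text

    def event(self, name):
        s = self.state
        if name == "halt_instruction" and s is State.NORMAL:
            self.state = State.HALT
        elif name == "interrupt_or_reset" and s is State.HALT:
            self.state = State.NORMAL
        elif name == "stpclk_assert" and s in (State.NORMAL, State.HALT):
            # Entering Stop Grant from Normal or HALT generates a Stop Grant bus cycle.
            self._resume = s
            self.state = State.STOP_GRANT
            self.bus_cycles.append("STOP_GRANT")
        elif name == "stpclk_deassert" and s is State.STOP_GRANT:
            # Return to where the processor came from; re-entering HALT
            # generates a HALT bus cycle.
            self.state = self._resume
            if self._resume is State.HALT:
                self.bus_cycles.append("HALT")
        elif name == "clk2_stop" and s is State.STOP_GRANT:
            self.state = State.STOP_CLOCK
        elif name == "clk2_start" and s is State.STOP_CLOCK:
            # Re-entering Stop Grant from Stop Clock generates no bus cycle.
            self.state = State.STOP_GRANT
        else:
            raise ValueError(f"event {name!r} is not modeled in state {s.name}")
        return self.state


if __name__ == "__main__":
    cpu = StopClockModel()
    for ev in ("halt_instruction", "stpclk_assert", "clk2_stop",
               "clk2_start", "stpclk_deassert"):
        print(ev, "->", cpu.event(ev).name)
    print(cpu.bus_cycles)
```

Rejecting unmodeled events with an exception is a design choice of the sketch; it makes it obvious, for example, that CLK2 may be stopped only from the Stop Grant state.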
Figure 15-3. Stop Clock State Machine, 2X Clock Mode (states: 1 Normal Execution, 2 Stop Grant State, 3 Stop Clock State, 4 Halt State)

Normal State

This is the normal operating state of the processor. During this state, the CLK2 input frequency can be changed dynamically or "on-the-fly" for power consumption control with no clock control latency. This capability provides a wide range of performance/power consumption options. Operation of the processor is tested between 8 MHz and the maximum operating frequency of the processor. Operation below 8 MHz is guaranteed by design, though it is not 100% tested. Operation at 0 MHz is tested when the stop clock protocol (STPCLK#) is used.

Stop Grant State

The processor enters the Stop Grant state in response to a STPCLK# interrupt. The processor will generate a Stop Grant bus cycle when it enters this state from the Normal state or the HALT state. The processor will not generate a Stop Grant bus cycle when it enters the Stop Grant state from the Stop Clock state.

While in the Stop Grant state, the pull-up resistors on STPCLK# and UP# are disabled internally. The system must continue to drive these inputs to the state they were in immediately before the processor entered the Stop Grant state. For minimum processor power consumption, all other input pins should be driven to their inactive level while the processor is in the Stop Grant state.

The interrupt signals (SMI#, NMI, and INTR) will be recognized and serviced correctly if the input is held in the active state until the processor returns to the Stop Grant state. The Intel486 processor family requires that INTR be held active until the processor issues an interrupt acknowledge cycle in order to guarantee recognition. This condition also applies to the existing Intel486 processors.

During the Stop Grant state, the processor will respond to HOLD, AHOLD and BOFF# normally and can perform cache invalidates. An active edge on either the SMI# or NMI interrupt will be latched and will be serviced after the rising edge of STPCLK#. An INTR request will be serviced after the processor returns to the Normal state as long as INTR is held active until the processor issues an interrupt acknowledge bus cycle.

Stop Clock State

The processor enters the Stop Clock state when the system stops the CLK2 input. The system can stop the CLK2 input on either a logic high or a logic low. The CLK2 input must be restarted in the same state as when it was stopped. In other words, any CLK2 input state can be stretched. (See Figure 15-4 for details.) Processor operation at 0 MHz is tested only when the processor is in the Stop Clock state.

In the Stop Clock state, the processor does not respond to any stimulus. The processor must re-enter the Stop Grant state (CLK2 input must be restarted) in order to perform any bus actions such as HOLD/HLDA cycles, invalidates (AHOLD/EADS# or FLUSH# cycles), and BOFF#. It is recommended that CLK2 be restarted 2 clocks before and continue until 2 clocks after the transition of the HOLD, AHOLD, EADS#, FLUSH#, or BOFF# signals.

HALT State

The processor enters the HALT state from the Normal state on a HALT instruction. The system can place the processor into the Stop Grant state from the HALT state by asserting the STPCLK# input. The processor will generate a Stop Grant bus cycle when it enters the Stop Grant state. If the processor entered the Stop Grant state from the HALT state then it will return to the HALT state when the STPCLK# interrupt is de-asserted. When the processor re-enters the HALT state it will generate a HALT bus cycle.

Figure 15-4. CLK2 Phase Coherence in CLK2 Stop and Restart

15.1.3.3 Supply Current Model for Stop Clock Modes and Transitions

Figure 15-5 illustrates the effect of different Stop Clock state transitions on the supply current.

Figure 15-5. Supply Current Model for Stop Clock Modes and Transitions (ICC versus CLK2 frequency; A: change CLK2 dynamically, Normal state; B: not tested; C: stop and restart CLK2 within the Stop Grant state, fully tested; Stop Clock state ICC max = 1 mA at 3.3V, 2 mA at 5V; starred values are guaranteed by design characterization, not 100% tested)

15.1.4 DC SPECIFICATIONS FOR 2X CLOCK OPTION

For 2X clock DC specifications, refer to the 1X clock DC specifications. (See section 17.3, "DC Specifications.")

15.1.5 AC SPECIFICATIONS FOR 2X CLOCK OPTION

The AC specifications given in the tables of this section consist of output delays, input setup requirements and input hold requirements. All AC specifications are relative to the rising edge of phase 1 of the input system clock (CLK2), unless otherwise specified. (See Figures 15-6 through 15-8.)

15.1.5.1 3.3V AC Characteristics

Table 15-3 is for 25- and 33-MHz Intel486 SX and 33-MHz Intel486 DX processors in 2X Clock Mode.

Table 15-3.
3.3V AC Characteristics (2X Clock)

Functional operating range: VCC = 3.3V ±0.3V; TCASE = 0°C to +85°C; CL = 50 pF, unless otherwise specified

                                                          25 MHz       33 MHz
Symbol  Parameter                                         Min   Max    Min   Max    Unit  Notes
        Frequency                                               25           33     MHz   1
        CLK2 Frequency                                          50           66     MHz   1
t1      CLK2 Period                                       20           15           ns
t2      CLK2 High Time                                    7            5            ns    at 2V
t3      CLK2 Low Time                                     7            5            ns    at 0.8V
t4      CLK2 Fall Time                                          2            2      ns    2V to 0.8V, Note 2
t5      CLK2 Rise Time                                          2            2      ns    0.8V to 2V, Note 2
t6      A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#,      3     19     3     17     ns    3
        ADS#, LOCK#, BREQ, HLDA, SMIACT#, FERR#
        Valid Delay
t7      A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#,            28           21     ns    2
        ADS#, LOCK#, BREQ, HLDA Float Delay
t8      PCHK# Valid Delay                                 3     24     3     23     ns
t8a     BLAST#, PLOCK# Valid Delay                        3     24     3     21     ns
t9      BLAST#, PLOCK# Float Delay                              28           21     ns    2
t10     D0-D31, DP0-DP3 Write Data Valid Delay            3     20     3     19     ns
t11     D0-D31, DP0-DP3 Write Data Float Delay                  28           21     ns    2
t12     EADS# Setup Time                                  9            6            ns
t13     EADS# Hold Time                                   4            4            ns
t14     KEN#, BS16#, BS8# Setup Time                      9            6            ns
t15     KEN#, BS16#, BS8# Hold Time                       4            4            ns
t16     RDY#, BRDY# Setup Time                            9            6            ns
t17     RDY#, BRDY# Hold Time                             4            4            ns
t18     HOLD, AHOLD Setup Time                            11           7            ns
t18a    BOFF# Setup Time                                  11           9            ns
t19     HOLD, AHOLD, BOFF# Hold Time                      4            4            ns
t20     FLUSH#, A20M#, NMI, INTR, SMI#, STPCLK#,          11           6            ns    3
        SRESET, RESET, IGNNE# Setup Time
t20a    RESET Falling Edge Setup Time                     9            5            ns
t21     FLUSH#, A20M#, NMI, INTR, SMI#, STPCLK#,          4            4            ns    3
        SRESET, RESET, IGNNE# Hold Time
t22     D0-D31, DP0-DP3, A4-A31 Read Setup Time           6            6            ns
t23     D0-D31, DP0-DP3, A4-A31 Read Hold Time            4            4            ns

NOTES:
1. 0-MHz operation is tested using the STPCLK# and Stop Grant bus cycle protocol. Operation between 0 MHz < CLK2 < 8 MHz is guaranteed by design characterization, but is not 100% tested.
2. Not 100% tested, guaranteed by design characterization.
3. FERR# and IGNNE# are present only in Intel486 DX processors.

15.1.5.2 5V AC Characteristics

Table 15-4 is for 25- and 33-MHz Intel486 SX and 33-MHz Intel486 DX processors in 2X Clock Mode.

Table 15-4. 5V AC Characteristics (2X Clock)

Functional operating range: VCC = 5V ±5%; TCASE = 0°C to +85°C; CL = 50 pF, unless otherwise specified.

                                                          25 MHz       33 MHz
Symbol  Parameter                                         Min   Max    Min   Max    Unit  Notes
        Frequency                                               25           33     MHz   1
        CLK2 Frequency                                          50           66     MHz   1
t1      CLK2 Period                                       20           15           ns
t2      CLK2 High Time                                    7            5            ns    at 2V
t3      CLK2 Low Time                                     7            5            ns    at 0.8V
t4      CLK2 Fall Time                                          2            2      ns    2V to 0.8V, Note 2
t5      CLK2 Rise Time                                          2            2      ns    0.8V to 2V, Note 2
t6      A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#,      3     19     3     17     ns    3
        ADS#, LOCK#, BREQ, HLDA, SMIACT#, FERR#
        Valid Delay
t7      A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#,            28           21     ns    2
        ADS#, LOCK#, BREQ, HLDA Float Delay
t8      PCHK# Valid Delay                                 3     24     3     23     ns
t8a     BLAST#, PLOCK# Valid Delay                        3     24     3     21     ns
t9      BLAST#, PLOCK# Float Delay                              28           21     ns    2
t10     D0-D31, DP0-DP3 Write Data Valid Delay            3     20     3     19     ns
t11     D0-D31, DP0-DP3 Write Data Float Delay                  28           21     ns    2
t12     EADS# Setup Time                                  9            6            ns
t13     EADS# Hold Time                                   4            4            ns
t14     KEN#, BS16#, BS8# Setup Time                      9            6            ns
t15     KEN#, BS16#, BS8# Hold Time                       4            4            ns
t16     RDY#, BRDY# Setup Time                            9            6            ns
t17     RDY#, BRDY# Hold Time                             4            4            ns
t18     HOLD, AHOLD Setup Time                            11           7            ns
t18a    BOFF# Setup Time                                  11           9            ns
t19     HOLD, AHOLD, BOFF# Hold Time                      4            4            ns
t20     FLUSH#, A20M#, NMI, INTR, SMI#, STPCLK#,          11           6            ns
        SRESET, RESET, IGNNE# Setup Time
t20a    RESET Falling Edge Setup Time                     9            5            ns
t21     FLUSH#, A20M#, NMI, INTR, SMI#, STPCLK#,          4            4            ns    3
        SRESET, RESET, IGNNE# Hold Time
t22     D0-D31, DP0-DP3, A4-A31 Read Setup Time           6            6            ns    3
t23     D0-D31, DP0-DP3, A4-A31 Read Hold Time            4            4            ns    3

NOTES:
1. 0-MHz operation is tested using the STPCLK# and Stop Grant bus cycle protocol. Operation between 0 MHz < CLK2 < 8 MHz is guaranteed by design characterization, but is not 100% tested.
2. Not 100% tested, guaranteed by design characterization.
3. FERR# and IGNNE# are present only in Intel486 DX processors.

Figure 15-6. CLK2 Waveform (phases ph1 and ph2 of CLK2 with timings t1 through t5)

Figure 15-7. Valid and Float Delay Timings (output valid delay tx = t6, t8, t8a, t10; output float delay ty = t7, t9, t11)

Figure 15-8. Setup and Hold Timings (setup tx = t12, t14, t16, t18, t18a, t20, t20a, t22; hold ty = t13, t15, t17, t19, t21, t23)

16.0 OverDrive™ Processor Socket

This section contains the specifications for the OverDrive processor socket for systems based on Intel486 processors. All of the specifications described herein are based on the specifications of the Intel486 processors.

One of the most important features of the Intel486 family architecture, compared with previous Intel architectures, is its upgradability via the OverDrive processor socket. Inclusion of the OverDrive processor socket in systems based on the Intel486 family of microprocessors provides the end user with an easy and cost-effective way to increase system performance. The paradigm of simply installing an additional component into an empty OverDrive processor socket to achieve enhanced system performance is familiar to the millions of end users and dealers who have purchased Intel math coprocessor upgrades to boost system floating point performance.

The OverDrive processor provides improvement over the base system. The OverDrive processor takes advantage of Intel's next generation processor technology to provide this performance improvement.
The OverDrive processor socket described in this chapter is designed for the Pentium OverDrive processor. The Pentium OverDrive processor is designed with Pentium processor core technology. This socket is also backwards compatible with the IntelSX2 and IntelDX2 OverDrive processors. Support for the future 3.3V Pentium OverDrive processor upgrade for IntelDX4 processor-based systems is also specified.

The Pentium OverDrive processor implements a superset of Intel486 processor signals. The new signals for the Pentium OverDrive processor socket, in addition to Intel486 processor signals, support a write-back protocol for the on-chip cache in the Write-Back Enhanced IntelDX2 processor. Implementation of the cache write-back capability for the OverDrive processor socket is optional, although implementation of the on-chip write-back protocol enables maximum performance gain. For more information, please contact Intel.

As a new system architecture feature, the OverDrive processor socket gives PC users a means to take advantage of the ever more rapid advances in software and hardware technology, which helps to maintain the competitiveness of Intel PC-compatible systems over other architectures.

The majority of upgrade installations which take advantage of the OverDrive processor socket will be performed by end users and resellers. Therefore, it is important that the design minimize the amount of training and technical expertise required to install the OverDrive processors. Upgrade installation instructions should be clearly described in the system user's manual. In addition, by making installation simple and foolproof, PC manufacturers can reduce the risk of system damage, warranty claims and service calls. Feedback from Intel's math coprocessor upgrade customers highlights three main characteristics of designs that are easy for end users: accessible OverDrive processor socket location, clear indication of upgrade component orientation, and minimization of insertion force.
OverDrive Socket Location

The OverDrive processor socket can be located on either the motherboard or modular processor card. The OverDrive processor socket should be easily accessible for installation and readily visible when the PC case is removed. The OverDrive processor socket should not be located in a position that requires removal of any other hardware (such as hard disk drives) in order to install the OverDrive processor.

Component Orientation

The most common mistake made by end users and resellers when installing Math Coprocessor upgrades is incorrect orientation of the chip. This can result in irreversible damage to the chip. To solve this problem, Intel has designed OverDrive processor sockets and OverDrive processors with keying mechanisms to ensure that upgrade components fit in the right socket, with the right orientation.

There are two OverDrive processor sockets, presented in this chapter, that can accept the OverDrive processor for Intel486 processor-based systems, designated as socket 3 and socket 6. The two sockets and the keying mechanism are illustrated in Figure 16-1.

Socket 3 is a 237-pin socket and accepts both 5V and 3.3V OverDrive processors. The keying mechanism consists of one Key Pin (A1) and four missing pins (B1, C1, A2 and A3). To be effective as a keying mechanism, the locations in the socket corresponding to the four missing pins must be plugged.
This socket is designed to be backwards compatible with the 169-pin IntelSX2 and IntelDX2 OverDrive processors. In order to maintain compatibility, socket 3 includes the Key Pin at location E5.

Figure 16-1. OverDrive™ Processor Sockets for Intel486™ Processor-Based Systems
(Left: Socket 3, 3V/5V. Right: OverDrive processor socket, 3V-only.)

Socket 6 is a 235-pin socket and accepts 3.3V OverDrive processors only. The keying mechanism consists of one Key Pin (A1) and five missing pins (B1, C1, A2, A3 and A19). Being designed for 3.3V OverDrive processors only, socket 6 must not accept the 169-pin IntelSX2 and IntelDX2 OverDrive processors; therefore, the Key Pin E5 is missing. To be effective as a keying mechanism, the locations in the socket corresponding to the six missing pins (the five listed above plus E5) must be plugged.

In addition, the location of the pin 1 corner should be clearly marked on the motherboard, for example by silk screening.

Insertion Force

The third major concern voiced by end users refers to how much pressure should be exerted on the upgrade chip and PC board for proper installation without damage. This becomes even more of a concern with the larger components, which require up to 200 pounds of pressure for insertion into a standard screw machine socket. This level of pressure can easily result in cracked traces and stress to solder joints.
To minimize the risk of system damage, it is recommended that a Zero Insertion Force (ZIF) socket be used for the OverDrive processor socket. Designing with a ZIF socket eliminates the need to design in additional structural support to prevent flexing of the PC board during installation, and results in improved end user and reseller product satisfaction due to easy "drop-in" installation.

If a Low Insertion Force (LIF) socket is used, sufficient support should be provided directly under the OverDrive processor socket. This will minimize the possibility of damage to the motherboard during insertion of the OverDrive processor.

16.1 OverDrive Processor Socket Overview

The circuit design requirements for the OverDrive processor socket are discussed in section 16.2, "OverDrive Processor Circuit Design." In addition to the OverDrive processor socket circuits, there are layout considerations for the OverDrive processor socket and processor spatial requirements. These issues are discussed in section 16.3, "Socket Layout."

Because the OverDrive processor must function in the OverDrive processor socket, the OverDrive processor socket heat dissipation specifications must be implemented. Section 16.4, "Thermal Design Considerations," discusses the thermal considerations in the system design.

Because the system must operate correctly with any OverDrive processor without a BIOS change, BIOS and software restrictions and recommendations are provided in section 16.5, "BIOS and Software." Section 16.6, "Test Requirements," discusses OverDrive processor socket test requirements.

Sections 16.7, "OverDrive Processor Socket Pinout," and 16.8, "3.3V Socket Specification," specify the pinouts for the 5V and 3.3V OverDrive processor sockets, respectively. Finally, section 16.9, "DC/AC Specifications," specifies the electrical characteristics of the OverDrive processor socket.
16.2 OverDrive Processor Circuit Design

The Intel486 processors in the 168-pin PGA package fit in the three inner rows of the 237-pin OverDrive socket. The corresponding pins of the Intel486 processor and the OverDrive socket define identical signals, with the following exceptions:

• The pins corresponding to the signals TCK, TDI and TDO on Intel486 processors are defined as Internal No Connect (INC) on the OverDrive processor socket, and the pin corresponding to TMS is defined as UP# (pin C15 of the OverDrive processor socket). For compatibility, the system should not do boundary scan testing when the OverDrive processor is installed.
• On Intel486 processors, the pin C14 defines FERR# and the pin A13 is INC, while on the OverDrive processor socket the pin D15 is INC and the pin B14 defines FERR#.
• On Intel486 processors the pin C10 defines SRESET, while on the OverDrive processor socket the corresponding pin D11 is INC.

In a system with a single socket motherboard design, the processor and the OverDrive processor share the same socket. The Intel486 processor occupies the three innermost rows of pins of the 237-pin OverDrive processor socket, with the "pin 1" (notched corner) oriented towards the "pin 1" corner of the socket. The processor will be replaced with the OverDrive processor when the system is upgraded.

In a jumperless, plug-and-play, upgradable system, the pins D15 and B14 of the OverDrive processor socket should be connected together to the FERR# signal line, and the pins F19 (INIT) and D11 should be connected together to SRESET. When the OverDrive processor is installed, the pin C15 (UP#) must be isolated from TMS (see Figure 16-2).

Figure 16-2.
OverDrive™ Processor Socket Circuit Diagram-Single Socket Design

16.2.1 BACKWARD COMPATIBILITY

The Pentium OverDrive processor socket for Intel486 processor-based systems is designed to be compatible with the other OverDrive processors. The Pentium OverDrive processor socket has a fourth row of contacts around the outside of the 169 contacts defined for the IntelSX2 and IntelDX2 OverDrive processors. The three inner rows of the Pentium OverDrive processor socket, with the inner key pin, are 100% compatible with the 169-pin PGA OverDrive processors. For backward compatibility, the inner row key pin location (E5) must be included in the Pentium OverDrive processor socket.

16.3 Socket Layout

This section discusses three aspects of the OverDrive processor socket: size, upgradability, and vendors.

16.3.1 MECHANICAL DESIGN CONSIDERATIONS

The Pentium OverDrive processor for Intel486 processor-based systems is designed to fit in a standard 240-lead (19 x 19) PGA socket with four corner pins removed. The Pentium OverDrive processor uses a fan/heatsink and therefore requires vertical clearance to allow adequate air circulation.

16.3.1.1 Fan Heatsink Design

The maximum and minimum dimensions of the Pentium OverDrive processor package with a fan/heatsink are shown in Table 16-1. The fan/heatsink unit space requirement is divided into the size of the actual heatsink and the required free space above the heatsink. The total height required for the Pentium OverDrive processor from the motherboard will depend on the height of the PGA socket. The total external height given in Table 16-1 is only measured from the PGA pin stand-offs. Table 16-1 also details the minimum clearance needed around all four sides of the PGA package.

Since the Pentium OverDrive processor dissipates more power than the Intel486 processor family members, it requires a larger cooling capacity.
To facilitate the task of cooling the Pentium OverDrive processor, Intel will ship the product with a fan/heatsink. No external connections (i.e., power) will be required for the fan/heatsink. All the needed connections will be made through the pins on the outer row of the processor.

16.3.1.2 Passive Heatsink Design

The space required for the passive heatsink OverDrive processors is less than the space requirements shown in Table 16-1. The passive heatsink OverDrive processors include the IntelSX2 and IntelDX2 OverDrive processors.

16.3.2 DESIGN RECOMMENDATIONS

PC buyers value easy and safe upgrade installation. PC manufacturers can make upgrade component installation in the OverDrive processor socket simple and foolproof for the end user and reseller by implementing the suggestions listed in Table 16-2.

Table 16-1. Pentium™ OverDrive™ Processor, 236-Pin, PGA Package Dimensions with Fan/Heatsink Attached

                      Length and Width (inches)    Height (inches)
Component             Minimum      Maximum         Minimum    Maximum
PGA Package           1.950        1.975           0.140      0.180
Adhesive              N/A          N/A             0.008      0.012
Heatsink Unit         1.830        1.850           N/A        N/A
Heatsink              N/A          N/A             0.790      0.810
Req'd Free Space      N/A          N/A             0.400      0.400
External Total        1.950        1.975           1.338      1.402
Space From Package    0.200        0.200           N/A        N/A

Table 16-2. Socket and Layout Considerations

Design Consideration: Implementation

Visible OverDrive™ processor socket: The OverDrive processor socket should be easily visible when the PC's cover is removed. Label the OverDrive processor socket and the location of pin 1 by silk screening this information on the PC board.

Accessible OverDrive processor socket: Make the OverDrive processor socket easily accessible to the end user (i.e., do not place the OverDrive processor socket under a disk drive). Be sure to leave enough clearance to open the Zero Insertion Force (ZIF) socket.

Foolproof Chip Orientation: This OverDrive processor socket must ensure proper orientation of the OverDrive processor.
The PGA package of the Pentium OverDrive processor for Intel486 processor-based systems is oriented by the four corner pins that have been removed and the "KEY" pin from the "pin 1" corner. The four contacts (A2, A3, B1 and C1) in the socket should be plugged, such that PGA pins cannot be inserted, to assure correct orientation.

Zero Insertion Force OverDrive processor socket: The high pin count of the OverDrive processor makes the insertion force required for installation into a screw machine PGA socket excessive. Even most Low Insertion Force (LIF) sockets often require more than 60 lbs. of insertion force. A Zero Insertion Force (ZIF) socket ensures that the chip insertion force does not damage the PC board. Be sure to allow enough clearance for the ZIF socket handle. Do not use a LIF or screw machine socket.

"Plug and Play": Jumper or switch changes should not be needed to electrically configure the system for the OverDrive processor.

Thorough Documentation: Describe the OverDrive processor socket and the OverDrive processor installation procedure in the PC's User's Manual.

Figure 16-3. Pentium™ OverDrive™ Processor PGA Package with Heatsink Attached
(Figure labels: required airspace 0.40"; width 1.963"; OverDrive processor active fan/heatsink unit; OverDrive processor PGA package.)

16.3.3 ZIF SOCKET VENDORS

Intel has a list of qualified ZIF socket vendors. Contact Intel for more information.

16.4 Thermal Design Considerations

16.4.1 ACTIVE HEATSINK THERMAL DESIGN

The Pentium OverDrive processor for Intel486 processor-based systems will have an active fan/heatsink for thermal dissipation. The temperature of the air entering the fan/heatsink must not exceed 55°C under the worst case operating conditions specified for the system. The fan/heatsink reduces the need for high airflow. However, the system must provide adequate ventilation to prevent localized heating above the specified ambient.
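
Adequate ventilation also depends on leaving the vertical envelope of Table 16-1 free above the socket. As a cross-check, the following Python sketch adds the individual height contributions from Table 16-1 and reproduces the "External Total" row. The helper `fits_under` and the socket height and chassis clearance passed to it are hypothetical illustrations, not values from this document; Table 16-1 measures height from the PGA pin stand-offs, so the socket height must be added by the system designer.

```python
# Height contributions from Table 16-1, in inches, as (minimum, maximum).
# Socket height is not included: Table 16-1 measures from the PGA pin stand-offs.
HEIGHTS_IN = {
    "pga_package": (0.140, 0.180),
    "adhesive": (0.008, 0.012),
    "heatsink": (0.790, 0.810),
    "required_free_space": (0.400, 0.400),
}


def total_external_height(heights=HEIGHTS_IN):
    """Return the (minimum, maximum) stacked height in inches."""
    lowest = sum(pair[0] for pair in heights.values())
    highest = sum(pair[1] for pair in heights.values())
    return round(lowest, 3), round(highest, 3)


def fits_under(clearance_in, socket_height_in):
    """Hypothetical check: does the tallest stack-up fit below an obstruction?

    clearance_in is the distance from the motherboard to the obstruction and
    socket_height_in is the seating height of the chosen socket; both are
    assumptions supplied by the system designer.
    """
    _, highest = total_external_height()
    return socket_height_in + highest <= clearance_in


if __name__ == "__main__":
    print(total_external_height())
    print(fits_under(clearance_in=1.7, socket_height_in=0.2))
```

The computed totals match the 1.338 inch minimum and 1.402 inch maximum listed in the table, which confirms that the required free space of 0.400 inch is already part of the external total.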
16.4.2 PASSIVE HEATSINK THERMAL DESIGN

Passive heatsinks are used on the IntelSX2 and IntelDX2 OverDrive processors. The thermal efficiency of passive heatsink processors is dependent on system airflow. Please refer to the individual OverDrive processor Data Sheet for airflow requirements.

16.5 BIOS and Software

The following should be considered when designing the OverDrive processor socket for an Intel486 processor-based system.

16.5.1 OverDrive PROCESSOR DETECTION

The component identifier and stepping/revision identifier for the OverDrive processor are readable in the DH and DL registers respectively, immediately after RESET. See Table 16-3. These values can also be obtained using the CPUID instruction. As with the Intel486 processor specification, it is recommended that the BIOS save the contents of the DX register, immediately after RESET, so that this information can be used later, if required.

Table 16-3. DX Register Contents after Reset

OverDrive™ Processor                           Component ID (DH)    Stepping ID (DL)
Future Pentium™ OverDrive processor (3.3V)     15h                  xxh
Pentium OverDrive processor (5V)               15h                  3xh
IntelDX2™ OverDrive processor                  04h                  3xh
IntelSX2™ OverDrive processor                  04h                  5xh

16.5.2 TIMING DEPENDENT LOOPS

The OverDrive processor for Intel486 processor-based systems executes instructions at a multiple of the frequency of the input clock. This OverDrive processor also will use advanced design techniques to decrease the number of clocks per instruction (cpi) from that of Intel486 processors. Thus software, such as instruction-based timing loops, will execute faster on the OverDrive processor than on the Intel486 processor at the same input clock frequency. Instructions such as NOP, LOOP, and JMP $+2 are frequently used by the BIOS to implement timing loops that are required, for example, to enforce recovery time between consecutive accesses for I/O devices.
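
Because loop timing depends on which processor is installed, software that cares about it may first want to know what is in the socket. The Python sketch below decodes a DX value saved at RESET according to Table 16-3. It is an illustration only: a real BIOS would perform this test in assembly, the function names are hypothetical, and treating every 15h value whose stepping is not 3xh as the future 3.3V part is an assumption, since the table lists that stepping only as xxh.

```python
def decode_dx(dx):
    """Split a 16-bit DX reset value into (component_id, stepping_id)."""
    component_id = (dx >> 8) & 0xFF  # DH
    stepping_id = dx & 0xFF          # DL
    return component_id, stepping_id


def identify_overdrive(dx):
    """Map a saved DX value to a Table 16-3 entry, or None if not listed."""
    component_id, stepping_id = decode_dx(dx)
    stepping_family = stepping_id >> 4  # the digit written before 'x' in the table

    if component_id == 0x15:
        if stepping_family == 0x3:
            return "Pentium OverDrive processor (5V)"
        # Assumption: any other stepping with component ID 15h is the future part.
        return "Future Pentium OverDrive processor (3.3V)"
    if component_id == 0x04:
        if stepping_family == 0x3:
            return "IntelDX2 OverDrive processor"
        if stepping_family == 0x5:
            return "IntelSX2 OverDrive processor"
    return None  # not one of the OverDrive processors in Table 16-3


if __name__ == "__main__":
    for value in (0x1531, 0x0433, 0x0452, 0x0300):
        print(f"{value:04X}h -> {identify_overdrive(value)}")
```

Returning `None` for unlisted values keeps the sketch from guessing about processors that Table 16-3 does not cover.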
These instruction-based, timing-loop implementations may require modification to be compatible with this OverDrive processor socket.

In order to avoid any incompatibilities, timing loops can be implemented in hardware rather than in software. This provides transparency and also does not require any change in BIOS or I/O device drivers in the future when moving to higher processor clock speeds. As an example, a timing loop may be implemented as follows: the software performs a dummy I/O instruction to an unused I/O port. The hardware for the bus controller logic recognizes this I/O instruction and delays the termination of the I/O cycle by keeping RDY# or BRDY# deasserted for the appropriate amount of time.

16.6 Test Requirements

The electrical functionality of the OverDrive processor socket can be verified by fully testing the PC with a populated OverDrive processor socket. Intel recommends that the system be tested with all available OverDrive processors that are compatible with the OverDrive socket type and power supply voltage, to ensure that there are no BIOS issues. The BIOS requirements to maintain compatibility with all OverDrive processors are discussed in section 16.5, "BIOS and Software," of this document. All OverDrive processors undergo thorough application software compatibility testing prior to their introduction.

16.7 OverDrive Processor Socket Pinout

Socket 3 can accept all 5V OverDrive processors and the future 3.3V Pentium OverDrive processor for the IntelDX4 processor. The socket 3 pinout is shown in Figure 16-4 and Figure 16-5. Socket 6 is discussed in section 16.8, "3.3V Socket Specification."

16.7.1 PIN DESCRIPTION

The signal pin descriptions for the OverDrive processor are identical to the pin descriptions for the Intel486 processor except for those shown in Table 16-5.

16.7.2 RESERVED PIN SPECIFICATION

Many pins in the OverDrive processor socket are defined as reserved (RES).
The function of these pins is documented separately. Some of these pins will be used to implement an on-chip write-back cache protocol, and the remaining pins will provide other OverDrive specific functions. These pins must not be connected unless they are used to implement these functions, as documented in the information available separately. To ensure proper operation, pins marked as NC must be left unconnected as well. For more information contact Intel. Section 16.7.4, "Shared Write-Back Pins," discusses the Pentium OverDrive processor compatibility with the Write-Back Enhanced IntelDX2 processor.

16.7.3 INC "Internal No Connect" PIN SPECIFICATIONS

INC pins are defined as Internal No-Connects. This means that the pin is not connected to the processor internally. Since the pin is inert and floating, it may be used in any manner seen fit, but must meet the INC pin specifications. In general, all INC pins have an intended use to implement a single processor socket system design. The INC pin will never be used for any other function. The 6 signals which use the INC pins to simplify single-socket board design are shown in Table 16-6.

16.7.4 SHARED WRITE-BACK PINS

There are several signals that the Pentium OverDrive processor socket has in common with the Write-Back Enhanced IntelDX2 processor, but which are located on different pins. An example of this would be the HITM# signal. On the Pentium OverDrive processor socket, it is located in the outer row of pins, while on the Write-Back Enhanced IntelDX2 processor, it is located on one of the inner rows. Single socket designs require that these signals be tied together so that the use of a jumper to reroute the signal is unnecessary. This is done through the use of the INC pin.

Figure 16-6 shows an example of how the INC pins shown in Table 16-6 should be connected together to allow single socket compatibility between the Write-Back Enhanced IntelDX2 processor and the OverDrive processor socket.
The figure is provided as an example only and is not intended to be a guide for how the signals should actually be routed on a motherboard.

16.7.5 PINOUT

(Pin grid diagram, columns A through U by rows 1 through 19, not reproduced here. The pin assignments are listed in Table 16-4.)

NOTE: All NC and RES pins must remain unconnected.

Figure 16-4.
OverDrive™ Processor Socket 3 Pinout (Top Side View)

(Pin grid diagram, columns A through U by rows 1 through 19, not reproduced here. The pin assignments are listed in Table 16-4.)

NOTE: All NC and RES pins must remain unconnected.

Figure 16-5. OverDrive™ Processor Socket 3 Pinout (Pin Side View)

Table 16-4.
OverDrive™ Processor Socket Pin Cross Reference

Address (signal pin):
A2 R15, A3 S16, A4 T17, A5 R13, A6 T16, A7 R14, A8 S14, A9 R12, A10 T14, A11 S13,
A12 T8, A13 R11, A14 T6, A15 S8, A16 R10, A17 R4, A18 S6, A19 R5, A20 R9, A21 R6,
A22 R8, A23 T4, A24 R7, A25 S3, A26 T3, A27 T2, A28 S2, A29 Q3, A30 Q4, A31 R2

Data (signal pin):
D0 Q2, D1 P3, D2 P2, D3 J3, D4 N4, D5 K3, D6 M3, D7 M4, D8 G3, D9 E2, D10 F4,
D11 D2, D12 H4, D13 E3, D14 L4, D15 G4, D16 K4, D17 E4, D18 D3, D19 C2, D20 B2,
D21 C3, D22 B3, D23 B5, D24 B7, D25 C7, D26 D8, D27 D7, D28 D9, D29 B9, D30 D10,
D31 C9

Control (signal pin):
A20M# E16, ADS# T18, AHOLD B18, BE0# L16, BE1# K17, BE2# K16, BE3# G18, BLAST# S17,
BOFF# E18, BRDY# J16, BREQ R16, BS8# E17, BS16# D18, CLK D4, CLKMUL(2) S18, D/C# N16,
DP0 P4, DP1 G2, DP2 J4, DP3 B6, EADS# C18, FERR# B14, FLUSH# D16, HLDA Q16, HOLD F16,
IGNNE# B16, INIT F19, INTR B17, KEN# G16, LOCK# P16, M/IO# P17, NMI C16,
PCD K18, PCHK# R18, PLOCK# R17, PWT M16, RDY# G17, RESET D17, SMI# C11, SMIACT# D13,
STPCLK# H16, UP# C15, VOLDET(2) T5, W/R# P18

RES(1):
A6, A7, A18, A19, B4, B15, B19, C17, F1, G1, N1, P1, P19, T1, T19, U1, U2, U18, U19

N/C(1):
A19, D14

INC:
B11, B13, C13, C14, D11, D12, D15

Position:
KEY E5, KEY A1
PLUG A2, A3, B1, C1, E6, E14, E15, F5, F15, P5, P15, Q5, Q6, Q14, Q15

VCC:
A4, A9, A10, A11, A16, C8, C10, C12, D1, D5, D6, D19, F3, F17, H3, H17, J17, J19, K19,
L3, L17, L19, M17, N3, N17, Q17, R1, R19, S4, S7, S9, S10, S11, S12, S15, U4, U9, U10,
U11, U16

VCC5P(2):
J1, K1, L1

VCC5:
K2

VSS:
A5, A8, A12, A13, A14, A15, A17, B8, B10, B12, C4, C5, C6, C19, E1, E19, F2, F18, G19,
H1, H2, H18, H19, J2, J18, L2, L18, M1, M2, M18, M19, N2, N18, N19, Q1, Q18, Q19, R3,
S1, S5, S19, T7, T9, T10, T11, T12, T13, T15, U3, U5, U6, U7, U8, U12, U13, U14, U15, U17

NOTES:
1. All RES pins are reserved for later use by Intel. To ensure proper operation of the microprocessor, all RES and N/C pins should be left unconnected. Please contact Intel for design information.
2. These pins are valid only for the future Pentium™ OverDrive processor (3.3V).

Table 16-5.
OverDrive™ Processor Socket Pin Description

Symbol (Type): Name and Function

Intel486™ PROCESSOR INTERFACE

UP# (O): The Upgrade Present pin is used to signal the Intel486 processor to float its outputs and stop driving the bus in a dual socket system design. It is active low and is never floated. UP# is driven low at power-up and remains active for the entire duration of the OverDrive processor operation.

OverDrive PROCESSOR INTERFACE

VCC5P (I): The VCC5P pin supplies power to the OverDrive processor's fan/heatsink and should be connected to +5V ± 10% regardless of the system design. Failure to connect VCC5P to 5V will cause the component to overheat.

INIT (I): The INIT input will force the OverDrive processor to begin execution in a known state. The processor state after INIT is the same as the state after RESET, except that the internal caches, floating point register and SMM base register retain whatever values they had prior to INIT. INIT may not be used in lieu of RESET after power-up.

KEY PIN

KEY: The Key pin is an electrically non-functional pin which ensures correct orientation for the OverDrive processor. Socket plugs are also used to ensure correct orientation.

Table 16-6. Single Socket Compatibility Signals

            Write-Back Enhanced      Pentium™ OverDrive™      Pentium OverDrive
            IntelDX2™ Processor      Processor Socket         Processor Socket
Signal      Signal Pin               Signal Pin               INC Pins
INV         A10                      N1                       B11
HITM#       A12                      U1                       B13
CACHE#      B12                      G1                       C13
WB/WT#      B13                      T1                       C14
INIT        C10                      F19                      D11
FERR#       C14                      B14                      D15

Figure 16-6. Sample Routing of INC Pins
(Left: 168-Pin Write-Back Enhanced IntelDX2 signal layout. Right: 237-Pin single socket design using INC pins. Signals shown: SRESET, SRESET/INIT, FERR#, CACHE#, INV, HITM#.)

16.8 3.3V Socket Specification

Socket 6 is a 235-pin socket and accepts 3.3V OverDrive processors only.
The keying mechanism consists of a Key pin (A1) and five missing pins (B1, C1, A2, A3, and A19). Since it is designed for 3.3V OverDrive processors only, socket 6 must not accept the 169-pin OverDrive processors; therefore, the key pin, E5, is missing. To be effective as a keying mechanism, the locations in the socket corresponding to the six missing pins (the five listed above plus E5) must be plugged. In addition, the location of the pin 1 corner should be clearly marked on the motherboard.

Systems designed with the OverDrive processor socket 3 could use the future Pentium OverDrive processor for IntelDX4 processor-based systems, as well as the Pentium OverDrive processor. However, the OverDrive processor for IntelDX4 processor-based systems requires a 3.3V supply, while the Pentium OverDrive processor requires a 5V supply. Therefore, the supply voltage to socket 3 (VCC) must be 3.3V when it is used with the future Pentium OverDrive processor for IntelDX4 processor-based systems, and 5V when it is used with the Pentium OverDrive processor. However, the fan/heatsink does require a 5V power supply, VCC5P, as specified in Tables 16-4 and 16-5. To ensure adequate air circulation, the additional clearance specified in Table 16-1 must be provided.

16.9 DC/AC Specifications

The electrical specifications in this section represent the electrical interface of the OverDrive processor for Intel486 processor-based systems. The OverDrive processor will be compatible with the maximum ratings and AC Specifications of Intel486 processors. Tables 16-7 and 16-8 provide the unique DC Operating Conditions for the OverDrive processors.

Table 16-7. Pentium™ OverDrive™ Processor Socket (5V) DC Parametric Values
Functional Operating Range: VCC = 5V ± 5%; TSINK = 0°C to +85°C.
Symbol    Parameter                     Min    Max     Unit    Notes
ICC       Power Supply Current
            CLK = 25 MHz                       1900    mA
            CLK = 33 MHz                       2500    mA
CIN       Input Capacitance                    13      pF
CO        I/O or Output Capacitance            17      pF
CCLK      CLK Capacitance                      15      pF

Table 16-8. Future Pentium™ OverDrive™ Processor Socket (3.3V) DC Parametric Values
Functional Operating Range: VCC = 3.3V ± 0.3V; TSINK = 0°C to +85°C.

Symbol    Parameter                     Min    Max     Unit    Notes
ICC       Power Supply Current                 3000    mA
ICC5      Reference Supply Current             100     µA      (Note 1)
ICC5P     Fan/Heatsink Supply Current          200     mA      1 watt @ 5V
CIN       Input Capacitance                    13      pF
CO        I/O or Output Capacitance            17      pF
CCLK      CLK Capacitance                      15      pF

NOTES:
1. To avoid damaging the OverDrive processor when the 3.3V power supply is accidentally connected to VSS, the system must limit this current to less than 55 mA.

17.0 ELECTRICAL DATA

The following sections describe recommended electrical connections and electrical specifications for the Intel486 processor.

17.1 Power and Grounding

17.1.1 POWER CONNECTIONS

The Intel486 processor is implemented in CHMOS technology and has modest power requirements. However, the high clock frequency output buffers can cause power surges as multiple output buffers drive new signal levels simultaneously. For clean on-chip power distribution at high frequency, multiple VCC and VSS pins feed the Intel486 processor.

Power and ground connections must be made to all external VCC and GND pins of the Intel486 processor. On the circuit board, all VCC pins must be connected on a VCC plane. All VSS pins must be likewise connected on a GND plane.

17.1.2 INTEL486 PROCESSOR POWER DECOUPLING RECOMMENDATIONS

Liberal decoupling capacitance should be placed near the Intel486 processor. The Intel486 processor, driving its 32-bit parallel address and data buses at high frequencies, can cause transient power surges, particularly when driving large capacitive loads. The recommendation for the Intel486 processor is 9 x 0.01 µF and 9 x 0.1 µF capacitors.
Low inductance capacitors (i.e., surface-mount capacitors) and interconnects are recommended for the best high-frequency electrical performance. Inductance can be reduced by connecting capacitors directly to the VCC and VSS planes, with minimal trace length between the component pads and vias to the plane. These capacitors should be evenly distributed around each component on the VCC power plane.

The power consumption can transition from a low level of power to a much higher level (or high to low power) very rapidly. A typical example would be entering or exiting the Stop Grant state. Another example would be executing a HALT instruction, causing the Intel486 processor to enter the Auto HALT Power Down state, or transitioning from HALT to the Normal state. All of these examples may cause abrupt changes in the power being consumed by the Intel486 processor.

Bulk storage capacitors with a low ESR (Effective Series Resistance) in the 10 to 100 microfarad range are required to maintain a regulated supply voltage during the interval between the time the current load changes and the point that the regulated power supply output can react to the change in load. In order to reduce the ESR, it may be necessary to place several bulk storage capacitors in parallel. These capacitors should be placed near the Intel486 processor (on the processor power plane) to ensure that the supply voltage stays within specified limits during changes in the supply current while in operation.

17.1.3 VCC5 AND VCC POWER SUPPLY REQUIREMENTS FOR THE INTELDX4 PROCESSOR

In mixed voltage systems that will be driving IntelDX4 processor inputs in excess of 3.3V, the VCC5 pin must be connected to the system 5V supply. In order to limit current flow into the VCC5 pin, there is a limit to the voltage differential between the VCC5 pin and the other VCC pins. The voltage differential between the VCC5 pin of the IntelDX4 processor and its 3.3V VCC pins should never exceed 2.25V.
The 2.25V limit applies to power up, power down and steady state operation. Table 17-1 outlines this requirement.

For the decoupling capacitors of section 17.1.2, capacitor values should be chosen to ensure they eliminate both low and high frequency noise components.

Table 17-1. Dual Power Supply Requirements for the IntelDX4™ Processor

Symbol    Parameter                Min    Max     Unit    Notes
VDIFF     VCC5 - VCC Difference           2.25    V       VCC5 input should not exceed VCC by more than 2.25V during power-up, power-down or during operation.

Meeting this requirement ensures proper operation of the IntelDX4 processor and guarantees that the current draw into the VCC5 pin will not exceed the ICC5 specification (see section 17.3.1, "DC Specifications").

If the voltage difference requirement cannot be met due to system design limitations, then an alternate solution may be employed. A series resistor of at least 100 Ω may be used to limit the current into the VCC5 pin. This resistor will ensure that current drawn by the VCC5 pin will not exceed the maximum rating of 55 mA for this pin (see section 17.2, "Maximum Ratings").

Figure 17-1. IntelDX4™ Processor VCC5 Current Limiting Resistor
(The 5V (±0.25V) supply feeds the VCC5 pin through a 100 Ω (±5%, 0.5W) series resistor.)

Note that this resistor is not necessary if the system can guarantee that the voltage difference between VCC5 and VCC is always limited to 2.25V, even during power up and power down.

In 3.3V-only systems and systems that will be driving all IntelDX4 processor inputs and I/Os from 3.3V logic, the VCC5 pin should be connected directly to the 3.3V VCC plane. This will guarantee the voltage difference specification is met and will eliminate the current draw into the VCC5 pin.

In a 3.3V-only system, the VCC5 pin may be connected to the 5V supply as described previously, as long as the voltage differential in Table 17-1 is met, and assuming the current drawn by the VCC5 pin is of little consequence to the system design.
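
The decision between meeting the VDIFF limit and adding the series resistor can be sketched numerically. The Python below is a minimal illustration, not a design procedure: the function names and the sampled supply voltages are hypothetical, and the current estimate is a deliberately pessimistic bound that assumes the whole supply voltage appears across the resistor and ignores resistor tolerance.

```python
VDIFF_MAX_V = 2.25            # Table 17-1 limit on VCC5 - VCC
VCC5_SINK_MAX_A = 0.055       # 55 mA maximum rating for the VCC5 pin


def vdiff_ok(vcc5, vcc):
    """True if VCC5 exceeds VCC by no more than the Table 17-1 limit."""
    return (vcc5 - vcc) <= VDIFF_MAX_V


def needs_series_resistor(samples):
    """samples: (vcc5, vcc) pairs taken during power-up, operation, power-down.

    Returns True if any sample violates the limit, in which case the
    series-resistor alternative described above applies.
    """
    return any(not vdiff_ok(vcc5, vcc) for vcc5, vcc in samples)


def bounding_vcc5_current(supply_max_v, resistor_ohms):
    """Pessimistic estimate of VCC5 pin current in amperes.

    Assumes the full supply voltage drops across the resistor; the real
    current is lower because part of the voltage drops inside the device.
    """
    return supply_max_v / resistor_ohms


if __name__ == "__main__":
    # Hypothetical ramp in which the 5V rail rises before the 3.3V rail.
    ramp = [(0.0, 0.0), (5.0, 0.5), (5.0, 3.3)]
    print(needs_series_resistor(ramp))
    print(bounding_vcc5_current(5.25, 100.0))
```

With the nominal 100 Ω value and the 5.25V upper supply limit from Figure 17-1, the bound stays below the 55 mA rating, which is consistent with the text's choice of 100 Ω as the minimum resistance.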
17.1.4 SYSTEM CLOCK RECOMMENDATIONS

It is recommended that the CLK input to the Intel486 processor not be driven until VCC has reached its normal operating level (either 3.3V or 5V). The CLK input may be grounded or allowed to ramp with VCC during this period. Once VCC has reached its normal operating level, the Intel486 processor can handle the clock frequency for which it is specified and the oscillator/clock driver should have locked onto its desired frequency.

17.1.5 OTHER CONNECTION RECOMMENDATIONS

NC pins should always remain unconnected. Connection of NC pins to VCC or VSS or to any other signal can result in component malfunction or incompatibility with other steppings of the Intel486 processor family.

INC (Internal No Connect) pins are not connected to any internal pad in Intel486 and OverDrive™ processors. However, new signals are defined for the location of the INC pins in the Intel486 processor proliferations. All INC pins defined by Intel have a specific use for jumperless single socket compatibility with current and future processors. A system design could connect any signal to an INC pin without affecting the operation of the processor. However, the purpose of a specific INC pin should be understood before it is used. If not, the system design will sacrifice the ability to implement a jumperless (single socket) flexible motherboard.

For reliable operation, always connect unused inputs to an appropriate signal level. Active LOW inputs should be connected to VCC through a pull-up resistor. Pull-ups in the range of 20 kΩ are recommended. Active HIGH inputs should be connected to GND.

17.2 Maximum Ratings

Table 17-2 is a stress rating only, and functional operation at the maximums is not guaranteed.
Functional operating conditions are given in Table 17-3 for 3.3V processor DC Specifications, Table 17-9 for 5V DC Specifications, Tables 17-17 through 17-20 for 3.3V processor AC specifications, and Tables 17-23 through 17-25 for 5V processor AC specifications. Extended exposure to the Maximum Ratings may affect device reliability. Furthermore, although the Intel486 processor contains protective circuitry to resist damage from static electric discharge, always take precautions to avoid high static voltages or electric fields.

Table 17-2. Absolute Maximum Ratings

Case Temperature under Bias                      -65°C to +110°C
Storage Temperature                              -65°C to +150°C
DC Voltage on Any Pin with Respect to Ground     -0.5 to VCC + 0.5V
                                                 -0.5 to VCC5 + 0.5V (1)
Supply Voltage with Respect to VSS
    VCC                                          -0.5V to +6.5V (2)
    VCC                                          -0.5V to +4.6V (1)
    VCC5 (1)                                     -0.5V to +6.5V (1)
Transient Voltage on Any Input                   -1.6V to VCC5 + 1.6V (1, 3)
Maximum Allowable Current Sink on VCC5 (1)       55 mA

NOTES:
1. For IntelDX4™ processor only.
2. All Intel486™ processors except IntelDX4 processor.
3. Maximum voltage on any pin with respect to ground is the lesser of VCC5 + 1.6V or 6.5V for the IntelDX4 processor.

17.3 DC Specifications

17.3.1 3.3V DC CHARACTERISTICS

Table 17-3 is for Intel486 SX, Intel486 DX, IntelDX2™, Write-Back Enhanced IntelDX2, and IntelDX4 processors.

Functional operating range: Vcc Symbol = Table 17-3.
3.3V DC Specifications 3.3V ±0.3V; VCC5 = 5V ±0.25V (Note 17); TCASE Parameter Min Typ = O°C to +85°C Max Unit VIL Input LOW Voltage -0.3 +0.8 V VIH Input HIGH Voltage 2.0 2.0 Vcc+0.3 VCC5+ 0.3 V VIHC Input HIGH Voltage of CLK, CLK2 VCC-0.6 VCC+0.3 V VOL Output LOW Voltage IOL = 2.0 rnA IOL = 100 /LA 0.40 0.20 0.45 V V V VOH Output HIGH Voltage IOH = -2.0 rnA IOH = -100/LA 2.4 Vcc-0.2 Notes 1 10 7 V V 16 ICC5 VCC5 Leakage Current- 15 300 /LA 8,9 Iccu UP# Active Supply Current 15 35 50 rnA rnA 2 2,10 III Input Leakage Current ±15 /LA 3 IIH Input Leakage Current 200 300 /LA /LA 4 15 IlL Input Leakage Current -400 /LA 5 ILO Output Leakage Current ±15 /LA CIN Input Capacitance 10 pF 6 COUT Output or 1/0 Capacitance 10 14 pF pF 6 6,10 CCLK CLK Capacitance 6 12 pF pF 6 6,10 ISHL Bus Hold Low Sustaining Current 17 /LA 11 ISHH Bus Hold High Sustaining Current -20 /LA 12 ISHLO Bus Hold Low OverDrive Current 210 /LA 13 ISHHO Bus Hold High OverDrive Current -350 /LA 14 I 2-315 Intel486™ PROCESSOR FAMIL V NOTES: 1. All inputs except ClK, ClK2. (For all Intel486 processors except the IntelDX4 processor.) 2. When the processor is in Stop Grant state, the Iccu of the host processor is less than 2 mA. S. This parameter is for inputs without internal pull-ups or pull downs and OV s Y,N S Vcc. 4. This parameter is for inputs with internal pull-downs and V,H = 2.4V. 5. This parameter is for inputs with internal pull-ups and V,L = O.4V. 6. Fc = 1 MHz; Not 100% tested. 7. For the IntelDX4 processor, this parameter is measured at: Address, Data, BEn = 4.0 mA Definition, Control = 5.0 mA 8. Typical values are not 100% tested. 9. This parameter is for Vccs-Vcc S 2.25V. (Inte1DX4 processor only.) 10. For the IntelDX4 processor only. 11. This is the maximum current the bus hold circuit can sink without raising the node above V'Lmax. IBHLShould be measured after lowering Y,N to ground and then raising to V,Lmax. (V,N = 0.8V). (Write-Back Enhanced IntelDX2 processor only.) 12. 
This is the maximum current the bus hold circuit can source without lowering the node voltage below VIHmin. IBHH should be measured after raising VIN to VCC (3.3V) and then lowering to VIHmin (VIN = 2.0V). (Write-Back Enhanced IntelDX2 processor only.)
13. An external driver must source at least IBHLO to switch this node from low to high (VIN ≥ 1.3V). (Write-Back Enhanced IntelDX2 processor only.)
14. An external driver must source at least IBHHO to switch this node from high to low (VIN ≤ 1.3V). (Write-Back Enhanced IntelDX2 processor only.)
15. This parameter is for inputs with pull-downs and VIH = 2.4V. (Write-Back Enhanced IntelDX2 processor only.)
16. All Intel486 processors except the IntelDX4 processor.
17. VCC5 should be connected to 3.3V ±0.3V in 3.3V-only systems. (IntelDX4 processor only.)

Table 17-4. 3.3V ICC Values for Intel486™ SX Processor
Functional Operating Range: VCC = 3.3V ±0.3V; TCASE = 0°C to +85°C

Parameter                     Operating Frequency   Typ       Maximum   Notes
ICC Active (Power Supply)     25 MHz                          315 mA    1
                              33 MHz                          415 mA
ICC Active (Thermal Design)   25 MHz                220 mA    292 mA    2, 3, 4
                              33 MHz                289 mA    356 mA
ICC Stop Grant                25 MHz                20 mA     40 mA     5
                              33 MHz                25 mA     50 mA
ICC Stop Clock                0 MHz                 100 µA    1 mA      6

Table 17-5. 3.3V ICC Values for Intel486™ DX Processor
Functional Operating Range: VCC = 3.3V ±0.3V; TCASE = 0°C to +85°C

Parameter                     Operating Frequency   Typ       Maximum   Notes
ICC Active (Power Supply)     33 MHz                          415 mA    1
ICC Active (Thermal Design)   33 MHz                290 mA    383 mA    2, 3, 4
ICC Stop Grant                33 MHz                25 mA     50 mA     5
ICC Stop Clock                0 MHz                 100 µA    1 mA      6

Table 17-6. 3.3V ICC Values for IntelDX2™ Processor
Functional Operating Range: VCC = 3.3V ±0.3V; TCASE = 0°C to +85°C

Parameter                     Operating Frequency   Typ       Maximum   Notes
ICC Active (Power Supply)     40 MHz                          450 mA    1
                              50 MHz                          550 mA
ICC Active (Thermal Design)   40 MHz                318 mA    416 mA    2, 3, 4
                              50 MHz                395 mA    507 mA
ICC Stop Grant                40 MHz                20 mA     40 mA     5
                              50 MHz                23 mA     50 mA
ICC Stop Clock                0 MHz                 100 µA    1 mA      6

Table 17-7.
3.3V ICC Values for Write-Back Enhanced IntelDX2™ Processor
Functional Operating Range: VCC = 3.3V ±0.3V; TCASE = 0°C to +85°C

Parameter                     Operating Frequency   Typ       Maximum   Notes
ICC Active (Power Supply)     40 MHz                          515 mA    1
                              50 MHz                          630 mA
ICC Active (Thermal Design)   40 MHz                309 mA    475 mA    2, 3, 4
                              50 MHz                384 mA    581 mA
ICC Stop Grant                40 MHz                20 mA     40 mA     5
                              50 MHz                23 mA     50 mA
ICC Stop Clock                0 MHz                 100 µA    1 mA      6

Table 17-8. 3.3V ICC Values for IntelDX4™ Processor
Functional Operating Range: VCC = 3.3V ±0.3V; VCC5 = 5V ±0.25V (Note 7); TCASE = 0°C to +85°C

Parameter                     Operating Frequency   Typ       Maximum   Notes
ICC Active (Power Supply)     100 MHz                         1450 mA   1
                              75 MHz                          1100 mA
ICC Active (Thermal Design)   100 MHz               1075 mA   1300 mA   2, 3, 4
                              75 MHz                825 mA    975 mA
ICC Stop Grant                100 MHz               50 mA     100 mA    5
                              75 MHz                20 mA     75 mA
ICC Stop Clock                0 MHz                 600 µA    1 mA      6

NOTES FOR TABLES 17-4 THROUGH 17-8:
1. This parameter is for proper power supply selection. It is measured using the worst case instruction mix at VCC = 3.6V. In order to support the OverDrive™ processor, care should be taken to accommodate the Maximum Power Supply Current Value in section 16, "OverDrive Processor Socket."
2. The maximum current column is for thermal design power dissipation. It is measured using the worst case instruction mix at VCC = 3.3V.
3. The typical current column is the typical operating current in a system. This value is measured in a system using a typical device at VCC = 3.3V, running Microsoft Windows 3.1 at an idle condition. This typical value is dependent upon the specific system configuration.
4. Typical values are not 100% tested.
5. The ICC Stop Grant specification refers to the ICC value once the Intel486 processor enters the Stop Grant or Auto HALT Power Down state.
6. The ICC Stop Clock specification refers to the ICC value once the processor enters the Stop Clock state.
The VIH and VIL levels must be equal to Vcc and OV, respectively, in order to meet the Icc Stop Clock specifications. 7. VCC5 should be connected to 3.3V ±0.3V in 3.3V·only systems. 2·318 I Intel486TM PROCESSOR FAMILY 17.3.2 5V DC CHARACTERISTICS Table 17-9 is for Intel486 SX, InteISX2™, Intel486 DX, IntelDX2 Processors, and Write-Back Enhanced IntelDX2 processors. Table 17-9. 5V DC Specifications Functional operating range: Vcc = 5V ± 0.25V; TCASE = O°C to +85°C Symbol Max Unit Input lOW Voltage -0.3 +0.8 V VIH Input HIGH Voltage 2.0 Vcc+0.3 V VOL Output lOW Voltage 0.45 V VOH Output HIGH Voltage V 2 Iccu UP# Active Supply Current 50 rnA 6 III Input leakage Current ±15 ""A 3· IIH Input leakage Current 200 300 ""A p.A 4 8 IlL Input leakage Current -400 p.A 5 ILO Output leakage Current ±15 p.A CIN Input Capacitance PGA PQFP 20 10 pF pF 7 Output or 1/0 Capacitance PGA PQFP 20 10 pF pF 7 ClK Capacitance PGA PQFP 20 6 pF pF 7 ISHL Bus Hold low Sustaining Current 33 p.A 9 ISHH Bus Hold High Sustaining Current -80 ISHLO Bus Hold low OverDrive™ Current· ISHHO Bus Hold High OverDrive Current VIL COUT CCLK I Parameter Min Typ 2.4 25 Notes 1 , p.A 10 330 p.A 11 -550 p.A 12 2-319 Intel486TM PROCESSOR FAMILY NOTES: 1. This parameter is measured at: Address, Data, BEn 4.0 mA Definition, Control 5.0 mA 2. This parameter is measured at: Address, Data, BEn -1.0 mA , Definition, Control - 0.9 mA 3. This parameter is for inputs without pull-ups or pull-downs and OV :s; VIN :s; Vee. 4. This' parameter is for inputs with pull-downs and VIH = 2.4V. 5. This parameter is for inputs with pull-ups and VIL = 0.45V. 6. When the processor is in Stop Grant state, the leeu of the host processor is less than 2 mAo 7. Fe= 1 MHz; Not 100% tested. B. This parameter is for inputs with pull-downs and VIH=2.4V. (SRESET pin only.) 9. This is the maximum current the bus hold circuit can sink without raising the node above VILmax. 
IBHL should be measured after lowering VIN to ground and then raising to VILmax (VIN = 0.8V). (Write-Back Enhanced IntelDX2 processor only.)
10. This is the maximum current the bus hold circuit can source without lowering the node voltage below VIHmin. IBHH should be measured after raising VIN to VCC (5V) and then lowering to VIHmin (VIN = 2.0V). (Write-Back Enhanced IntelDX2 processor only.)
11. An external driver must source at least IBHLO to switch this node from low to high (VIN ≥ 1.6V). (Write-Back Enhanced IntelDX2 processor only.)
12. An external driver must source at least IBHHO to switch this node from high to low (VIN ≤ 1.6V). (Write-Back Enhanced IntelDX2 processor only.)

Table 17-10. 5V ICC Values for Intel486™ SX Processor
Functional Operating Range: VCC = 5V ±0.25V; TCASE = 0°C to +85°C

Parameter                     Operating Frequency   Typ       Maximum   Notes
ICC Active (Power Supply)     25 MHz                          560 mA    1
                              33 MHz                          685 mA
ICC Active (Thermal Design)   25 MHz                378 mA    535 mA    2, 3, 4
                              33 MHz                497 mA    654 mA
ICC Stop Grant                25 MHz                35 mA     65 mA     5
                              33 MHz                40 mA     80 mA
ICC Stop Clock                0 MHz                 200 µA    2 mA      6

Table 17-11. 5V ICC Values for IntelSX2™ Processor
Functional Operating Range: VCC = 5V ±0.25V; TCASE = 0°C to +85°C

Parameter                     Operating Frequency   Typ       Maximum   Notes
ICC Active (Power Supply)     50 MHz                          855 mA    1
ICC Active (Thermal Design)   50 MHz                615 mA    815 mA    2, 3, 4
ICC Stop Grant                50 MHz                35 mA     70 mA     5
ICC Stop Clock                0 MHz                 200 µA    2 mA      6

Table 17-12. 5V ICC Values for Intel486™ DX Processor
Functional Operating Range: VCC = 5V ±0.25V; TCASE = 0°C to +85°C

Parameter                     Operating Frequency   Typ       Maximum   Notes
ICC Active (Power Supply)     33 MHz                          630 mA    1
                              50 MHz                          1000 mA
ICC Active (Thermal Design)   33 MHz                499 mA    602 mA    2, 3, 4
                              50 MHz                800 mA    956 mA
ICC Stop Grant                33 MHz                40 mA     80 mA     5
                              50 MHz                N/A       N/A       7
ICC Stop Clock                0 MHz                 200 µA    2 mA      6, 7

Table 17-13.
5V ICC Values for IntelDX2™ Processor
Functional Operating Range: VCC = 5V ±0.25V; TCASE = 0°C to +85°C

Parameter                     Operating Frequency   Typ       Maximum   Notes
ICC Active (Power Supply)     50 MHz                          950 mA    1
                              66 MHz                          1200 mA
ICC Active (Thermal Design)   50 MHz                680 mA    906 mA    2, 3, 4
                              66 MHz                901 mA    1145 mA
ICC Stop Grant                50 MHz                35 mA     70 mA     5
                              66 MHz                45 mA     90 mA
ICC Stop Clock                0 MHz                 200 µA    2 mA      6

Table 17-14. 5V ICC Values for Write-Back Enhanced IntelDX2™ Processor
Functional Operating Range: VCC = 5V ±0.25V; TCASE = 0°C to +85°C

Parameter                     Operating Frequency   Typ       Maximum   Notes
ICC Active (Power Supply)     50 MHz                          1025 mA   1
                              66 MHz                          1350 mA
ICC Active (Thermal Design)   50 MHz                659 mA    928 mA    2, 3, 4
                              66 MHz                872 mA    1287 mA
ICC Stop Grant                50 MHz                35 mA     70 mA     5
                              66 MHz                45 mA     90 mA
ICC Stop Clock                0 MHz                 200 µA    2 mA      6

NOTES FOR TABLES 17-10 THROUGH 17-14:
1. This parameter is for proper power supply selection. It is measured using the worst case instruction mix at VCC = 5.25V.
2. The maximum current column is for thermal design power dissipation. It is measured using the worst case instruction mix at VCC = 5V.
3. The typical current column is the typical operating current in a system. This value is measured in a system using a typical device at VCC = 5V, running Microsoft Windows 3.1 at an idle condition. This typical value is dependent upon the specific system configuration.
4. Typical values are not 100% tested.
5. The ICC Stop Grant specification refers to the ICC value once the Intel486 processor enters the Stop Grant or Auto HALT Power Down state.
6. The ICC Stop Clock specification refers to the ICC value once the processor enters the Stop Clock state. The VIH and VIL levels must be equal to VCC and 0V, respectively, in order to meet the ICC Stop Clock specifications.
7. The 50 MHz Intel486 DX does not implement SL Technology and cannot utilize Stop Grant or Stop Clock functions.
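Note 2 states that the thermal-design maximum current is measured at VCC = 5V, so a dissipation estimate follows from P = ICC × VCC. The sketch below applies that product to the maximum thermal-design currents of Tables 17-13 and 17-14. The wattages it prints are arithmetic on the tabulated currents, not figures quoted by the datasheet, and the dictionary layout and function name are hypothetical.

```python
# Maximum thermal-design currents in mA, copied from Tables 17-13 and 17-14.
# The key layout (processor name, operating frequency in MHz) is illustrative.
ICC_THERMAL_MAX_MA = {
    ("IntelDX2", 50): 906,
    ("IntelDX2", 66): 1145,
    ("Write-Back Enhanced IntelDX2", 50): 928,
    ("Write-Back Enhanced IntelDX2", 66): 1287,
}

VCC_THERMAL = 5.0  # volts; Note 2 gives VCC = 5V for the thermal maximum


def thermal_power_watts(processor, mhz):
    """Estimate dissipation in watts as ICC (A) times VCC (V) for one table entry."""
    icc_ma = ICC_THERMAL_MAX_MA[(processor, mhz)]
    return icc_ma * VCC_THERMAL / 1000.0


if __name__ == "__main__":
    for (name, mhz), icc_ma in sorted(ICC_THERMAL_MAX_MA.items()):
        watts = thermal_power_watts(name, mhz)
        print(f"{name} {mhz} MHz: {icc_ma} mA at {VCC_THERMAL} V = {watts:.2f} W")
```

The same product with the Power Supply row and VCC = 5.25V (Note 1) gives a figure for supply sizing rather than heat sinking; the two rows answer different design questions and should not be mixed.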
17.3.3 EXTERNAL RESISTORS RECOMMENDED TO MINIMIZE LEAKAGE CURRENTS FOR THE WRITE-BACK ENHANCED INTELDX2 PROCESSOR

The data bus and data parity pins of the Write-Back Enhanced IntelDX2 processor employ internal bus hold circuitry to maintain their previous logic level while in the Stop Grant state; external resistors are not required to prevent excessive current during the Stop Grant state for the Write-Back Enhanced IntelDX2 processor. See Table 17-15 for specifications of the bus hold circuitry. If resistors are present, they should be strong enough to "flip" the logic level of the bus hold circuitry to minimize any potential DC paths (i.e., leakage currents). If external resistors are not strong enough to "flip" the logic level, the estimated leakage current on the data bus and the data bus parity pins is approximately 2 mA.

According to section 9.6.2, "Pin State During Stop Grant," data pins must be driven low to achieve the lowest possible power consumption. If the Write-Back Enhanced IntelDX2 processor is installed in an existing Intel486 processor socket, and the socket contains existing pull-down resistors, these resistors should be changed to the suggested values listed in Table 17-16. The values listed are recommended to minimize leakage current.

Table 17-15. Write-Back Enhanced IntelDX2™ Processor DC Keeper Specifications

Parameter   Description                                  3.3V             5V               Condition
IBHL        Current Required to Sustain Bus Hold LOW     17 µA (max)      33 µA (max)      VIN @ 0.8V
IBHH        Current Required to Sustain Bus Hold HIGH    -20 µA (max)     -80 µA (max)     VIN @ 2.0V
IBHLO       Current Required to "Flip" Bus Hold HIGH     210 µA (min)     330 µA (min)     VIN ≥ 1.3V @ VCC = 3.3V; VIN ≥ 1.6V @ VCC = 5V
IBHHO       Current Required to "Flip" Bus Hold LOW      -350 µA (min)    -550 µA (min)    VIN ≤ 1.3V @ VCC = 3.3V; VIN ≤ 1.6V @ VCC = 5V

Table 17-16.
Write-Back Enhanced IntelDX2TM Processor Recommended Values for Bus Keeper Vee Calculation Suggested 3.3V 1.3V --=3.7KO 350/-LA 1.6V 550 /-LA = 2.9 KO 3 KO 5V I 2.7KO 2-323 Intel486TM PROCESSOR FAMILY· 17.4 AC Specifications The AC specifications given in the tables in this section consist of output delays, input setup requirements and input hold requirements. AIIAC specifications are relative to the rising edge of the input system clock (ClK) unless otherwise specified. 17.4.1 3.3V AC CHARACTERISTICS Table 17-17 is for 25- and 33-MHz Intel486 SX, 33-MHz Intel486 DX, 40-MHz IntelDX2 (20-MHz Max.), 50-MHz IntelDX2 (25-MHz Max.), 40-MHz Write-Back'Enhanced IntelDX2 (20-MHz Max.), and 50-MHz Write-Back Enhanced IntelDX2 Processors (25-MHz Max.). Table 17-17. 3.3V AC Characteristics Functional operating range: Vee = 3.3V ±0.3V; TeASE = Oo~ to +85°C; CL = 50 pF, unless otherwise specified. Bus Speed Symbol Parameter 20 MHz 25 MHz 33 MHz Unit Figure Notes Min Max Min Max Min Max Frequency 8 20 8 25 8 33 MHz t1 ClK Period ,50 125 40 125 30 125 ns 17-2 t1a ClK Period Stability ±250 ps 17-2 Adjacent clocks t2 ClK High Time 16 14 11 ns 17-2 at2V t3 ClKlowTime 16. 14 11 ns 17-2 atO.8V ~ ClKFallTime 6 4 3 ns 17-2 2Vto 0.8V. t5 ClK Rise Time 6 4 3 ns 17-2 0.8Vto 2V t6 A2-A31, PWT, PCD, BEO-3#, M/IO#, D/C#, W/R#, ADS#, lOCK #, BREQ, HlDA, CACHE # , HITM#, SMIACT#, FERR# Valid Delay 16 ns 17-6 l2 t7 A2-A31, PWT, PCD, BEO-3#, M/IO#, D/C#, W/R#,ADS#, lOCK #, BREQ, HlDA, CACHE# Float Delay 20 ns 17-7 3 ta PCHK# Valid Delay 3 22 ns 17-5 3 20 . ns 17-6 20 ns 17-7 taa BLASH, PLOCK # Valid Delay t9 BLAST#, PlOCK# Float Delay 2-324 ±250 3 23 ±250 3 37 3 3 28 28 37 19 3 28 3 3 24 24 28 1 3 I Intel486TM PROCESSOR FAMILY Table 17-17. 3.3V AC Characteristics (Continued) Functional operating range: Vcc = 3.3V ±0.3V; VCC5 = SV ±0.2SV (Note 1); TCASE CL = SO pF, unless otherwise specified. 
= O·C to +8S·C; Bus Speed Symbol Parameter 20 MHz 25 MHz 33 MHz Unit Figure Notes Min Max Min Max Min Max 3 26 3 20 3 19 ns 17-6 20 ns 17-7 3 t10 00-031,OPO-OP3 Write Data Valid Delay tl1 00-031,OPO-OP3 Write Data Float Delay t12 EAOS#, INV Setup Time 10 8 6 ns 17-3 6 t13 EAOS#, INV Hold Time 3 3 3 ns 17-3 6 t14 KEN#, BS16#, BSB#, WB/WT # Setup Time 10 8 6 ns 17-3 6 t15 KEN#, B516#, BSB#, WB/WT # Hold Time 3 3 3 ns 17-3 6 t16 ROY #, BROY # Setup Time 10 B 6 ns 17-4 t17 ROY #, BROY # Hold Time 3 3 3 ns 17-4 t18 HOLD, AHOLO Setup Time 12 10 6 ns 17-3 t18a BOFF # Setup Time 12 10 9 ns 17-3 t19 HOLD, AHOLO, BOFF # Hold Time 3 3 3 ns 17-3 t20 FLUSH #, A20M #, NMI, INTR, SMI#, STPCLK#, SRESET, RESET,IGNNE# Setup Time 12 10 6 ns 17-3 2 t21 FLUSH #, A20M #, NMI, INTR, SMI #, STPCLK#, SRESET, RESET, IGNNE# Hold Time 3 3 3 ns 17-3 2 t22 00-031, OPO-OP3, A4-A31 Read Setup Time 6 6 6 ns 17-3 17-4 t23 00-031,OPO-OP3, A4-A31 Read Hold Time 3 3 3 ns 17-3 17-4 I 37 28 2-32S Intel486TM PROCESSOR FAMIL V NOTES: 1. 2. 3. 4. 5. O-MHz operation is'guaranteed when the STPClK# and Stop Grant bus cycle protocol is used. IGNNE# and FERR# are present only in the Intel486 DX, Inte1DX2, and Write-Back Enhanced IntelDX2 processors. Not 100% tested, guaranteed by design characterization. All timing specifications assume CL = 50 pF. See capacitive derating charts for additional timing delays due to loading, A reset pulse width of 15 ClK cycles is required for warm resets (RESET or SRESET). Power-up resets (cold resets) require RESET to be asserted for at least 1 ms after Vee and ClK are stable. 6. CACHE#, WB/WT#, HITM#, and INV are present only in the Write-Back Enhanced IntelDX2 processor, VCC = Table 17-18. 
3.3V AC Characteristics for the 75/25-MHz IntelDX4TM Processor 3.3V ±0.3V; VCC5 = 5V ±0.25V (Note 1); TCASE == O·C to + 85·C; Cl = 50 pF Symbol Parameter ClKfrequency tl ClK Period Min Max Unit 8 25 MHz 40 125 ns ±250 ps Figure Notes 2 17-2 3,6 tla ClK Period Stability t2 ClK High Time 14 ns 17-2 at2V t3 ClK low Time 14 ns 17-2 at 0.8V t4 ClK Fall Time 4 ns 17-2 2VtoO.8V t5 ClK Rise Time 4 ns 17-2 0.8Vto2V t6 A2-A31, PWT, PCD,BEO-3#, MIIO#, D/C#, W/R#, ADS#, lOCK#, FERR#, BREQ, HlDA Valid Delay 19 ns 17-6 t7 A2-A31, PWT, PCD, BEO-3#, M/IO#, D/C#, W/R#, ADS#, lOCK# Float Delay 28 ns 17-7 ta PCHK# Valid Delay 3 24 ns 17-5 t8a BlAST#,PlOCK# SMIACT# Valid Delay 3 24 ns 17-6 tg BLAST #, PLOCK # Float Delay 28 ns 17-7 tl0 DO-D31, DPO-3 Write Data Valid Delay 20 ns 17-6 2-326 3 3 3 3 I Intel486TM PROCESSOR FAMILV Table 17-18. 3.3V AC Characteristics for the 75/25-MHz IntelDX4TM Processor (Continued) Vcc = 3.3V ±0.3V; VCC5 = 5V ±0.25V (Note 1); TCASE = O°C to + 85°C; Cl = 50 pF Symbol Parameter Min Max Unit Figure Notes 28 ns 17-7 3 t11 00-031, OPO-3 Write Data Float Oelay t12 EADS# Setup Time 8 ns 17-3 t13 EAOS# Hold Time 3 ns 17-3 t14 KEN#, BS16#, BS8# Setup Time 8 ns 17-3 t15 KEN#, BS16#, BS8# Hold Time 3 ns 17-3 t16 ROY #, BROY # Setup Time 8 ns 17-4 t17 RDY#, BROY# Hold Time 3 ns 17-4 t18 HOLD, AHOLD Setup Time 8 ns 17-3 t18a BOFF # Setup Time 8 ns 17-3 t19 HOLD, AHOLD, BOFF # Hold Time 3 ns 17-3 t20 RESET, FLUSH#, A20M#, NMI, INTR, IGNNE# SRESET, STPCLK#, SMI# Setup Time 8 ns 17-3 5 t21 RESET, FLUSH #, A20M #, NMI, INTR, IGNNE# SRESET, STPCLK#, SMI# Hold Time 3 ns 17-3 5 t22 00-031, OPO-3, A4-A31 Read Setup Time 5 ns 17-3, 17-4 t23 00-031, DPO-3, A4-A31 Read Hold Time 3 ns 17-3, 17-4 NOTES: 1. 2. 3. 4. 5. VCC5 should be connected to 3.3V ±0.3V in 3.3V-only systems. O-MHz operation is guaranteed when the STPClK # and Stop Grant Acknowledge protocol is used. Not 100% tested. Guaranteed by design characterization. All timing specifications assume CL = 50 pF. 
See capacitive derating charts for additional timing delays due to loading. A reset pulse widtli of 15 ClK cycles is required for warm resets (RESET or SRESET). Power-up resets (cold resets) require RESET to be asserted for at least 1 ms after Vcc and ClK are stable. 6. For adjacent clocks, assumes frequency of operation is constant. STPClK # input should be used to change frequency of operation. I 2-327 Intel486™ PROCESSOR FAMILY Table 17-19. 3.3V AC Characteristics for the 100/33-MHz IntelDX4TM Processors Vcc = 3.3V ±0.3V; VCC5 = 5V ±0.25V (Note 1); TCASE = O°C to + 85°C; CL = 50 pF Min Max Unit ClK Frequency 8 33. MHz t1 ClK Period 30 125 ns t1a ClK Period Stability ±250 ps t2 ClK High Time 11 ns 17-2 at2V t3 ClKlowTime 11 ns 17-2 at 0.8V t4 ClKFaliTime 3 ns 17-2 2VtoO.8V t5 ClK Rise Time 3 ns 17-2 0.8V to 2V t6 A2-A31, PWT, PCD, BEO-3#, M/IO#, D/C#, W/R#, ADS#, lOCK#, FERR#, BREQ, HlDA Valid Delay 14 ns 17-6 t7 A2-A31, PWT, PCD, BEO-3#, MIIO#, D/C#, W/R#, ADS#ilOCK# Float Delay 20 ns 17-7 ts PCHK# Valid Delay 3 14 ns 17-5 tSa BLAST#, PlOCK#, SMIACT# Valid Delay 3 14 ns 17-6 t9 BlAST#, PlOCK# Float Delay 20 ns 17-7 14 ns 17-6 20 ns 17-7 Symbol Parameter 3 Notes 2 17·2 3,6 t10 00-D31, DPO-3 Write. Data Valid Delay t11 DO-031, OPO-3 Write Data Float Delay t12 EADS# SetupTime 5 ns 17-3 t13 EADS# Hold Time 3 ns 17-3 t14 KEN#~ BS16#, BS8# Setup Time 5 ns 17-3 t15 KEN#, BS16#, BS8# Hold Time 3 ns 17-3 t16 RDY #, BRDY # Setup Time 5 ns 17-4 t17 RDY #, BRDY # Hold Time 3 ns 17-4 t1S HOLD, AHOlD Setup Time 6 ns 17-3 t1Sa BOFF # Setup Time 7 ns 17-3 t19 HOLD, AHOlD, BOFF # Hold Time 3 ns 17-3 2-328 3 Figure 3 3 3 I Intel486TM PROCESSOR FAMILY Table 17-19. 
3.3V AC Characteristics for the 100/33-MHz IntelDX4TM Processors (Continued) VCC = 3.3V ±0.3V; VCC5 Symbol = 5V ±0.25V (Note 1); TCASE Parameter = O°C to +85°C; Cl Min Max = 50 pF Unit Figure Notes t20 RESET, FLUSH#, A20M#, NMI, INTR, IGNNE#, SRESET, STPCLK#, SMI# Setup Time 5 ns 17-3 5 t21 RESET, FLUSH#, A20M#, NMI, INTR, IGNNE#, SRESET, STPCLK#, SMI# Hold Time 3 ns 17-3 5 t22 00-031, OPO-3, A4-A31 Read Setup Time 5 ns 17-3, 17-4 t23 00-031, OPO-3, A4-A31 Read Hold Time 3 ns 17-3, 17-4 NOTES: 1. 2. 3. 4. 5. VCC5 should be connected to 3.3V ±0.3V in 3.3V-only systems. O-MHz operation is guaranteed when the STPClK # and Stop Grant Acknowledge protocol is used. Not 100% tested. Guaranteed by design characterization. All timing specifications assume CL = 50 pF. See capacitive derating charts for additional timing delays due to loading. A reset pulse width of 15 ClK cycles is required for warm resets (RESET or SRESET). Power·up resets (cold resets) require RESET to be asserted for at least 1 ms after VCC and ClK are stable. 6. For adjacent clocks, assumes frequency of operation is constant. STPClK # input should be used to change frequency of operation. I 2-329 Intel486TM PROCESSOR FAMIL V Table 17-20. 3.3V AC Characteristics for the 100/S0-MHz IntelDX4™ Processor Vcc = 3.3V ±0.3V; VCC5 = 5V ±0.25V (Note 1); TCASE = O°C to +85°C; CL = 0 pF Symbol Parameter Min Max Unit ClK Frequency 16 50 MHz t1 ClK Period 20 62.5 ns t1a ClK Period Stability ±250 ps t2 ClK High Time 7 ns 17-2 at2V t3 ClK low Time 7 ns 17-2 at 0.8V 4 ClK Fall Time 2 ns 17-2 2Vto 0.8V t5 ClK Rise Time 2 ns 17-2 0.8Vto 2V tSa A20-A31, PWT, PCD, BEO-3#, M/IO#, D/C#, W/R#, ADS#, lOCK#, FERR#, BREQ, HlDA Valid Delay 2 12 ns 17-6 tSb A2-A19 2 10.5 ns 17-6 t7 A2-A31, PWT, PCD, BEO-3#, M/IO#, D/C#, W/R#, ADS#, lOCK# Float Delay 18 ns 17-7 t8 PCHK# Valid Delay 2 14 ns 17-5 t8a . 
BLAST # , PlOCK#, SMIACT# Valid Delay 2 12 ns 17-6 t9 BlAST#, PlOCK# Float Delay 18 ns 17-7 tlO DO-D31, DPO-3 Write Data Valid Delay 12 ns 17-6 t11 DO-D31, DPO-3 Write Data Float Delay 18 ns 17-7 t12 EADS# Setup Time 5 ns 17-3 t13 EADS# Hold Time 2 ns 17-3 t14 KEN#, BS16#, BS8# Setup Time 5 ns 17-3 t15 KEN#, BS16#, BS8# Hold Time 2 ns 17-3 t16 RDY#, BRDY# Setup Time 5 ns 17-4 t17 RDY #, BRDY # Hold Time 2 ns 17-4 t18 . HOLD, AHOlD, Setup Time 5 ns 17-3 , 3 Figure 2 17-2 3,6 t18a BOFF # Setup Time 5 ns 17-3 t19 HOLD, AHOlD, BOFF # Hold Time 2 ns 17-3 2-330 Notes 3 3 3 I Intel486™ PROCESSOR FAMILV Table 17-20. 3.3V AC Characteristics for the 100/S0-MHz IntelDX4TM Processor (Continued) VCC = 3.3V ±0.3V; VCC5 = 5V ±0.25V (Note 1); TCASE = O°C to + 85°C; Cl = 0 pF Symbol Parameter Min Max Unit Figure Notes t20 RESET, FLUSH#, A20M#, NMI, INTR, IGNNE #, SRESET, STPCLK #, SMI # Setup Time 5 ns 17-3 5 t21 RESET, FLUSH #, A20M #, NMI, INTR, IGNNE#, SRESET, STPCLK#, SMI# Hold Time. 2 ns 17-3 5 t22 00-031, OPO-3, A4-A31 Read Setup Time 4 ns 17-3, 17-4 t23 00-031, OPO-3, A4-A31 Read Hold Time 2 ns 17-3, 17-4 NOTES: 1. 2. 3. 4. VCC5 should be connected to 3.3V ±0.3V in 3.3V-only systems. O-MHz operation is guaranteed when the STPClK# and Stop Grant Acknowledge protocol is used. Not 100% tested. Guaranteed by design characterization. All timing specifications assume CL = 0 pF. 1/0 buffer modeling should be used to calculate additional timing delays due to loading. 5. A reset pulse width of 15 ClK cycles is required for warm resets (RESET or SRESET). Power-up resets (cold resets) require RESET to be asserted for at least 1 ms after Vcc and ClK are stable. 6. For adjacent clocks, assumes frequency of operation is constant. STPClK # input should be used to change frequency of operation. I 2-331 Intel486TM PROCESSOR FAMILY Table 17-21. 
3.3V Intel486 Processor AC Specifications for the Test Access Port (All Intel486 Processors and Frequencies except the IntelDX4™ Processors)
VCC = 3.3V ±0.3V; TCASE = 0°C to +85°C; CL = 50 pF

Symbol   Parameter                              Min   Max   Unit   Notes
t24      TCK Frequency                                8     MHz    1
t25      TCK Period                             125         ns
t26      TCK High Time                          40          ns     at 2V
t27      TCK Low Time                           40          ns     at 0.8V
t28      TCK Rise Time                                8     ns     2
t29      TCK Fall Time                                8     ns     2
t30      TDI, TMS Setup Time                    8           ns     3
t31      TDI, TMS Hold Time                     10          ns     3
t32      TDO Valid Delay                        3     30    ns     3
t33      TDO Float Delay                              36    ns     3
t34      All Outputs (Non-Test) Valid Delay     3     30    ns     3
t35      All Outputs (Non-Test) Float Delay           36    ns     3
t36      All Inputs (Non-Test) Setup Time       8           ns     3
t37      All Inputs (Non-Test) Hold Time        10          ns     3

NOTES:
1. TCK period ≥ CLK period.
2. Rise/Fall times are measured between 0.8V and 2.0V. Rise/Fall times can be relaxed by 1 ns per 10-ns increase in TCK period.
3. Parameters t30-t37 are measured from TCK.
4. Refer to Figure 17-18 for signal waveforms.

Table 17-22. 3.3V IntelDX4™ Processor AC Specifications for the Test Access Port (All IntelDX4 Processor Frequencies)
VCC = 3.3V ±0.3V; VCC5 = 5V ±0.25V (Note 1); TCASE = 0°C to +85°C; CL = 0 pF

Symbol   Parameter                              Min   Max   Unit   Figure
t24      TCK Frequency                                25    MHz
t25      TCK Period                             40          ns
t26      TCK High Time                          10          ns
t27      TCK Low Time                           10          ns
t28      TCK Rise Time                                4     ns
t29      TCK Fall Time                                4     ns
t30      TDI, TMS Setup Time                    8           ns     17-8
t31      TDI, TMS Hold Time                     7           ns     17-8
t32      TDO Valid Delay                        3     25    ns     17-8
t33      TDO Float Delay                              30    ns
t34      All Outputs (Non-Test) Valid Delay     3     25    ns     17-8
t35      All Outputs (Non-Test) Float Delay           36    ns     17-8
t36      All Inputs (Non-Test) Setup Time       8           ns     17-8
t37      All Inputs (Non-Test) Hold Time        7           ns     17-8

NOTES:
1. VCC5 should be connected to 3.3V ±0.3V in 3.3V-only systems.
2. All inputs and outputs are TTL level.
3. Rise/Fall times are measured between 0.8V and 2.0V. Rise/Fall times can be relaxed by 1 ns per 10-ns increase in TCK period.
4. TCK period ≥ CLK period.
5. Parameters t30-t37 are measured from TCK.
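The notes to Tables 17-21 and 17-22 allow the TCK rise and fall times to be relaxed by 1 ns for each 10-ns increase in TCK period. The sketch below turns that rule into a small calculation. It assumes the relaxation is granted only for each complete 10-ns step above the minimum period, which is a conservative reading; the datasheet does not say how partial steps are treated. The function name and argument order are hypothetical.

```python
def max_tck_edge_time_ns(tck_period_ns, min_period_ns, base_edge_ns):
    """Return the allowed TCK rise/fall time in ns for a given TCK period.

    min_period_ns and base_edge_ns are the t25 minimum and the t28/t29
    maximum from the relevant table (125 and 8 for Table 17-21,
    40 and 4 for Table 17-22). All arguments are whole nanoseconds.
    """
    if tck_period_ns < min_period_ns:
        raise ValueError("TCK period is shorter than the specified minimum")
    # Assumption: only complete 10-ns increases earn 1 ns of relaxation.
    full_steps = (tck_period_ns - min_period_ns) // 10
    return base_edge_ns + full_steps


if __name__ == "__main__":
    # IntelDX4 processor (Table 17-22): 4 ns at the 40-ns minimum period.
    print(max_tck_edge_time_ns(40, 40, 4))
    # The same part clocked with a 100-ns TCK period: six full 10-ns steps.
    print(max_tck_edge_time_ns(100, 40, 4))
    # Other Intel486 processors (Table 17-21) with a 200-ns TCK period.
    print(max_tck_edge_time_ns(200, 125, 8))
```

Integer nanoseconds keep the step count exact; a board-level timing budget would normally carry this result alongside the setup and hold figures t30 through t37.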
I 2-333 Intel486TM PROCESSOR FAMILY 17.4.2 5V AC CHARACTERISTICS Table 17-23 is for 25- and 33-MHz Intel486™ SX, 33-MHz Intel486 DX, 50-MHz IntelSX2TM (25-MHz Max.), 50-MHz IntelDX2TM (25-MHz Max.), 66-MHz IntelDX2 (33-MHz Max.), 50-MHz Write-Back Enhanced IntelDX2 (25-MHz Max.) ar)d 66-MHz Write-Back Enhanced IntelDX2 (33-MHz Max.) processors. Functional operating range: Vee specified. (See also Table 17-24). Table 17-23. 5V AC Characteristics + 85°C; CL = 5V ±0.25V; TeASE = O°C to = 50 pF unless otherwise Bus Speed Symbol Parameter 25 MHz 33 MHz Unit Figure Notes Min Max Min Max Frequency 8 25 8 33 MHz t1 ClK Period 40 125 30 125 ns 17-2 t1 a ClK Period Stability ±250 ps 17-2 Adjacent clocks t2 ClK High Time 14 11 ns 17-2 at2V t3 ClK low Time 14 11 ns 17-2 atO.8V t4 ClK Fall Time 4 3 ns 17-2 2Vto 0.8V ts ClK Rise Time 4 3 ns 17-2 0.8Vto 2V ts A2-A31, PWT, PCD, BEO-3#, M/IO#, D/C#, W/R#, ADS#, lOCK#, BREQ, HlDA, SMIACT#, FERR#, CACHE#, HITM# Valid Delay 16 ns 17-6 3,4 t7 A2-A31, PWT, PCD, BEO-3#, M/IO#, D/C#, W/R#, ADS#, lOCK#,BREQ,HlDA,CACHE# Float Delay 20 ns 17-7 2,4 . ts PCHK# Valid Delay 3 24 3 22 ns 17-5 tSa BLAST #, PLOCK # Valid Delay 3 24 3 20 ns 17-6 t9 BLAST #, PLOCK # Float Delay 20 ns 17-7 t10 DO- D31, DPO- DP3 Write Data Valid Delay 18 ns 17-6 t11 DO-D31, DPO-DP3 Write Data Float Delay 20 ns 17-7 2 t12 EADS#, INV Setup Time 8 5 ns 17-3 4 t13 EADS#, INV Hold Time 3 3 ns 17-3 4 t14 KEN#, BS16#, BS8#, WB/WT# Setup Time 8 5 ns 17-3 4 2-334 ±250 3 19 3 28 28 3 20 3 28 1 2 I Intel486TM PROCESSOR FAMILY Table 17-23. 5V AC Characteristics (Continued) Functional operating range: VCC = 5V ± 0.25V; T CASE = O°C to + 85°C; CL specified. (See also Table 17-24). 
= 50 pF unless otherwise Bus Speed Symbol Parameter 25 MHz Min Max 33 MHz Min Unit Figure Notes 4 Max t15 KEN#, BS16#, BS8#, WB/WT# Hold Time 3 3 ns 17-3 t16 ROY #, BROY # Setup Time 8 5 ns 17-4 t17 ROY #, BROY # Hold Time 3 3 ns 17-4 t18 HOLO, AHOLO Setup Time 10 6 ns 17-3 t18a BOFF# Setup Time 10 8 ns 17-3 t19 HOLO, AHOLO, BOFF # Hold Time 3 3 ns 17-3 t20 FLUSH#, A20M#, NMI, INTR, SMI#, STPCLK#, SRESET, RESET, IGNNE# Setup Time 10 5 ns 17-3 3 t21 FLUSH#, A20M#, NMI, INTR, SMI#, STPCLK#, SRESET, RESET, IGNNE# Hold Time 3 3 ns 17-3 3 t22 00-031, OPO-OP3, A4-A31 Read Setup Time 5 5 ns 17-3, 17-4 t23 00-031, OPO-OP3, A4-A31 Read Hold Time 3 3 ns 17-3, 17-4 NOTES: 1. 2. 3. 4. I O-MHz operation is guaranteed when the STPCLK # and Stop Grant bus cycle protocol is used. Not 100% tested, guaranteed by design characterization. IGNNE# and FERR# are present only in the Intel486 OX, Inte10X2, and Write-Back Enhanced IntelDX2 processors. CACHE#, WB/WT#, HITM#, and INV are present only in the Write-Back Enhanced IntelOX2 processor. 2-335 Intel486TM PROCESSOR FAMILY The following specifications are different for existing Inte1SX2, IntelDX2 and Write-Back Enhanced IntelDX2 processors. A system board that will support all of the Intel486 processors should be designed to the worstcase specifications of the 25- and 33-MHz local bus timings. Table 17-24 is for 50-MHz IntelSX2TM (25-MHz Max.), 50-MHz IntelDX2TM (25-MHz Max.), 66-MHz IntelDX2 (33-MHz Max.), 50-MHz Write-Back Enhanced IntelDX2 (25-MHz Max.) and 66-MHz Write-Back Enhanced IntelDX2 (33-MHz Max.) processors. Table 17-24. 5V AC Characteristics Functional operating range: VCC = 5V ±0.25V; TCASE = O·C to +85·C; CL = 50 pF unless otherwise specified. (See also Table 17-23). 
Symbol  Parameter                                  25 MHz       33 MHz       Unit  Notes
                                                   Min   Max    Min   Max
-       Frequency                                  -     50     -     66     MHz
-       CLK Frequency                              8     25     8     33     MHz
t6      A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#,     -     -      -     14     ns    Notes 3, 4
        W/R#, ADS#, LOCK#, BREQ, HLDA, SMIACT#,
        FERR#, CACHE#, HITM# Valid Delay
t8      PCHK# Valid Delay                          -     -      -     14     ns
t8a     BLAST#, PLOCK# Valid Delay                 -     -      -     14     ns
t10     D0-D31, DP0-DP3 Write Data Valid Delay     -     -      -     14     ns
t18     HOLD, AHOLD Setup Time                     8     -      -     -      ns
t18a    BOFF# Setup Time                           8     -      7     -      ns
t20     FLUSH#, A20M#, NMI, INTR, SMI#, STPCLK#,   8     -      -     -      ns
        SRESET, RESET, IGNNE# Setup Time

NOTES:
1. 0-MHz operation is guaranteed when the STPCLK# and Stop Grant bus cycle protocol is used.
2. Not 100% tested, guaranteed by design characterization.
3. IGNNE# and FERR# are present in the IntelDX2 and Write-Back Enhanced IntelDX2 processors only.
4. CACHE#, WB/WT#, HITM#, and INV are present only in the Write-Back Enhanced IntelDX2 processor.

Table 17-25. 5V AC Characteristics 50-MHz Intel486™ DX Processors

Functional operating range: VCC = 5V ±0.25V; TCASE = 0°C to +85°C; CL = (Note 1).
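The Frequency and CLK Period rows of these AC tables express the same limit in two ways, and Table 17-24 lists the processor frequency next to the CLK frequency for the clock-doubled parts. A minimal sketch of both relationships follows. The helper names are hypothetical, and the multiplier of 2 is an assumption inferred from the 50/25-MHz and 66/33-MHz pairings in the table.

```python
def clk_period_ns(freq_mhz):
    """CLK period in nanoseconds for a CLK frequency given in MHz."""
    if freq_mhz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / freq_mhz


def core_frequency_mhz(clk_mhz, multiplier=2):
    """Nominal internal frequency for a clock-multiplied part.

    multiplier=2 is an assumption based on the 50/25 and 66/33 MHz
    pairings; it is not a value read from a specific table cell.
    """
    return clk_mhz * multiplier


print(clk_period_ns(25), clk_period_ns(8))    # 25-MHz bus: fastest and slowest CLK
print(clk_period_ns(50), clk_period_ns(16))   # 50-MHz Intel486 DX limits
print(core_frequency_mhz(25), core_frequency_mhz(33))
```

The 33-MHz tables list a 30 ns minimum period, which corresponds to a nominal 33.33-MHz clock rather than exactly 33 MHz.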
Symbol  Parameter                                  Min    Max    Unit  Figure      Notes
-       Frequency                                  16     50     MHz   -
t1      CLK Period                                 20     62.5   ns    17-2
t1a     CLK Period Stability                       -      ±250   ps    17-2        Adjacent clocks
t2      CLK High Time                              7      -      ns    17-2        at 2V
t3      CLK Low Time                               7      -      ns    17-2        at 0.8V
t4      CLK Fall Time                              -      2      ns    17-2        2V to 0.8V
t5      CLK Rise Time                              -      2      ns    17-2        0.8V to 2V
t6      A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#,     3      12     ns    17-6
        W/R#, ADS#, LOCK#, BREQ, HLDA, FERR#
        Valid Delay
t7      A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#,     -      18     ns    17-7        Note 2
        W/R#, ADS#, LOCK#, BREQ, HLDA Float Delay
t8      PCHK# Valid Delay                          3      14     ns    17-5
t8a     BLAST#, PLOCK# Valid Delay                 3      12     ns    17-6
t9      BLAST#, PLOCK# Float Delay                 -      18     ns    17-7        Note 2
t10     D0-D31, DP0-DP3 Write Data Valid Delay     3      12     ns    17-6
t11     D0-D31, DP0-DP3 Write Data Float Delay     -      18     ns    17-7        Note 2
t12     EADS# Setup Time                           5      -      ns    17-3
t13     EADS# Hold Time                            2      -      ns    17-3
t14     KEN#, BS16#, BS8# Setup Time               5      -      ns    17-3
t15     KEN#, BS16#, BS8# Hold Time                2      -      ns    17-3
t16     RDY#, BRDY# Setup Time                     5      -      ns    17-4
t17     RDY#, BRDY# Hold Time                      2      -      ns    17-4
t18     HOLD, AHOLD, BOFF# Setup Time              5      -      ns    17-3
t19     HOLD, AHOLD, BOFF# Hold Time               2      -      ns    17-3
t20     FLUSH#, A20M#, NMI, INTR, RESET, IGNNE#    5      -      ns    17-3
        Setup Time
t21     FLUSH#, A20M#, NMI, INTR, RESET, IGNNE#    2      -      ns    17-3
        Hold Time
t22     D0-D31, DP0-DP3, A4-A31 Read Setup Time    4      -      ns    17-3, 17-4
t23     D0-D31, DP0-DP3, A4-A31 Read Hold Time     2      -      ns    17-3, 17-4

NOTES:
1. Specifications assume CL = 0 pF. I/O buffer model must be used to determine delays due to loading (trace and component). First order I/O buffer models for the Intel486 processor are available. Contact Intel for the latest release.
2. Not 100% tested. Guaranteed by design characterization.

Table 17-26.
5V Intel486 Processor AC Specifications for the Test Access Port (All Processors and Frequencies)

VCC = 5V ±0.25V; TCASE = 0°C to +85°C; CL = 50 pF

Symbol  Parameter                               Min    Max    Unit  Notes
t24     TCK Frequency                           -      8      MHz   Note 1
t25     TCK Period                              125    -      ns
t26     TCK High Time                           40     -      ns    at 2V
t27     TCK Low Time                            40     -      ns    at 0.8V
t28     TCK Rise Time                           -      8      ns    Note 2
t29     TCK Fall Time                           -      8      ns    Note 2
t30     TDI, TMS Setup Time                     8      -      ns    Note 3
t31     TDI, TMS Hold Time                      10     -      ns    Note 3
t32     TDO Valid Delay                         3      30     ns    Note 3
t33     TDO Float Delay                         -      36     ns    Note 3
t34     All Outputs (Non-Test) Valid Delay      3      30     ns    Note 3
t35     All Outputs (Non-Test) Float Delay      -      36     ns    Note 3
t36     All Inputs (Non-Test) Setup Time        8      -      ns    Note 3
t37     All Inputs (Non-Test) Hold Time         10     -      ns    Note 3

NOTES:
1a. VCC should be connected to 3.3V ±0.3V in 3.3V-only systems.
1. TCK period ≤ CLK period.
2. Rise/Fall times are measured between 0.8V and 2.0V. Rise/Fall times can be relaxed by 1 ns per 10-ns increase in TCK period.
3. Parameters t30-t37 are measured from TCK.
4. Refer to Figure 17-18 for signal waveforms.

[Figure: CLK waveform, with the 1.5V reference level and timing parameters marked.]

Figure 17-2. CLK Waveforms

[Figure: setup and hold timing relative to CLK for INV, EADS#; BS8#, BS16#, WB/WT#, KEN#; BOFF#, AHOLD, HOLD; RESET, FLUSH#, A20M#, IGNNE#, INTR, NMI; and A4-A31 (read).]

Figure 17-3. Input Setup and Hold Timing

[Figure: setup and hold timing relative to CLK for RDY#, BRDY# and D0-D31, DP0-DP3.]

Figure 17-4. Input Setup and Hold Timing

[Figure: CLK, BRDY#/RDY#, D0-D31, DP0-DP3 and PCHK#.]

Figure 17-5. PCHK# Valid Delay Timing

[Figure: valid delay relative to CLK for A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#, ADS#, LOCK#, FERR#, BREQ, HLDA, CACHE#, HITM#; D0-D31, DP0-3; and BLAST#, PLOCK#.]

Figure 17-6. Output Valid Delay Timing
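Note 2 of Table 17-26 lets the TCK rise and fall limits (t28, t29) grow as the TCK period increases. The sketch below applies that rule starting from the 8 ns limit at the 125 ns minimum period. It assumes the relaxation scales linearly with the period increase; the note could also be read as applying only per full 10 ns step, so treat the proportional form and the function name as assumptions.

```python
BASE_TCK_PERIOD_NS = 125.0  # t25: minimum TCK period (8 MHz)
BASE_RISE_FALL_NS = 8.0     # t28/t29: limit at the minimum TCK period


def max_tck_rise_fall_ns(tck_period_ns):
    """Relaxed TCK rise/fall limit: +1 ns per 10 ns of added TCK period.

    Assumes linear (proportional) relaxation, which is one reading of
    Note 2 and is not confirmed by the table itself.
    """
    if tck_period_ns < BASE_TCK_PERIOD_NS:
        raise ValueError("TCK period is below the 125 ns minimum")
    extra_period = tck_period_ns - BASE_TCK_PERIOD_NS
    return BASE_RISE_FALL_NS + extra_period / 10.0


for period in (125, 145, 225):
    print(period, max_tck_rise_fall_ns(period))
```

A slower test clock therefore tolerates proportionally slower TCK edges, which can matter when TCK is driven through long cabling or a low-drive test adapter.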
[Figure: float delay relative to CLK for A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#, ADS#, LOCK#, FERR#, BREQ, HLDA, CACHE#; D0-D31, DP0-3; and BLAST#, PLOCK#.]

Figure 17-7. Maximum Float Delay Timing

[Figure: TCK, TDI/TMS, TDO, output signals and input signals, with parameters t34, t35, t36 and t37 marked.]

Figure 17-8. Test Signal Timing Diagram

17.5 Capacitive Derating Curves

The capacitive derating curves illustrate output delay versus capacitive load for 3.3V and 5V Intel486 processors. The derating curves show the delays for the rising and falling edges under worst-case conditions. Figure 17-9 and Figure 17-10 apply to all 3.3V Intel486 SX, Intel486 DX, and IntelDX2 processors. Figure 17-11 and Figure 17-12 apply to 5V Intel486 DX and IntelDX2 processors. Figure 17-13 and Figure 17-14 apply to 5V Intel486 SX and IntelSX2 processors. Figures 17-15 through 17-17 apply to the IntelDX4 processor. The figures apply to all frequencies specified for each corresponding product. Refer to Appendix C for bus frequencies above 33 MHz for Intel486 processors.

[Figure: 3.3V Intel486™ SX, Intel486 DX, and IntelDX2™ Processors (Rising). Delay (nom-2 to nom+7) versus Capacitive Load (pF), 25 to 150.]

NOTE: This graph will not be linear outside of the capacitive range shown. nom = nominal value from the AC Characteristics table.

Figure 17-9. Typical Loading Delay versus Load Capacitance under Worst-Case Conditions for a Low-to-High Transition

[Figure: 3.3V Intel486™ SX, Intel486 DX, and IntelDX2™ Processors (Falling). Delay (nom-2 to nom+5) versus Capacitive Load (pF), 25 to 150.]

NOTE: This graph will not be linear outside of the capacitive range shown. nom = nominal value from the AC Characteristics table.

Figure 17-10.
Typical Loading Delay versus Load Capacitance under Worst-Case Conditions for a High-to-Low Transition

[Figure: 5V Intel486™ DX and IntelDX2™ Processors (Rising). Delay (nom-1 to nom+4) versus Capacitive Load (pF), 25 to 150.]

NOTE: This graph will not be linear outside of the capacitive range shown. nom = nominal value from the AC Characteristics table.

Figure 17-11. Typical Loading Delay versus Load Capacitance under Worst-Case Conditions for a Low-to-High Transition

[Figure: 5V Intel486™ DX and IntelDX2™ Processors (Falling). Delay (nom-2 to nom+7) versus Capacitive Load (pF), 25 to 150.]

NOTE: This graph will not be linear outside of the capacitive range shown. nom = nominal value from the AC Characteristics table.

Figure 17-12. Typical Loading Delay versus Load Capacitance under Worst-Case Conditions for a High-to-Low Transition

[Figure: 5V Intel486™ SX and IntelSX2™ Processors (Rising). Delay (nom-2 to nom+5) versus Capacitive Load (pF), 25 to 150.]

NOTE: This graph will not be linear outside of the capacitive range shown. nom = nominal value from the AC Characteristics table.

Figure 17-13. Typical Loading Delay versus Load Capacitance under Worst-Case Conditions for a Low-to-High Transition

[Figure: 5V Intel486™ SX and IntelSX2™ Processors (Falling). Delay (nom-2 to nom+7) versus Capacitive Load (pF), 25 to 150.]

NOTE: This graph will not be linear outside of the capacitive range shown. nom = nominal value from the AC Characteristics table.

Figure 17-14.
Typical Loading Delay versus Load Capacitance under Worst-Case Conditions for a High-to-Low Transition

[Figure: derating curve, delay (nom-2 to nom+6) versus Capacitive Load (pF), 25 to 150.]

Figure 17-15. IntelDX4™ Processor Capacitive Derating Curve for High-to-Low Transitions (3V Signals)

[Figure: derating curve, delay (nom-4 to nom+12) versus Capacitive Load (pF), 25 to 150.]

Figure 17-16. IntelDX4™ Processor Capacitive Derating Curve for Low-to-High Transitions (5V Signals)

[Figure: derating curve, delay (nom-2 to nom+6) versus Capacitive Load (pF), 25 to 150.]

Figure 17-17. IntelDX4™ Processor Capacitive Derating Curve for Low-to-High Transitions (3V/5V Signals)

18.0 MECHANICAL DATA

This section describes the package dimensions and thermal specifications for all processors in the Intel486 processor family.

NOTE: For further details about thermal and mechanical package specifications and methodologies, refer to the 1994 Packaging Handbook (order number 240800).

18.1 Intel486 Processor Package Dimensions

The processor dimensions are listed in the following order:
• 168-pin PGA package;
• 208-lead SQFP package;
• 196-lead PQFP package.

18.1.1 168-PIN PGA PACKAGE
[Figure: 168-pin ceramic PGA outline drawing showing the pin grid, pin C3, seating plane, base plane, swaged pin detail (4 places) and 45° chamfer (index corner).]

Figure 18-1. 168-Pin Ceramic PGA Package Dimensions

Table 18-1. 168-Pin Ceramic PGA Package Dimensions

Family: Ceramic Pin Grid Array Package

Symbol  Millimeters                  Inches
        Min     Max     Notes        Min     Max     Notes
A       3.56    4.57    SOLID LID    0.140   0.180   SOLID LID
A1      0.64    1.14    SOLID LID    0.025   0.045   SOLID LID
A2      2.8     3.5                  0.110   0.140
A3      1.14    1.40                 0.045   0.055
B       0.43    0.51                 0.017   0.020
D       44.07   44.83                1.735   1.765
D1      40.51   40.77                1.595   1.605
e1      2.29    2.79                 0.090   0.110
L       2.54    3.30                 0.100   0.130
N       168                          168
S1      1.52    2.54                 0.060   0.100

ISSUE IWS REV X 7/15/88

Table 18-2. Ceramic PGA Package Dimension Symbols

Letter or Symbol  Description of Dimensions
A                 Distance from seating plane to highest point of body
A1                Distance between seating plane and base plane (lid)
A2                Distance from base plane to highest point of body
A3                Distance from seating plane to bottom of body
B                 Diameter of terminal lead pin
D                 Largest overall package dimension of length
D1                A body length dimension, outer lead center to outer lead center
e1                Linear spacing between true lead position centerlines
L                 Distance from seating plane to end of lead
S1                Other body dimension, outer lead center to edge of body

NOTES:
1. Controlling dimension: millimeter.
2. Dimension "e1" ("e") is non-cumulative.
3. Seating plane (standoff) is defined by P.C. board hole size: 0.0415-0.0430 inch.
4. Dimensions "B", "B1" and "C" are nominal.
5. Details of Pin 1 identifier are optional.

18.1.2 208-LEAD SQFP PACKAGE

[Figure: 208-lead SQFP outline drawing. Units: mm. Legible dimensions include 30.60 ±0.25, 1.14 (ref), 0.40 min, 0.60 ±0.10, 1.30 ref, 3.70 max, lead angle 0° min to 7° max, and a 0.10 max tolerance window for lead skew from theoretical true position. Note: length measurements same as width measurements.]

Figure 18-2.
208-Lead SQFP Package Dimensions

18.1.3 196-LEAD PQFP PACKAGE

[Figure: 196-lead PQFP package outline drawing.]

NOTE: Interpret dimensions and tolerances in accordance with ANSI Y14.5M-1982.

Figure 18-3. Principal Dimensions and Data for 196-Lead Plastic Quad Flat Pack Package

Table 18-3. Symbol List and Dimensions for 196-Lead PQFP Package

Symbol  Description of Dimensions                                           Min     Max
A       Package Height: Distance from the seating plane to the highest     0.160   0.175
        point of body.
A1      Standoff: The distance from the seating plane to the base plane.   0.020   0.035
D, E    Overall Package Dimension: Lead tip to lead tip.                   1.470   1.485
D1, E1  Plastic Body Dimension                                             1.347   1.353
D2, E2  Bumper Distance, Without FLASH                                     1.497   1.503
        Bumper Distance, With FLASH                                        1.497   1.510
CP      Seating Plane Coplanarity                                          0.000   0.004

NOTES:
1. All dimensions and tolerances conform to ANSI Y14.5M-1982.
2. Dimensions are in inches.

[Figures: IBIS I/O Buffer Information Sheets for the IntelDX4(TM) Processor A-3, 33/50 MHz bus. Each sheet contains LOW (I_ol versus V_ol) and HIGH (I_oh versus V_oh) V-I curves with min, typ and max data; Beyond The Rail Info (diode to GND, diode to Vcc); Packaging Characteristics (R_pkg, L_pkg, C_pkg, C_comp); Ramp Rate (into 0 pF, no package); and Simplified Output Resistance (Low, High). Sheets are given for these signal groups:
  Output: [...] FERR#, HLDA, PLOCK#, LOCK#
  Output: BE<3:0>#, BREQ, D/C#, M/IO#, PWT, PCD, PCHK#, W/R#
  I/O: [...] DP<2:0>
  I/O: DBUS<31:17>, DP<3>
  I/O: ABUS<31:4>
Sheet notes: vcc = voltage at pin. Intel does not guarantee diode operation for purposes other than ESD protection. This information is for modeling purposes only and is not guaranteed.]

Sample Text Listing of IBIS Files for IntelDX4 Processor

|************************************************************************
|
[IBIS Ver]      1.1
[File name]     inteldx4pg.ibs
[File Rev]      2.0
[Date]          3/23/94
[Source]        File originated at Intel Corporation
[Notes]         The following information corresponds to the INTELDX4(TM)
                processor and has been correlated with silicon.
                This file is for the PGA package only.
|
[Disclaimer]    IntelDX4 processor
                This information is for modeling purposes only, and is not
                guaranteed.
|
|************************************************************************
|
[Component]     INTELDX4 PROCESSOR
[Manufacturer]  Intel
[Package]
|               typ        min       max
R_pkg           2329m      728m      3930m
L_pkg           17.79nH    8.56nH    27.01nH
C_pkg           6.03pF     1.89pF    10.16pF
|
|************************************************************************
|
[Pin]  signal_name  model_name   R_pin    L_pin     C_pin
A01    D20          I/O2         1866m    15.54n    3.88p
A02    D22          I/O2         1808m    13.70n    5.60p
A04    D23          I/O2         1468m    13.33n    3.14p
A05    DP3          I/O2         1406m    11.59n    4.50p
A06    D24          I/O2         1412m    13.02n    3.04p
A08    D29          I/O2         1274m    10.90n    4.14p
A15    IGNNE#       Input1       1858m    13.96n    5.74p
A16    INTR         Input1       1956m    14.47n    6.00p
A17    AHOLD        Input1       3414m    22.11n    9.99p
B01    D19          I/O2         1762m    13.46n    5.47p
B02    D21          I/O2         1636m    14.27n    3.45p
B06    D25          I/O2         1178m    11.72n    2.61p
B08    D31          I/O2         1050m    9.73n     3.52p
B10    SMI#         Input1       948m     10.45n    2.19p
B15    NMI          Input1       1430m    13.12n    3.07p
B17    EADS#        Input1       1728m    14.78n    3.62p
C01    D11          I/O1         2440m    18.73n    4.93p
C02    D18          I/O2         1446m    13.21n    3.10p
C03    CLK          Clockbuffer  2486m    18.99n    5.02p
C06    D27          I/O2         964m     10.54n    2.21p
C07    D26          I/O2         892m     8.90n     3.09p
C08    D28          I/O2         752m     9.36n     1.82p
C09    D30          I/O2         728m     9.22n     1.78p
C10    SRESET       Input1       784m     9.54n     1.88p
C12    SMIACT#      Output1      1270m    12.23n    2.78p
C14    FERR#        Output1      1904m    15.76n    3.95p
C15    FLUSH#       Input1       1342m    11.26n    4.32p
C16    RESET        Input1       1480m    11.98n    4.70p
C17    BS16#        Input1       2756m    20.49n    5.52p
D01    D9           I/O1         2718m    18.47n    8.09p
D02    D13          I/O1         2100m    16.84n    4.31p
D03    D17          I/O2         1156m    11.60n    2.57p
D15    A20M#        Input1       1148m    11.56n    2.55p
D16    BS8#         Input1       3474m    22.43n    10.16p
D17    BOFF#        Input1       3452m    22.31n    10.10p
E03    D10          I/O1         2356m    16.57n    7.10p
E15    HOLD         Input1       1920m    15.85n    3.98p
F01    DP1          I/O1         2394m    16.77n    7.20p
F02    D8           I/O1         1864m    15.53n    3.87p
F03    D15          I/O1         858m     9.95n     2.02p
F15    KEN#         Input1       2404m    16.82n    7.23p
F16    RDY#         Input1       1804m    15.20n    3.76p
F17    BE3#         Output2      3336m    23.71n    6.58p
G03    D12          I/O1         1912m    14.24n    5.88p
G15    STPCLK#      Input1       3930m    27.01n    7.68p
H02    D3           I/O1         1708m    14.67n    3.59p
H03    DP2          I/O1         928m     9.09n     3.19p
H15    BRDY#        Input1       2134m    17.03n    4.37p
J02    D5           I/O1         1528m    13.67n    3.25p
J03    D16          I/O1         1614m    14.14n    3.41p
J15    BE2#         Output2      848m     8.67n     2.97p
J16    BE1#         Output2      1002m    10.75n    2.28p
J17    PCD          Output2      2266m    17.77n    4.61p
K03    D14          I/O1         1160m    10.30n    3.83p
K15    BE0#         Output2      954m     9.22n     3.26p
L02    D6           I/O1         1432m    11.73n    4.57p
L03    D7           I/O1         1048m    11.00n    2.37p
L15    PWT          Output2      1548m    12.34n    4.89p
M03    D4           I/O1         1000m    9.47n     3.39p
M15    D/C#         Output2      1442m    13.19n    3.10p
N01    D2           I/O1         1448m    11.81n    4.61p
N02    D1           I/O1         1198m    11.84n    2.65p
N03    DP0          I/O1         1038m    10.95n    2.35p
N15    LOCK#        Output1      1118m    10.09n    3.71p
N16    M/IO#        Output2      1744m    14.87n    3.65p
N17    W/R#         Output2      1676m    13.01n    5.24p
P01    D0           I/O1         1532m    12.25n    4.84p
P02    A29          I/O3         1292m    12.36n    2.82p
P03    A30          I/O3         1194m    10.48n    3.92p
P15    HLDA         Output1      1568m    13.89n    3.33p
Q01    A31          I/O3         1608m    14.11n    3.40p
Q03    A17          I/O3         2016m    16.38n    4.15p
Q04    A19          I/O3         1584m    12.53n    4.99p
Q05    A21          I/O3         956m     10.49n    2.20p
Q06    A24          I/O3         920m     9.05n     3.17p
Q07    A22          I/O3         894m     8.91n     3.10p
Q08    A20          I/O3         788m     9.56n     1.89p
Q09    A16          I/O3         850m     8.68n     2.98p
Q10    A13          I/O3         836m     9.82n     1.98p
Q11    A9           I/O3         914m     9.02n     3.15p
Q12    A5           I/O3         966m     9.29n     3.29p
Q13    A7           I/O3         10S4m    11.20n    2.44p
Q14    A2           Output1      1134m    11.4Sn    2.S3p
Q15    BREQ         Output2      1S72m    1S.SSn    3.S9p
Q16    PLOCK#       Output1      1630m    14.23n    3.44p
Q17    PCHK#        Output2      1616m    12.69n    S.07p
R01    A28          I/O3         170Sm    14.67n    3.S9p
R02    A25          I/O3         1604m    12.63n    S.04p
R05    A18          I/O3         149Sm    13.S0n    3.20p
R07    A15          I/O3         114Sm    11.S6n    2.SSp
R12    A11          I/O3         1274m    10.90n    4.14p
R13    A8           I/O3         12S2m    12.13n    2.7Sp
R15    A3           Output1      1S04m    12.11n    4.77p
R16    BLAST#       Output1      169Sm    14.61n    3.S7p
S01    A27          I/O3         1900m    14.1Sn    S.SSp
S02    A26          I/O3         1S00m    1S.1Sn    3.7Sp
S03    A23          I/O3         17S6m    14.93n    3.67p
S05    A14          I/O3         1842m    13.SSn    S.69p
S07    A12          I/O3         1S30m    12.24n    4.S4p
S13    A10          I/O3         1444m    13.20n    3.10p
S15    A6           I/O3         1722m    13.2Sn    S.36p
S16    A4           I/O3         1792m    1S.13n    3.74p
S17    ADS#         Output1      1SSSm    14.12n    S.S2p
|
|************************************************************************
|
[Model]         Output1
Model_type      Output
Polarity        Non-Inverting
Enable          Active-Low
|signals ADS#,BLAST#,SMIACT#,A<3:2>,FERR#,HLDA,PLOCK#,LOCK#
|
|               typ       min      max
C_comp          3.05pF    2.9pF    3.2pF
[Voltage range] 3.3V      3.0V     3.6V
|
|************************************************************************
|
[Pulldown]
| voltage   I(typ)      I(min)      I(max)
-5.0V       -960.0mA    -580.0mA    -1410mA
-2.0V       -190.0mA    -99.0mA     -292.0mA
-1.0V       -21.0mA     -16.7mA     -27.1mA
-0.5V       -13.30mA    -8.7mA      -20.5mA
0.0V        0.0         0.0         0.0
0.5V        12.9mA      8.3mA       20.1mA
1.0V        23.7mA      15.1mA      37.2mA
1.5V        31.5mA      19.7mA      49.7mA
2.0V        35.4mA      21.6mA      56.6mA
2.5V        36.3mA      22.0mA      58.6mA
3.0V        36.6mA      22.2mA      59.0mA
3.5V        36.9mA      22.3mA      59.4mA
4.0V        37.0mA      22.4mA      59.7mA
4.5V        37.1mA      22.5mA      59.8mA
5.0V        37.2mA      22.5mA      60.0mA
6.0V        37.2mA      22.5mA      60.0mA
10.0V       37.2mA      22.5mA      60.0mA
|
|************************************************************************
| Note that the pullup voltage in the data table is derived from
| the equation:
|     Vtable = Vcc - Voutput
| For the 8.3V in the table, it is actually 8.3V below Vcc and -5V
| with respect to Ground.
|************************************************************************
|
[Pullup]
| voltage   I(typ)      I(min)       I(max)
8.3V        -1410mA     -976.25mA    -1992.8mA
4.3V        -79.8mA     -55.6mA      -229.06mA
3.3V        -47.1mA     -40.80mA     -72.66mA
2.8V        -46.1mA     -29.42mA     -70.96mA
2.3V        -44.8mA     -28.64mA     -68.34mA
1.8V        -42.1mA     -27.14mA     -61.94mA
1.3V        -35.1mA     -23.34mA     -50.34mA
0.8V        -24.1mA     -16.08mA     -33.08mA
0.3V        -9.5mA      -6.3mA       -12.86mA
-0.2V       6.5mA       4.76mA       9.02mA
-0.7V       24.3mA      17.3mA       33.04mA
-1.2V       41.7mA      30.48mA      55.6mA
-1.7V       56.4mA      42.26mA      74.14mA
-2.2V       67.3mA      51.26mA      88.4mA
-2.7V       74.6mA      56.96mA      96.58mA
-3.7V       79.8mA      61.33mA      104.5mA
-6.7V       82.4mA      63.55mA      109.5mA
|
[GND_clamp]
| Voltage   I(typ)      I(min)   I(max)
0.0V        0mA         NA       NA
-0.4V       0mA         NA       NA
-0.5V       -0.2mA      NA       NA
-0.6V       -1.1mA      NA       NA
-0.7V       -3.0mA      NA       NA
-0.8V       -6.0mA      NA       NA
-0.9V       -11.0mA     NA       NA
-1.0V       -30.0mA     NA       NA
-1.2V       -120.0mA    NA       NA
-2.0V       -180.0mA    NA       NA
-5.0V       -420.0mA    NA       NA
|
|************************************************************************
| The data in the following POWER_clamp table is listed as
| "Vcc-relative", meaning that the voltage values are referenced to
| the Vcc pin. The voltages in the data tables are derived from the
| equation:
|     Vtable = Vcc - Voutput
| In this case, assuming that Vcc is referenced to 3.3V, 0V in the
| table actually means 3.3V with respect to Ground and 0V above Vcc.
|************************************************************************
|
[POWER_clamp]
| voltage   I(typ)    I(min)   I(max)
0.0V        0mA       NA       NA
-0.4V       0mA       NA       NA
-0.5V       0mA       NA       NA
-0.6V       0mA       NA       NA
-0.7V       0.1mA     NA       NA
-0.8V       1.0mA     NA       NA
-0.9V       8.0mA     NA       NA
-1.0V       14.0mA    NA       NA
-2.0V       100mA     NA       NA
|
|************************************************************************
[Ramp]
|           typ            min            max
dV/dt_r     1.13/0.749n    0.93/0.868n    1.35/0.642n
dV/dt_f     0.99/0.447n    0.75/0.543n    1.27/0.387n
|
|************************************************************************
|
[Model]         Output2
Model_type      Output
Polarity        Non-Inverting
Enable          Active-Low
|signal BE<3:0>#,BREQ,D/C#,M/IO#,PWT,PCD,PCHK#,W/R#
|
|               typ      min      max
C_comp          2.7pF    2.7pF    2.7pF
[Voltage range] 3.3V     3.0V     3.6V
|
|************************************************************************
|
[Pulldown]
| voltage   I(typ)      I(min)      I(max)
-5.0V       -960.0mA    -580.0mA    -1410mA
-2.0V       -190.0mA    -99.0mA     -292.0mA
-1.0V       -21.0mA     -16.7mA     -27.1mA
-0.5V       -13.30mA    -8.7mA      -20.5mA
0.0V        0.0         0.0         0.0
0.5V        12.9mA      8.3mA       20.1mA
1.0V        23.7mA      15.1mA      37.2mA
1.5V        31.5mA      19.7mA      49.7mA
2.0V        35.4mA      21.6mA      56.6mA
2.5V        36.3mA      22.0mA      58.6mA
3.0V        36.6mA      22.2mA      59.0mA
3.5V        36.9mA      22.3mA      59.4mA
4.0V        37.0mA      22.4mA      59.7mA
4.5V        37.1mA      22.5mA      59.8mA
5.0V        37.2mA      22.5mA      60.0mA
6.0V        37.2mA      22.5mA      60.0mA
10.0V       37.2mA      22.5mA      60.0mA
|
|************************************************************************
| Note that the pullup voltage in the data table is derived from
| the equation:
|     Vtable = Vcc - Voutput
| For the 8.3V in the table, it is actually 8.3V below Vcc and -5V
| with respect to Ground.
|************************************************************************
|
[Pullup]
| Voltage   I(typ)      I(min)       I(max)
8.3V        -1410mA     -976.25mA    -1992.8mA
4.3V        -79.8mA     -55.6mA      -229.06mA
3.3V        -47.1mA     -40.80mA     -72.66mA
2.8V        -46.1mA     -29.42mA     -70.96mA
2.3V        -44.8mA     -28.64mA     -68.34mA
1.8V        -42.1mA     -27.14mA     -61.94mA
1.3V        -35.1mA     -23.34mA     -50.34mA
0.8V        -24.1mA     -16.08mA     -33.08mA
0.3V        -9.5mA      -6.3mA       -12.86mA
-0.2V       6.5mA       4.76mA       9.02mA
-0.7V       24.3mA      17.3mA       33.04mA
-1.2V       41.7mA      30.48mA      55.6mA
-1.7V       56.4mA      42.26mA      74.14mA
-2.2V       67.3mA      51.26mA      88.4mA
-2.7V       74.6mA      56.96mA      96.58mA
-3.7V       79.8mA      61.33mA      104.5mA
-6.7V       82.4mA      63.55mA      109.5mA
|
[GND_clamp]
| Voltage   I(typ)      I(min)   I(max)
0.0V        0mA         NA       NA
-0.4V       0mA         NA       NA
-0.5V       -0.2mA      NA       NA
-0.6V       -1.1mA      NA       NA
-0.7V       -3.0mA      NA       NA
-0.8V       -6.0mA      NA       NA
-0.9V       -11.0mA     NA       NA
-1.0V       -30.0mA     NA       NA
-1.2V       -120.0mA    NA       NA
-2.0V       -180.0mA    NA       NA
-5.0V       -420.0mA    NA       NA
|
|************************************************************************
| The data in the following POWER_clamp table is listed as
| "Vcc-relative", meaning that the voltage values are referenced to
| the Vcc pin. The voltages in the data tables are derived from the
| equation:
|     Vtable = Vcc - Voutput
| In this case, assuming that Vcc is referenced to 3.3V, 0V in the
| table actually means 3.3V with respect to Ground and 0V above Vcc.
|************************************************************************
|
[POWER_clamp]
| voltage   I(typ)    I(min)   I(max)
0.0V        0mA       NA       NA
-0.4V       0mA       NA       NA
-0.5V       0mA       NA       NA
-0.6V       0mA       NA       NA
-0.7V       0.1mA     NA       NA
-0.8V       1.0mA     NA       NA
-0.9V       8.0mA     NA       NA
-1.0V       14.0mA    NA       NA
-2.0V       100mA     NA       NA
|
[Ramp]
|           typ            min            max
dV/dt_r     1.13/0.749n    0.93/0.868n    1.35/0.642n
dV/dt_f     0.99/0.447n    0.75/0.543n    1.27/0.387n
|
|************************************************************************
|
[Model]         I/O1
Model_type      I/O
Polarity        Non-Inverting
Enable          Active-Low
Vinl = 0.8v
Vinh = 2.0v
|signal DBUS<16:0>,DP<2:0>
|
|               typ      min      max
C_comp          2.7pF    2.7pF    2.7pF
[Voltage range] 3.3V     3.0V     3.6V
|
|************************************************************************
|
[Pulldown]
| voltage   I(typ)      I(min)      I(max)
-5.0V       -960.0mA    -580.0mA    -1410mA
-2.0V       -190.0mA    -99.0mA     -292.0mA
-1.0V       -21.0mA     -16.7mA     -27.1mA
-0.5V       -13.30mA    -8.7mA      -20.5mA
0.0V        0.0         0.0         0.0
0.5V        12.9mA      8.3mA       20.1mA
1.0V        23.7mA      15.1mA      37.2mA
1.5V        31.5mA      19.7mA      49.7mA
2.0V        35.4mA      21.6mA      56.6mA
2.5V        36.3mA      22.0mA      58.6mA
3.0V        36.6mA      22.2mA      59.0mA
3.5V        36.9mA      22.3mA      59.4mA
4.0V        37.0mA      22.4mA      59.7mA
4.5V        37.1mA      22.5mA      59.8mA
5.0V        37.2mA      22.5mA      60.0mA
6.0V        37.2mA      22.5mA      60.0mA
10.0V       37.2mA      22.5mA      60.0mA
|
|************************************************************************
| Note that the pullup voltage in the data table is derived from
| the equation:
|     Vtable = Vcc - Voutput
| For the 8.3V in the table, it is actually 8.3V below Vcc and -5V
| with respect to Ground.
|************************************************************************
|
[Pullup]
| voltage   I(typ)      I(min)       I(max)
8.3V        -1410mA     -976.25mA    -1992.8mA
4.3V        -79.8mA     -55.6mA      -229.06mA
3.3V        -47.1mA     -40.80mA     -72.66mA
2.8V        -46.1mA     -29.42mA     -70.96mA
2.3V        -44.8mA     -28.64mA     -68.34mA
1.8V        -42.1mA     -27.14mA     -61.94mA
1.3V        -35.1mA     -23.34mA     -50.34mA
0.8V        -24.1mA     -16.08mA     -33.08mA
0.3V        -9.5mA      -6.3mA       -12.86mA
-0.2V       6.5mA       4.76mA       9.02mA
-0.7V       24.3mA      17.3mA       33.04mA
-1.2V       41.7mA      30.48mA      55.6mA
-1.7V       56.4mA      42.26mA      74.14mA
-2.2V       67.3mA      51.26mA      88.4mA
-2.7V       74.6mA      56.96mA      96.58mA
-3.7V       79.8mA      61.33mA      104.5mA
-6.7V       82.4mA      63.55mA      109.5mA
|
[GND_clamp]
| Voltage   I(typ)      I(min)   I(max)
0.0V        0mA         NA       NA
-0.4V       0mA         NA       NA
-0.5V       -0.2mA      NA       NA
-0.6V       -1.1mA      NA       NA
-0.7V       -3.0mA      NA       NA
-0.8V       -6.0mA      NA       NA
-0.9V       -11.0mA     NA       NA
-1.0V       -30.0mA     NA       NA
-1.2V       -120.0mA    NA       NA
-2.0V       -180.0mA    NA       NA
-5.0V       -420.0mA    NA       NA
|
|************************************************************************
| The data in the following POWER_clamp table is listed as
| "Vcc-relative", meaning that the voltage values are referenced to
| the Vcc pin. The voltages in the data tables are derived from the
| equation:
|     Vtable = Vcc - Voutput
| In this case, assuming that Vcc is referenced to 3.3V, 0V in the
| table actually means 3.3V with respect to Ground and 0V above Vcc.
|************************************************************************
|
[POWER_clamp]
| voltage   I(typ)    I(min)   I(max)
0.0V        0mA       NA       NA
-0.4V       0mA       NA       NA
-0.5V       0mA       NA       NA
-0.6V       0mA       NA       NA
-0.7V       0.1mA     NA       NA
-0.8V       1.0mA     NA       NA
-0.9V       8.0mA     NA       NA
-1.0V       14.0mA    NA       NA
-2.0V       100mA     NA       NA
|
[Ramp]
|           typ            min            max
dV/dt_r     1.13/0.749n    0.93/0.868n    1.35/0.642n
dV/dt_f     0.99/0.447n    0.75/0.543n    1.27/0.387n
|
|************************************************************************
|
[Model]         I/O2
Model_type      I/O
Polarity        Non-Inverting
Enable          Active-Low
Vinl = 3.3v
Vinh = 6.0v
|signal DBUS<31:17>,DP<3>
|
|               typ       min      max
C_comp          3.05pF    2.9pF    3.2pF
[Voltage range] 3.3V      3.0V     3.6V
|
|************************************************************************
|
[Pulldown]
| voltage   I(typ)      I(min)      I(max)
-5.0V       -960.0mA    -580.0mA    -1410mA
-2.0V       -190.0mA    -99.0mA     -292.0mA
-1.0V       -21.0mA     -16.7mA     -27.1mA
-0.5V       -13.30mA    -8.7mA      -20.5mA
0.0V        0.0         0.0         0.0
0.5V        12.9mA      8.3mA       20.1mA
1.0V        23.7mA      15.1mA      37.2mA
1.5V        31.5mA      19.7mA      49.7mA
2.0V        35.4mA      21.6mA      56.6mA
2.5V        36.3mA      22.0mA      58.6mA
3.0V        36.6mA      22.2mA      59.0mA
3.5V        36.9mA      22.3mA      59.4mA
4.0V        37.0mA      22.4mA      59.7mA
4.5V        37.1mA      22.5mA      59.8mA
5.0V        37.2mA      22.5mA      60.0mA
6.0V        37.2mA      22.5mA      60.0mA
10.0V       37.2mA      22.5mA      60.0mA
|
|************************************************************************
| Note that the pullup voltage in the data table is derived from
| the equation:
|     Vtable = Vcc - Voutput
| For the 8.3V in the table, it is actually 8.3V below Vcc and -5V
| with respect to Ground.
|************************************************************************
|
[Pullup]
| voltage   I(typ)      I(min)       I(max)
8.3V        -1410mA     -976.25mA    -1992.8mA
4.3V        -79.8mA     -55.6mA      -229.06mA
3.3V        -47.1mA     -40.80mA     -72.66mA
2.8V        -46.1mA     -29.42mA     -70.96mA
2.3V        -44.8mA     -28.64mA     -68.34mA
1.8V        -42.1mA     -27.14mA     -61.94mA
1.3V        -35.1mA     -23.34mA     -50.34mA
0.8V        -24.1mA     -16.08mA     -33.08mA
0.3V        -9.5mA      -6.3mA       -12.86mA
-0.2V       6.5mA       4.76mA       9.02mA
-0.7V       24.3mA      17.3mA       33.04mA
-1.2V       41.7mA      30.48mA      55.6mA
-1.7V       56.4mA      42.26mA      74.14mA
-2.2V       67.3mA      51.26mA      88.4mA
-2.7V       74.6mA      56.96mA      96.58mA
-3.7V       79.8mA      61.33mA      104.5mA
-6.7V       82.4mA      63.
SS.mA 109.SmA I [GND_clamp] I I Voltage I(typ) I(min) O.OV OmA NA NA -0.4V OmA NA NA -O.SV -0.2mA NA NA -0.6V -l.lmA NA NA -0.7V -3.0mA NA NA -O.BV -6.0mA NA NA -O.QV -ll.OmA NA NA -l.OV -30.0mA NA NA -1.2V -120.0mA NA NA -2.0V -lBO.OmA NA NA -S.OV -420.0mA NA NA I(max) ********************************************************************* The data in the following POWER_clamp table is listed as "Vcc-relative", meaning that the voltage values are referenced to the Vcc pin. The voltages in the data tables are derived from the equation: Vtable = Vcc - Voutput In this case, assuming that Vcc is referenced to 3.3V. OV in the table actually means 3.3V with respected to Ground and OV above Vcc. ********************************************************************* [POWER_clamp] I(typ) I(min) O.OV OmA NA NA -0.4V OmA NA NA -O.SV OmA NA NA -0.6V OmA NA NA -0.7V O.lmA NA NA -O.BV 1.OmA NA NA -0.9V B.OmA NA NA -1.0V 14.0mA NA NA -2.0V 100mA NA NA I voltage I (max) I 1*********************************************************************** I 242202-K2 I 2-377 Intel486TM PROCESSOR FAMILY [Ramp] typ min max dV/dt_r 1.l3/0.749n 0.93/0.868n 1.3s/0.642n dV/dt_f 0.99/0.447n 0.7s/0.s43n 1.27/0.387n I I 1*********************************************************************** I [Model] I/03 Model_type I/O Polarity Non-Inverting Enable Active-Low Vinl 0.8v Vinh = 2. Ov I I signal ABUS<3l:4> I I typ min max C_comp 2.9pF 2.9pF 2.9pF [Voltage range] 3.3V 3.0V 3.6V I I*************************************************~*** ******************* I . [Pulldown] I voltage I(min) I (max) I(typ) -960.0mA -s80.0mA -1410mA -190.0mA -99.0mA -292.0mA -21.0mA -16.7mA -27.1mA -13.30mA -8.7mA -20.smA 0.0 0.0 0.0 8.3mA 20.lmA 12.9mA 37.2mA 23.7mA ls.lmA 19.7mA 49.7mA 31. SmA s6.6mA 21.6mA 3s.4mA s8.6mA 36.'3mA 22.0mA 22.2mA s9.0mA 36.6mA s9.4mA 22.3mA 36.9mA s9.7mA 22.4mA 37.0mA s9.8mA 37.1mA 22.smA 22.smA 60.0mA 37.2mA 60.0mA 37.2mA 22. SmA 60.0mA 22.smA 37.2mA -s.OV -2.0V -1.0V -o.sv O.OV O.SV 1. OV 1. 
SV 2.0V 2.sV 3.0V 3.sV 4 .. 0V 4.SV s.OV 6.0V 10.0V I 1***************************************************************** I Note that the pullup voltage in the data table is derived from I the equation: I Vtable = vcc - Voutput I I For the 8.3V in the table. it is actually 8.3V below Vcc and -sv with respected to Ground. 1*********-******************************************************** I [Pullup] I voltage I 8.3V 4.3V 3.3V I(typ) I (min) I(max) -14l0mA -976.2smA -1992.8mA -79.8mA -ss.6mA -229.06mA -47.1mA -40.80mA -72.66mA 242202-K3 2-378 I Intel486TM PROCESSOR FAMILV 2.SV -46.1mA -29.42mA -70.96mA 2.3V -44.SmA -2S.64mA -6S.34mA 1.SV -42.1mA -27.14mA -61.94mA 1.3V -3S.1mA -23.34mA -50.34mA O.SV -24.1mA -16.0SmA -33.0SmA 0.3V -9.SmA - 6. 3mA -12.S6mA -0.2V 6. SmA 4.76mA 9.02mA -0.7V 24.3mA l7.3mA 33.04mA 41.7mA -1.2V 30.48mA SS.6mA -1.7V S6.4mA 42.26mA 74.l4mA -2.2V 67.3mA Sl.26mA S8.4mA -2.7V 74.6mA S6.96mA 96.S8mA -3.7V 79. SmA 6l.33mA 104.SmA -6.7V S2.4mA 63.SSmA 109.SmA I [GN.D_clamp] I I Voltage I(typ) I(min) O.OV OmA NA NA -0.4V OmA NA NA -O.SV -0.2mA NA NA -0.6V -l.lmA NA NA -0.7V -3.0mA NA NA -O.SV -6.0mA NA NA -0.9V -11.0mA NA NA -1.0V -30.0mA NA NA -1.2V -120.0mA NA NA -2.0V -lSO.OmA NA NA -S.OV -420.0mA NA NA I(max) I . 1********************************************************************* I I I I I I I I The data in the following POWER_clamp table is listed as "Vcc-relative", meaning that the voltage values are referenced to the Vcc pin. The voltages in the data tables are derived from the equation: Vtable = Vcc - Voutput In this case, assuming that Vcc is referenced to 3.3V. OV in the table actually means 3.3V with respected to Ground and OV above Vcc. 
|*********************************************************************
|
[POWER_clamp]
|
| Voltage    I(typ)     I(min)   I(max)
|
   0.0V      0mA        NA       NA
  -0.4V      0mA        NA       NA
  -0.5V      0mA        NA       NA
  -0.6V      0mA        NA       NA
  -0.7V      0.1mA      NA       NA
  -0.8V      1.0mA      NA       NA
  -0.9V      8.0mA      NA       NA
  -1.0V      14.0mA     NA       NA
  -2.0V      100mA      NA       NA
|
|*********************************************************************
|
[Ramp]
|            typ            min            max
dV/dt_r      1.13/0.749n    0.93/0.868n    1.35/0.642n
dV/dt_f      0.99/0.447n    0.75/0.543n    1.27/0.387n
|
|*********************************************************************
|
[Model] Input1
Model_type Input
Polarity Non-Inverting
Enable Active-Low
Vinl = 0.8v
Vinh = 2.0v
|signal A20M#,AHOLD,BOFF#,BRDY#,BS16#,BS8#,FLUSH#,
|       HOLD,IGNNE#,INTR,KEN#,NMI,RDY#,RESET,SRESET,SMI#,STPCLK#
|
|                  typ      min      max
C_comp             2.0pF    2.0pF    2.0pF
[Voltage range]    3.3V     3.0V     3.6V
|
|*********************************************************************
|
[GND_clamp]
|
| Voltage    I(typ)      I(min)   I(max)
|
   0.0V      0mA         NA       NA
  -0.4V      0mA         NA       NA
  -0.5V      -0.2mA      NA       NA
  -0.6V      -1.1mA      NA       NA
  -0.7V      -3.0mA      NA       NA
  -0.8V      -6.0mA      NA       NA
  -0.9V      -11.0mA     NA       NA
  -1.0V      -30.0mA     NA       NA
  -1.2V      -120.0mA    NA       NA
  -2.0V      -180.0mA    NA       NA
  -5.0V      -420.0mA    NA       NA
|
|*********************************************************************
| The data in the following POWER_clamp table is listed
| as "Vcc-relative", meaning that the voltage values are
| referenced to the Vcc pin. The voltages in the data tables
| are derived from the equation:
|     Vtable = Vcc - Voutput
| In this case, assuming that Vcc is 3.3V, 0V in the table
| actually means 3.3V with respect to Ground and 0V above Vcc.
|*********************************************************************
|
[Model] Clockbuffer
Model_type Input
Polarity Non-Inverting
Enable Active-Low
Vinl = 0.8V
Vinh = 2.0V
|signal CLK
|
|                  typ      min      max
C_comp             2.0pF    2.0pF    2.0pF
[Voltage range]    3.3V     3.0V     3.6V
|
|*********************************************************************
|
[GND_clamp]
|
| Voltage    I(typ)      I(min)   I(max)
|
   0.0V      0mA         NA       NA
  -0.4V      0mA         NA       NA
  -0.5V      -0.2mA      NA       NA
  -0.6V      -1.1mA      NA       NA
  -0.7V      -3.0mA      NA       NA
  -0.8V      -6.0mA      NA       NA
  -0.9V      -11.0mA     NA       NA
  -1.0V      -30.0mA     NA       NA
  -1.2V      -120.0mA    NA       NA
  -2.0V      -180.0mA    NA       NA
  -5.0V      -420.0mA    NA       NA
|
|*********************************************************************
| The data in the following POWER_clamp table is listed
| as "Vcc-relative", meaning that the voltage values are
| referenced to the Vcc pin. The voltages in the data tables
| are derived from the equation:
|     Vtable = Vcc - Voutput
| In this case, assuming that Vcc is 3.3V, 0V in the table
| actually means 3.3V with respect to Ground and 0V above Vcc.
|*********************************************************************
|
[POWER_clamp]
|
| Voltage    I(typ)     I(min)   I(max)
|
   0.0V      0mA        NA       NA
  -0.4V      0mA        NA       NA
  -0.5V      0mA        NA       NA
  -0.6V      0mA        NA       NA
  -0.7V      0.1mA      NA       NA
  -0.8V      1.0mA      NA       NA
  -0.9V      8.0mA      NA       NA
  -1.0V      14.0mA     NA       NA
  -2.0V      100mA      NA       NA
|
|*********************************************************************
|
[End]

APPENDIX D
BSDL LISTINGS

Below is a listing of a boundary scan description language (BSDL) file for the IntelDX4 processor. This file is provided as an example. Contact Intel for design information for this and other Intel486 processors. See section 11.5, "Intel486 Processor Boundary Scan," for a complete description of BSDL instructions and usage.

IntelDX4 Processor Listing

-- Copyright Intel Corporation 1993
--****************************************************************************
-- Intel Corporation makes no warranty for the use of its products and assumes
-- no responsibility for any errors which may appear in this document nor does
-- it make a commitment to update the information contained herein.
--*********.******.**.*****.*****.********~ •• **.**.*.* **** •••••• *** •• ** •• ****. Boundary-Scan Description Language (BSDL Version 0.01 is a de-facto standard means of describing essential features of ANSI/IEEE 1149.1-1990 compliant devices. This language is under consideration by the IEEE for formal inclusion within a supplement to the 1149.1-1990 standard. The generation of the supplement entails an extensive IEEE review and a formal acceptance balloting procedure which may change the resultant form of the language. Be aware that this process may extend well into 1993, and at this time the IEEE does not endorse or hold an opinion on the language. --*.**** •• * ••• ****.** •••• ************ •••• ***** ••• ** ••••• *** ••• ** ••••••••• ***** IntelDX4(tm) processor BSDL description This file has been electrically verified. Rev: 1.2 09/27/93 entity IntelDX4 is string := "PGA_17x17"); port (A20M ABUS2 ABUS3 ABUS ADS AHOLD BE BLAST BOFF BRDY BREQ BSS BS16 CLK in out out inout out in out out in in out in in in bit; bit; bit; bit_vector (4 to 31); bit; bit; bit_vector (0 to 3); bit; bit; bit; bit; bit; bit; bit; -- Address bus (words) 242202-K7 2-382 I _I I-n+'eI® Intel486™ PROCESSOR FAMILV CLKMUL DBUS DC DP EADS FERR FLUSH HLDA HOLD IGNNE INC_PGA INTR KEN LOCK MIO NC PGA NC_SQFP NMI PCD PCHK PLOCK PWT RDY RESET SMI SMIACT SRESET STPCLK TCK, TMS, TDI TDO UP VCC_PGA VCC_SQFP VCC5 VOLDET VSS PGA VSS_SQFP WR bit; in inout bit_vector (0 to 31) ; - - Data bus bit; out inout bit_vector ( a to 3) ; in bit; out bit; in bit; out bit; bit; in in bit; linkage bit_vector (1 to 5); -- Internal NC PGA in bi t; in bit; out bit; out bit; linkage bit; -- No Connect for PGA linkage bit_vector (1 to 7); -- NC SQFP in bit; out bit; out bit; out bit; out bit; in bit; in bit; in bit; out bit; bit; in in bit; in bit; Scan Port inputs Scan Port output out bit; in bit; linkage bit_vector (l to 23); -- vee linkage bit_vector (1 to 53); -- VCC linkage bit; -- Reference Voltage linkage bit; 
-- Voltage Detect Pin, PGA only linkage bit_vector (1 to 28); VSS linkage bit_vector (1 to 38); -- VSS out bit); attribute PIN_MAP of IntelDX4 : entity is PHYSICAL_PIN_MAP; -- Define Pin Out of PGA "A20M "ABUS2 "ABUS3 "ABUS "ADS "AHOLD "BE "BLAST "BOFF D15, "I< Q14, "I< R15, "I< (S16,Q12, S15, Q13, R13, Q11, S13, R12, "I< S07,Q10,SOS,R07,Q09,Q03,R05,Q04,Q08,Q05,"1< Q07, S03, Q06, R02, S02, Sal, R01, P02, P03, Q01l ,"I< S17, "I< A17, "I< (K15,J16,Jl5,F17), "I< R16, "I< D17, "I< 242202-K8 I 2-383 intel® Intel486TM PROCESSOR FAMILY "BRDY "BREQ "BS8 "BSI6 "CLK "CLKMUL "DBUS "DC "DP "EADS "FERR "FLUSH "HLDA "HOLD "IGNNE " INC_PGA "INTR "KEN "LOCK "MIO "NC_PGA "NMI "PCD "PCHK "PLOCK "PWT "ROY "RESET "SMI "SMIACT "SRESET "STPCLK "TCK "TDI "TOO "TMS "UP "VCC_PGA HIS, "& Q15, "& 016, "& C17, "& C03, "& R17, "& (P01,N02,N01,H02,M03,J02,L02,L03,F02,DOl,E03, "& C01,G03,D02,K03,F03,J03,D03,C02,BOl,AOl,B02,"& A02,A04,A06,B06,C07,C06,C08,A08,C09,B08)," & MIS, "& (N03,FOl,H03,AOS),"& B17, "& C14, "& CIS, "& PlS, "& E1S, "& A1S, "& (Al0,AI2,A13,B12,B13), "& A16, "& F1S, "& N15, "& N16, "& Cl3, "& B15, "& J17, "& Q17, "& Q16, "& L1S, "& F16, "& C16, "& B10, "& C12, "& Cl0, "& G1S, "& A03, "& A14, "& B16, "& B14, "& Cll, "& (B07,B09,Bl1,C04,COS,E2,E16,G02,G16,H16,K02,"& K16, L16, M02 ,M16, P16, R03, R06, R08 ,R09, RIO, Rll, "& R14) ,"& "VCCS "VOLDET "VSS_PGA "WR JOl, "& S04, "& (A07,A09,Al1,B03,B04;BOS,E01,E17,G01,G17,H01,H17,"& K01,K17,L01,L17,M01,M17,PI7,Q02,R04,S06,SOa,S09,"& S10,S11,S12,S14),"& N17 ". -- Define Pin Out of SQFP "A20M "ABUS2 "ABUSJ 47, "& 202" "& 197, "& 242202-K9 2-384 I _I •-n+'eI® "ABUS "ADS "AHOLD "BE "BLAST "BOFF "BRDY "BREQ "BS8 "BS16 "CLK "CLKMUL "DBUS "DC "DP "EADS "FERR "FLUSH "HLDA "HOLD "IGNNE "INTR "KEN "LOCK "MIO "NC_SQFP "NMI "PCD "PCHK "PLOCK "PWT "ROY "RESET "SMI "SMIACT "SRESET "STPCLK "TCK "TDI "TDO "TMS "UP "VCC_SQFP ·VCC5 ·VSS_SQFP ·WR Intel486TM PROCESSOR FAMILY ( 19 6. 195. 193 . 192. 190, 187 , 186. 182 • 180, 178. 
"& 177,174.173,171.166.165.164.161.160.159. "& 158.154.153.152.151.149.148.147)." & 203, "& 17. "& (31.32.33.34). "& 204. "& "& 6• "& 5. "& 30. "& 8. 7• "& 24. "& 11. "& (144.143.142.141.140.130.129.126.124.123.119."& 118.117.116.113.112.108.103.101.100.99.93."& 92.91.87.85.84.83.79.78.75.74) ."& 39. "& (145.125.109.90). "& "& 46. "& 66. "& 49. "& 26. "& 16. 72. "& "& 50. "& 13. 207. "& 37. "& (63 • 64. 67 • 70.71 • 96. 127) • "& 51. "& 41. "& "& 4• 206. "& "& 40. "& 12. "& 48. 65. "& "& 59. "& 58. "& 73. "& 18. 168. "& "& 68. 167. "& 194. "& (2.9.14.19.20.22.23.25.29.35.38.42.44.45.54."& 56.60.62.69.77.80.82.86.89.95.98.102.106.111."& 114.121.128.131.133.134.136.137.139.150,155."& 162.163.169.172.176.179.183.185.188.191.198. "& 200.205). "& 3. "& (1.10.15.21.28.36.43.52.53.55.57.61.76.81.88.94."& 97.104.105.107.110.115.120.122.132.135.138.146."& 156.157.170.175.~81,184.189.199.201.208). "& 27 •. 242202-LO I 2-385 Intel486™ PROCESSOR FAMIL V attribute Tap_Scan_In of attribute Tap_Scan_Out of attribute Tap_Scan_Mode of TDI TDO TMS attribute Tap_Scan_Clock of TCK signal is true: signal is true; signal is true; signal is 125.0e6, BOTH) ; attribute Instruction_Length of IntelDX4 entity is 4; attribute Ins truct_ion_Opcode of IntelDX4 entity is "BYPASS "EXTEST "SAMPLE "IDCODE "RUNBIST "PRIVATE (lllll," & (0000)," & (0001),"& (0010)," & (1000)," & 10011,0100,0101,0110,0111,1001,1010,1011,1100,1101,1110) "; attribute Instruction_Capture of IntelDX4 -- there 15 : entity is "0001-; no Instruct10n Dlsable attribute for IntelDX4 attribute Instiuction_Private of IntelDX4 attribute Idcode_Register of IntelDX4: "0000" "1000001010001000"& "00000001001" & : entity is "private"; entity is & --version --new part number --manufacturers identity --required by the standard "1" ; attribute Instruction_Usage of IntelDX4 : entity is "RUNBIST Iregisters BrST; "& "result 0;" & "clock CLK in Run_Test_Idle;"& "length 1600000)"; attribute Register_Access of IntelDX4 "BIST[l] entity is 
IRUNBIST) "; --{**************************************************.*****************} --( The firsi cell is closest to TDO attribute Boundary_Length of IntelDX4 : entity is 109; a t tribute Boundary_Cells of IntelDX4 : enti ty is "BC_2, BC_l, BC_6"; attribute Boundary_Register of IntelDX4 : entity is "0 IBC_2, ABUS2, output3, IBC_2, ABUS3, output3, "1 IBC_6, ABUS (4) , bidir, "2 IBC_6, ABUS(5) , bidir, "3 IBC_1, UP, input, "4 IBC_6, ABUS(6) , bidir, "5 IBC_6, ABUS(7) , bidir, "6 18C_6, "7 ABUS I 8) , bidir, X, X, X, X, X), X, X, X, 107, 107, 107, 107, 1, 1, 107, 1, 107, 107, 1, 1, 1, " 1, Z) Z) Z) Z) & Z) Z) Z) ,"& ,"& ,"& ,"& , "& , "& , "& 242202-L1 2-386 I Intel486TM PROCESSOR FAMILY "8 "9 "10 "11 "12 "13 "14 "15 "16 "17 "18 "19 "20 "21 "22 "23 "24 "25 "26 " 27 "28 "29 "30 "31 "32 "33 "34 "35 "36 "37 "38 "39 "40 "41 "42 "43 "44 "45 "46 "47 "48 "49 "50 "51 "52 "53 "54 "55 "56 "57 "58 "59 "50 "61 "62 "63 IBC_5, IBC_5, IBC_6, IBC_6, IBC_6, IBC_5, IBC_6, IBC_5, IBC_6, IBC_5, IBC_6, IBC_6, IBC_6, IBC_6, IBC_6, IBC_6, IBC_6, IBC_5, IBC_6, IBC_6, IBC_6, IBC_5, IBC_6, IBC_6, IBC_5, IBC_6, IBC_5, IBC_6, IBC_5, IBC_6, IBC_5, IBC_5, IBC_6, IBC_5, IBC_5, IBC_6, IBC_6, IBC_6, IBC_6, IBC_6, IBC_5, \3C_6, IBC_6, IBC_5, IBC_6, IBC_5, IBC_6, IBC_6, IBC_6, IBC_5, IBC_5, IBC_6, iBC_5, IBC_5, IBe6, ABUS (9) , ABUS (10) ASUS 1 11) ABUS (12) ABUS 113) ABUS I 14) ABUS(15) ABUS I 16) ABUS(17) , , , , , , , , ABUSI18)/ ABUS I 191 , ABUS 120 I , ABUS (21) , ABUS 122) , ASUS (23) , ABUS (24) , ABUS 1 2 5) , ABUS 1 26) , ABUS 127) , ABUS(28) , ABUSI291, ABUS(30) , ABUS 131) , DPIO) , DBUS (0) , DBUS 111 , DBUS (2) , DBUS I 3) , DBUS (4) , DBUS(5) , DBUS(6) , DBUS (71 , DPI1I, DBUS (81 , DBUS (91, DSUS I 10) DBUS (11) DBUS (12) DBUS (131 DSUS 1141 DBUS 115) DP (21, DBUS (16) DSUS 117) DBUS 118) DBUS(19) DBUS (20) DBUS I 21) DBUS(22) DBUS (23) DP (3) , DBUS(24) DBUS 125) DBUS (25) DBUS (27) DBUS 128) , , , ; , , , , , , , , , , , , , , , bidir, bidir, bidir, bidir, bidir, x, bidir. X, bidir. 
bidir, bidir, bidir, X, bidir. X, bidir, bidir. bidir. bidir, bidir, bidir, bidir, bidir. bidir, bidir, bidir, bidir, bidir. bidir. bidir, bidir. bidir, bidir, bidir, bidir. bidir, bidir, bidir, bidir, bidir, bidir, bidir, bidir, bidir, bidir, bidir. bidir. bidir, bidir, bidir, bidir. bidir. bidir, bidir, bidir, bidir. bidir, bidir, bidir, bidir, X, X, X, X, X, X, X, X, X, X, X, X, X, X, X, X, X, X, X, X, x, X, X, X, X, X, X, X, X, X, X, X, X, X, X, X, X, X, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 108, 108, 108, 108, 108, lOB, lOB, lOB, lOB, lOB, lOB, lOB, lOB, lOB, lOB, lOB, lOB, 1, Z) , "&, 1, ZI , 1, Z) ,"& "& 1, Z) , "& 1, 1, ZI , 1, Z) , "& 1, ZI , ZI , Z) , "& "& "& 1, 1, 1, 1, 1, 1, Z) , "& 1, Z) , "& "& Z) , "& Z) , "& ZI , ZI , "& "& 1, ZI , 1, Z) , "& "& 1, Z) , "& 1, 1, Z) , "& 1, 1, 1, ZI , ZI , Z) , "& "& "& Z) , "& 1, 1, Z) , "& 1, 1, 1, Z) , "& 1, "& Z) , "& 1, ZI ZI ZI ZI , , , , 1. Z) , "& 1, Z) , "& "& "& "& 1, Z) , "& 1, 1, 1, Z) , "& 1, 1, 1, 1, Z) , "& ZI ZI ZI ZI ZI , "& , "& , "& , "& , "& 108, 1, Z) , "& 1, 1, Z) , "& X, lOB, lOB, X, 108, Z) ,"& X, X, lOB, 1, 1, 1. Z) , "& 1, ZI ZI ZI ZI ZI X, X, X, X, X, X, X, X, X, X, 108, 108, 108, 108, 108, 108, 108, 108, 108, 108, 108, 1, 1, 1, 1, 1, Z) , "& 21, "& , , , , "& "& "& "& ,"& Z) , "& 1. ZI , 1, 1, Z) , "& 1, ZI , "& Z) , "& "& 242202-L2 I 2-387 2-388 I Intel486™ PROCESSOR FAMILY APPENDIX E SYSTEM DESIGN NOTES SMM Environment Initialization When the Intel486 processors are operating in Real Mode, the physical address at which instructions and data are fetched is determined by the segment register and an offset (Le., CS and IP for instructions). When a new value is loaded into a segment register, the new value is shifted to the left by four bits and stored in a segment base register that corresponds to that particular segment (CSBASE, OS BASE, ESBASE, etc.). 
It is the value stored in the segment base register that is actually used to generate a physical address. For example, the linear address used for fetching instructions is determined by adding the value contained in the CS segment base register to the value in the IP register.

When the processor is in Protected Mode, the segment registers are used as selectors into a descriptor table. Each descriptor in a descriptor table contains information about the segment in use, including the segment's base address (i.e., CSBASE), the limit (or size of the segment), as well as protection level, privileges, operand sizes, and the segment type. In Protected Mode, the linear address is determined by adding the base portion of the descriptor to the appropriate offset.

When in System Management Mode, the processor operates in a pseudo-Real Mode, with address calculation performed in the Real Mode manner. However, the processor adds the value in the segment base register to the value in the EIP register, rather than the IP register, so there are no limits on the segment size. The physical address of an instruction is obtained by adding the value in CSBASE to the value in EIP.

When entering SMM, it may be necessary to initialize the segment registers to point to SMRAM (see section 8.4.2, "Processor Environment," for their value on SMM entry). If SMBASE has not been relocated, then the necessary segment registers can be initialized to point to SMRAM by using the value in the CS register, 3000H, which points to the SMRAM address space.

When an SMI# occurs after SMBASE has been modified, CSBASE is loaded with the new value of SMBASE. However, the CS selector register still contains the value 3000H, not the value corresponding to the new SMBASE. To initialize segment registers to point to the new SMRAM area, read the SMBASE value from the SMM state that was saved in memory.
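The address calculations described so far can be illustrated with a small model. This is only a sketch of the arithmetic, not processor documentation: the class and function names are hypothetical, and the relocated SMBASE value A0000H is an arbitrary example chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class SegmentRegister:
    """Hypothetical model of a segment register and its hidden base register."""
    selector: int = 0  # 16-bit value visible to software (e.g. CS)
    base: int = 0      # segment base register (e.g. CSBASE)

    def load_real_mode(self, selector: int) -> None:
        # Real Mode load: the base is the selector shifted left by four bits.
        self.selector = selector & 0xFFFF
        self.base = self.selector << 4


def real_mode_address(seg: SegmentRegister, ip: int) -> int:
    # Real Mode: segment base plus a 16-bit offset (IP).
    return seg.base + (ip & 0xFFFF)


def smm_address(seg: SegmentRegister, eip: int) -> int:
    # SMM: segment base plus the full 32-bit EIP, so no 64-Kbyte segment limit.
    return seg.base + (eip & 0xFFFFFFFF)


def smm_entry_cs(smbase: int) -> SegmentRegister:
    # On SMM entry the CS selector reads 3000H while CSBASE holds SMBASE.
    return SegmentRegister(selector=0x3000, base=smbase)


cs = SegmentRegister()
cs.load_real_mode(0x3000)
print(hex(cs.base))  # 0x30000

relocated = smm_entry_cs(0xA0000)  # assumed example value for a relocated SMBASE
print(relocated.base == relocated.selector << 4)  # False
```

The last line shows why the CS selector cannot be copied into the other segment registers after relocation: the selector still reads 3000H, but the base actually used for address generation is the new SMBASE.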
Because the data segment registers are initialized to 0, do not use them to access the SMM state save area. Instead, perform a read relative to the CS register by using a CS override prefix on a normal memory read. Although CS still contains 3000H, CSBASE contains the value of SMBASE, and CSBASE is used for the address generation. Once the value of SMBASE is obtained, it must be shifted to the right by four bits to get the appropriate value to be placed in the segment registers. The CS register itself can be initialized by executing a far jump instruction to an address within SMBASE, which causes CS to be reloaded with a value corresponding to SMBASE.

Example E-1 describes one method of initializing the segment registers when SMBASE has been relocated. This method works if SMBASE is less than 1 Megabyte.

Example E-1. Initialization of Segment Registers within SMM

    ;read the value of SMBASE from the state save area
            mov  si,0FEF8H      ;SMBASE slot in SMM state save area
            mov  eax,cs:[si]    ;copy SMBASE from SMBASE:FEF8H to eax

    ;scale the SMBASE value to a 16-bit quantity
            mov  cl,4
            ror  eax,cl         ;scaled value of SMBASE now in ax

    ;to load cs, execute a far jump to an address that has been stored
    ;at memory location PTR_ADDR

    ;store the SMBASE value and an offset to a memory location that can be used as
    ;an indirect jump address
            mov  di,PTR_ADDR    ;PTR_ADDR is the location used to
                                ;store the jump address
            mov  bx,OFFSET      ;OFFSET is the address where
                                ;execution continues after the
                                ;far jump
            mov  cs:[di],bx     ;store the offset for the far jump
            inc  di
            inc  di
            mov  cs:[di],ax     ;store the segment address for the
                                ;far jump, which is SMBASE
            mov  bx,PTR_ADDR    ;bx now contains the address of the
                                ;location holding the jump address

    ;initialize DS and ES with the correct address of SMBASE
            mov  ds,ax
            mov  es,ax

    ;execute a far jump instruction to load the CS register
            jmp  far [bx]       ;jump to address stored at memory
                                ;location pointed to by bx

    ;cs now contains the correct value of SMBASE, and execution continues from the
    ;address SMBASE:OFFSET

Accessing SMRAM

LOADING SMRAM WITH AN INITIAL SMI HANDLER

Under normal conditions, the SMRAM address space should only be accessible by the processor while it is in SMM. However, some provision must be made for providing the initial SMM interrupt handler routine.

Because System Management Mode must be transparent to all operating systems, the initial SMM handler must be loaded by the system BIOS. At some time during the power-on sequence, the system BIOS will need to move the SMM handler routine from the BIOS ROM to the SMRAM. The system designer must provide a hardware mechanism that allows access to SMRAM while SMIACT# from the processor is inactive. One method would be to provide an I/O port in the memory controller that forces memory cycles at a given address to be relocated to the SMRAM. Once the initial SMM handler has been loaded to SMRAM, the I/O port would be disabled to protect against accidental accesses to SMRAM.

The system memory controller must redirect any memory accesses that are not generated by the processor to normal system memory as if SMIACT# were inactive.

The system BIOS must provide an SMM handler at the address 38000H. If the system designer has chosen to take advantage of the SMRAM relocation feature of the processor, this handler must change the SMBASE register in the SMM state save. Next, the BIOS must move the full-featured SMM handler to the new address. An SMI# must be generated in order to change the SMBASE register before the BIOS passes control to the operating system.

It is not recommended to block bus control requests when in SMM, because the increased bus access latency could cause compatibility issues with some software or expansion hardware.
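The SMRAM access rules above can be summarized as a routing decision made by the memory controller for each memory cycle. The following is a minimal sketch under stated assumptions, not a description of any particular chipset: the function name, the parameter names, and the SMRAM window bounds (30000H to 3FFFFH) are all hypothetical, and the active-low SMIACT# pin is abstracted as a boolean.

```python
# Hypothetical SMRAM window, assumed for illustration only.
SMRAM_START = 0x30000
SMRAM_END = 0x3FFFF


def route_memory_cycle(address, smiact_active, from_processor,
                       load_port_enabled=False):
    """Return "SMRAM" or "SYSTEM" for a single memory cycle.

    smiact_active     -- True while the processor drives SMIACT# active
    from_processor    -- False for DMA or other bus-master cycles
    load_port_enabled -- True while the BIOS has opened the (assumed)
                         I/O port used to load the initial SMM handler
    """
    if not (SMRAM_START <= address <= SMRAM_END):
        return "SYSTEM"   # address is outside the SMRAM window
    if not from_processor:
        return "SYSTEM"   # treated as if SMIACT# were inactive
    if smiact_active or load_port_enabled:
        return "SMRAM"    # processor in SMM, or BIOS loading the handler
    return "SYSTEM"       # SMRAM stays hidden outside SMM


print(route_memory_cycle(0x38000, smiact_active=True, from_processor=True))
print(route_memory_cycle(0x38000, smiact_active=True, from_processor=False))
```

Checking the cycle source before SMIACT# is a design choice of this sketch: it keeps bus-master cycles out of SMRAM both while the processor is in SMM and while the load port is open.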
SMRAM HIDDEN FROM DMA AND BUS MASTERS In a system that allows DMA or other devices to take control of the system bus, care must be taken to ensure that only the master processor can access SMRAM. If an external bus master requests use of the system bus (by asserting HOLD or BOFF#) while the processor is executing an SMM handler routine, the processor would respond by passing control of the bus to the requesting device. The DMA accesses to the SMRAM area must be redirected to the correct address space when the initialization routine is loading SMRAM, as well as when the processor is in SMM. ACCESSING SYSTEM MEMORY FROM WITHIN SMM In order to enter a suspend state where power is removed from some or all of system memory, it is necessary for the processor to have access to the entire system address space from within SMM. Access to system memory from within SMM requires that the memory controller decode both SMIACT# and the processor address to determine accesses to SMRAM. Only those'memory addresses that are defined as being SMRAM space would be directed to SMRAM. If SMRAM is located at an address that overlays normal system memory address space (see section 8.6.2, "SMRAM Interface,".), the processor must have a method of accessing both SMRAM (for code reads) and system memory simultaneously. ClK \. . ._____1 ADS# \.. SMIACT#~,--_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ __ A31 .. A2===><====»-------- [fJ[g!!J~a~~OOW Order Number: 241261-001 2-405 Cache and Memory Design Considerations for the Intel486™ DX2 Microprocessor CONTENTS PAGE 1.0 INTRODUCTION ................... 2-409 2.0 THE HIGH-PERFORMANCE Intel486TM DX2 MICROPROCESSOR ................ 2-409 2.1 Internal Cache Hit Rates ......... 2-410 2.2 Bus Cycle Mix ................... 2-411 2.3 Bus Utilization ................... 2-412 2.4 Profiles of Some Applications .... 2-413 2.5 Wait States Explained ........... 2-414 2.5.1 The Ideal Zero Wait State Memory System ............... 2-414 2.5.2 Adding Wait States ......... 
2.5.3 Wait States and CPU Stalls
2.5.3.1 Delay Till First Ready of a Read
2.5.3.2 Wait States on Bursts
2.5.3.3 Write Wait States
3.0 MEMORY DESIGN OPTIMIZATION
3.1 Page Mode DRAM
3.2 Interleaving
3.3 Memory Read Performance Considerations
3.4 Memory Write Performance Considerations
3.5 Viability of Intel486 DX2 System without an External Cache
4.0 CACHE DESIGN OPTIMIZATION
4.1 Overall Effect of an External Cache on CPU Performance
4.2 Effect of Cache Size and Associativity
4.3 Improving the Performance of a Write-Through Cache
4.3.1 Memory Write Pipelining
4.3.2 External Write Buffers
4.3.3 Performance with an External Write-Through Cache
4.4 Write-Back Caches
4.4.1 Main Memory Controller Considerations
4.4.2 Write-Back Cycle
5.0 CONCLUSION

FIGURES
Figure 2.1 The 66MHz Intel486 DX2 CPU Internal Architecture
Figure 2.2 The Intel486 DX2 Microprocessor is More Sensitive to Memory Latency than Its Predecessor
Figure 2.3 There will be a Range of L1 Hit Rates for Different Applications/Operating Systems (%)
Figure 2.4 The Effect of L1 Cache Hits on the External Bus Cycle Mix for an Integer SPEC Benchmark
Figure 2.5 Definition of Bus Utilization
Figure 2.6 Memory Performance Notation
Figure 2.7 Intel486 DX2 CPU Performance Degradation as Wait States are Added - for the SPEC1 Application Trace (UNIX)
Figure 2.8 Intel486 DX CPU Performance Degradation as Wait States are Added - for the SPEC1 Application Trace (UNIX)
Figure 2.9 Intel486 DX2 CPU Performance Degradation as Wait States are Added - for the GCC Application Trace (UNIX)
Figure 2.10 Intel486 DX CPU Performance Degradation as Wait States are Added - for the GCC Application Trace (UNIX)
Figure 2.11 Intel486 DX2 CPU Performance Degradation as Wait States are Added - for the Pagemaker Application Trace (Windows)
Figure 2.12 Intel486 DX CPU Performance Degradation as Wait States are Added - for the Pagemaker Application Trace (Windows)
Figure 2.13 Intel486 DX2 CPU Performance Degradation as Wait States are Added - for the Turbo C Application Trace (DOS)
Figure 2.14 Intel486 DX CPU Performance Degradation as Wait States are Added - for the Turbo C Application Trace (DOS)
Figure 2.15 A Zero-Wait State Write for the Intel486 DX2 CPU
Figure 2.16 The Intel486 DX2 CPU's Write Buffers are More Heavily Used than the Intel486 DX CPU's
Figure 2.17 Reads Stalled as a Result of a Write Already In Progress
Figure 3.1 The Two Cacheless Intel486 DX2 CPU-Based Systems Considered
Figure 3.2 A Page Mode Burst Read (Page Hit on Lead-Off Cycle)
Figure 3.3 Burst Read Cycle from a Paged-Interleaved Memory System
Figure 3.4 Total Intel486 DX2 CPU Execution Time versus Memory Read Performance for SPEC1 (UNIX)
Figure 3.5 Total Intel486 DX2 CPU Execution Time versus Memory Read Performance for Pagemaker (Windows)
Figure 3.6 Total Intel486 DX2 CPU Execution Time versus Memory Read Performance for Turbo C (DOS)
Figure 3.7 Page Mode DRAMs Allow for Fast Back-to-Back Writes
Figure 3.8 Adding One Write Buffer to the Memory System
Figure 3.9 Pipelining the Writes to Memory
Figure 3.10 A Data Latch is Required for Write Buffering or Pipelining
Figure 3.11 Total Execution Time versus Write Performance - for SPEC1 (UNIX)
Figure 3.12 Total Execution Time versus Write Performance - for Pagemaker (Windows)
Figure 3.13 Total Execution Time versus Write Performance - for Turbo C (DOS)
Figure 3.14 Adding Write Buffers Does Not Improve Stalls Because of Reads on Busy Writes
Figure 4.1 Different Cache Architectures Discussed
Figure 4.2 Adding an External Cache Decreases Execution Time for SPEC1 (UNIX)
Figure 4.3 Adding an External Cache Decreases Execution Time for Pagemaker (Windows)
Figure 4.4 Adding an External Cache Decreases Execution Time for Turbo C (DOS)
Figure 4.5 Total Execution Time for the Intel486 DX2 CPU as a Function of Cache Size and Associativity - for SPEC1
Figure 4.6 L2 Read Hit Rate as a Function of Cache Size and Associativity (UNIX)
Figure 4.7 Adding External Write Buffers to an External Cache Reduces Execution Time
Figure 4.8 A Hierarchy of Caches and Write Buffers
Figure 4.9 Improving the Write Performance Benefits a Write-Through Cache - for SPEC1
Figure 4.10 Different Architectures will Affect CPU Performance with a Write-Back Cache
Figure 4.11 Different Implementations of the Write-Back Cycle
Figure 4.12 Intel486 DX2 CPU Total Execution Time with Different Cache Size, Associativity, Memory Speed and Write-Back Method - for SPEC1

TABLES
Table 2.1 Bus Profiles of UNIX Applications with a Zero Wait-State Memory System
Table 2.2 Bus Profiles of Windows Applications with a Zero Wait-State Memory System
Table 2.3 Bus Profiles of DOS Applications with a Zero Wait-State Memory System
Table 2.4 Number of Internal Clocks (millions) Needed to Complete Application Trace
Table 2.5 Total Execution Time in Seconds
Table 2.6 CPU Performance versus Total Execution Time
Table 2.7 Percentage of Total Execution Time Stalled under the Three Different Write Stall Conditions
Table 3.1 Page Hit Ratios
Table 3.2 Memory Systems used for the No-Cache System Test
Table 3.3 Memory Systems with Different Write Performances
Table 3.4 Stall Statistics for the Write Buffers for the SPEC1 (UNIX) Trace
Table 4.1 Memory Systems for the Write-Through Cache Test
Table 4.2 Memory Systems Used for Write-Through Cache Test
Table 4.3 Page Hit Ratios for a Write-Back Cache - for SPEC1
Table 4.4 Memory Systems used for Write-Back Cache Test

AP-469

1.0 INTRODUCTION

This section discusses CPU performance optimization techniques for the Intel486™ DX2 microprocessor. The reader should be familiar with the Intel486™ DX microprocessor as well as knowledgeable about memory systems and cache architectures.
For further reference, the reader is directed to the following documents and application notes (the corresponding Intel order number is shown in parentheses):

• Intel486™ DX2 Microprocessor Data Book (241245-001)
• Intel486™ DX Microprocessor Data Book (240440-004)
• Intel486™ DX Microprocessor Hardware Reference Manual (240552-001)
• Cache Tutorial 1991 (296543-002)
• AP447: A Memory Subsystem for the Intel486™ Family of Microprocessors including Second Level Cache (240799-002)
• 485TurboCache Module Intel486™ DX Microprocessor Cache Upgrade (240722-002)

2.0 The High-Performance Intel486™ DX2 Microprocessor

The Intel486 DX2 Microprocessor is functionally equivalent to the now-familiar Intel486™ DX Microprocessor. However, the Intel486 DX2 CPU's internal core runs at twice the frequency of its external bus. This architecture enables a very high level of performance while keeping system design straightforward.

The Intel486 DX2 CPU is partitioned such that the cache and write buffers operate at the full core speed, as illustrated in Figure 2.1. As a result, the processor is slowed to the external bus speed only on cache misses and when the write buffers are full. The Intel486 DX2 CPU's external bus interface is identical to that of its predecessor; i.e., all system cycles emanating from the CPU look exactly as they would coming from the Intel486 DX CPU. The Intel486 DX2 microprocessor includes additional features, such as JTAG boundary scan and power-down capability, that are not covered in this section.

Figure 2.1. The 66MHz Intel486™ DX2 CPU Internal Architecture (block diagram: 66 MHz Intel486™ DX microprocessor core; 33 MHz Intel486™ DX external bus)

MS-DOS, Windows, Word, Excel and Flight Simulator are trademarks of Microsoft Corporation. UNIX is a trademark of UNIX System Laboratories. Lotus, 123 are trademarks of Lotus Development Corporation. Turbo C is a trademark of Borland International.
AutoCAD is a trademark of AutoDesk, Inc.

Performance optimization for the Intel486 DX2 CPU is subtly different from that for the Intel486 DX CPU because of the difference in internal architecture. This is evident if you consider that external memory latencies now affect the full-speed core by twice as many CPU clocks (refer to Figure 2.2). In other words, the memory system should really be designed to satisfy the data throughput demands of a 50/66MHz CPU. The next few sections examine the situation in detail so that educated trade-offs can be made during system design. The discussions focus on CPU-cache-memory performance; I/O performance and other architectural issues are not addressed in this application note.

In an ideal system, all CPU cycles operate at zero wait states and the theoretical maximum performance of the CPU is achieved. However, short of spending a lot of money on SRAMs, a real system always falls short of the ideal zero wait states. There are many cache-memory designs that differ in both architecture and implementation. However, it may be impossible to design a system that performs better than all others across all applications; different applications generate different types of CPU bus activity, and the cache-memory system will perform differently in each case. A range of statistical parameters may be used to illustrate this point:

• Internal cache (L1) hit rate
• Number of prefetches
• Number of operand reads
• Number of operand writes
• Bus utilization (amount of time spent using the CPU bus)

These parameters are examined next and provide useful information for cache-memory design trade-offs.

Figure 2.2. The Intel486™ DX2 Microprocessor is More Sensitive to Memory Latency than Its Predecessor (plot axes: external bus speed; sensitivity to memory latency (internal core frequency). Points: Intel486™ DX CPU at 25 MHz/33 MHz; Intel486™ DX2 CPU at 50 MHz/66 MHz)

2.1 Internal Cache Hit Rates

The Intel486 DX2 CPU maintains the same unified code/data, four-way set-associative, 8K-byte internal cache as the Intel486 DX CPU. The internal cache (L1) hit rate is shown in Figure 2.3 for several operating systems and applications. These hit rates were obtained from instruction traces captured from specific applications. Note that the L1 hit rate for the Intel486 DX2 CPU will be almost identical to that for the Intel486 DX CPU; the 2X internal frequency does not significantly alter the cache miss statistics.

The DOS applications included AutoCAD, Lotus123, Excel, Turbo C, and Flight Simulator. Lotus123 had the highest hit rate at 99%, while AutoCAD had the lowest at 89.6%. The Windows 3.0 results included instructions executed while clipping an image, drawing a dialog box, executing Excel and executing Pagemaker. The category UNIX-iSPEC refers to applications within the UNIX SPEC benchmark suite that are mostly integer intensive, whereas UNIX-fSPEC refers to those that are floating-point intensive. The last category, UNIX-TP1, refers to the hit rates typical while running the TP1 transaction processing benchmark.

These results illustrate how strongly the nature of the software affects the bus characteristics. Single-threaded DOS applications typically have a higher hit rate than the multi-tasking UNIX and Windows benchmarks. The UNIX floating-point benchmarks show the lowest hit rates; this is partly due to the large data structures typical of floating-point applications, which do not fit well into the L1 cache. It is also due to the nature of the operations performed on those data structures; i.e., the application does not exhibit much temporal or spatial locality.

(Bar chart: L1 cache hit rate, 60% to 100%, for each application category.)
Figure 2.3.
There will be a Range of L1 Hit Rates for Different Applications/Operating Systems

TP1 is a UNIX multiuser, multithreaded application with heavy disk and I/O activity as well as computation. Its L1 cache hit rate is low because it has many active contexts due to multiple requests per user.

High L1 hit rates are typical of many prevalent and commonly used DOS benchmarks; some have hit rates approaching 100%. This is unfortunate, since such benchmarks fail to account properly for the more realistic external cache and memory demands of most DOS and Windows applications. These demands will persist with the advent of newer multi-tasking operating systems and applications based on graphical user interfaces. With these benchmarks, there is a danger of misrepresenting system performance for the Intel486 DX microprocessor. For the Intel486 DX2 microprocessor, this misrepresentation can be even more damaging: because the CPU core now runs twice as fast internally, each L1 cache miss incurs a penalty in core clocks that is twice the number of external clock cycles. In other words, the Intel486 DX2 CPU is twice as sensitive to wait states as the Intel486 DX CPU. To properly gauge external cache and memory performance, more demanding benchmarks (e.g. UNIX SPECmarks) or real application benchmarks should be used.

2.2 Bus Cycle Mix

Different applications cause the CPU to generate different numbers of reads, writes and instruction prefetches. The reads and prefetches are filtered by the internal L1 cache before reaching the external CPU bus. All writes propagate through to the external bus, since the internal cache follows a write-through protocol. This is shown in Figure 2.4 for an instruction trace captured from an integer SPEC benchmark.

(Diagram. Internal bus, number of cycle types: prefetches 49.5% (L1 hit rate 93.6%), reads 33.5% (L1 hit rate 93.0%), writes 17%. External bus, number of cycle types: reads/prefetches 24.5%, writes 75.5%.)
Figure 2.4.
The Effect of L1 Cache Hits on the External Bus Cycle Mix for an Integer SPEC Benchmark

As with the Intel486 DX CPU, the bus cycle mix for the Intel486 DX2 CPU consists mostly of write cycles. However, the exact ratio of reads to writes is again application dependent. For example, for Lotus123, which has an L1 hit rate of 99%, writes make up 99.5% of external bus cycles.

2.3 Bus Utilization

Bus utilization refers to the fraction of time that the CPU spends executing bus cycles for a given application. It is a measure of the amount of bus traffic generated by a particular application on the CPU's external bus. This metric is illustrated in Figure 2.5, where the CPU bus is busy for 75% of the twelve external clock cycles shown.

Bus utilization depends on the application and on the external cache/memory system. Different applications generate different amounts of bus traffic. For example, some applications may have small data structures that fit easily within the internal cache and therefore generate fewer external bus cycles.

The Intel486 DX2 CPU will have a higher bus utilization than the Intel486 DX CPU for the same application. This is because the CPU core now operates twice as fast and tries to generate twice as many bus cycles in the same amount of time. However, since the external CPU bus remains at the 1X frequency, it experiences heavy bus traffic as it tries to keep up with the 2X internal core. In other words, with faster internal core execution, less time is spent idling on the external bus.

Different external cache/memory systems also affect the bus utilization; faster cache/memory systems allow CPU cycles to complete sooner and therefore free up the bus more.

(Timing diagram: External Clock, ADS# and BRDY# over 12 clocks; bus busy 25%, idle 25%, busy 50%.)
Figure 2.5.
Definition of Bus Utilization

2.4 Profiles of Some Applications

The Intel486 DX2 CPU bus characteristics of some different applications and operating systems are shown in Tables 2.1 through 2.3. These results are derived from traces captured from the actual applications. These traces were subsequently used in a CPU-cache-memory simulator to extract the desired information. The results shown here assume an ideal zero wait-state memory system; i.e., all bus cycles complete in zero wait states.

Table 2.1. Bus Profiles of UNIX Applications with a Zero Wait-State Memory System

UNIX Applications                       SPEC1     GCC       UNIXMIX1
Total number of CPU clocks simulated    12.14M    12.54M    12.44M
Overall L1 hit rate                     91.4%     90.5%     94.3%
Prefetch hit rate                       93.6%     91.7%     94.8%
Read hit rate                           93.0%     90.3%     94.2%
Write hit rate                          82.0%     85.7%     93.5%
Number of external bus cycles           1.12M     0.882M    1.18M
% bus code prefetches                   14.5%     22.6%     10.2%
% bus data reads                        10.0%     20.0%     8%
% bus data writes                       75.5%     57.4%     81.8%
Bus utilization                         51.6%     46.4%     51.4%

Table 2.2. Bus Profiles of Windows Applications with a Zero Wait-State Memory System

Windows Applications                    Word      Excel-Calc  Pagemaker
Total number of CPU clocks simulated    27.02M    7.39M       30.06M
Overall L1 hit rate                     95.2%     78.4%       88.1%
Prefetch hit rate                       96.9%     76.0%       81.8%
Read hit rate                           98.0%     85.7%       95.2%
Write hit rate                          87.7%     69.7%       83.5%
Number of external bus cycles           2.43M     0.654M      3.04M
% bus code prefetches                   4.2%      27.9%       19.3%
% bus data reads                        3.4%      15.1%       6.7%
% bus data writes                       92.4%     57.0%       74%
Bus utilization                         40.9%     60.1%       56.0%

Table 2.3.
Bus Profiles of DOS Applications with a Zero Wait-State Memory System

DOS Applications                        Excel     Turbo C   AutoCAD
Total number of CPU clocks simulated    11.1M     13.9M     16.1M
Overall L1 hit rate                     98.2%     95.6%     89.3%
Prefetch hit rate                       98.8%     94.2%     87.7%
Read hit rate                           97.9%     98.1%     96.5%
Write hit rate                          97.4%     93.8%     81.3%
Number of external bus cycles           1.06M     1.18M     1.77M
% bus code prefetches                   2.1%      11.2%     11.8%
% bus data reads                        3.1%      3.0%      4.0%
% bus data writes                       94.8%     85.8%     84.2%
Bus utilization                         41.0%     40.6%     55.5%

The UNIX applications are described as:
• SPEC1: A mixture of integer SPEC benchmark suite programs running concurrently.
• GCC: SPEC benchmark suite GNU C compiler, compiling itself.
• UNIXMIX1: A mixture of UNIX utility programs like awk and grep, running concurrently.

The Microsoft Windows 3.0 applications are described as:
• Word: Microsoft Word for Windows converting a document for import (no VGA activity, includes kernel calls).
• Excel-Calc: Microsoft Excel for Windows running a calculation.
• Pagemaker: Pagemaker for Windows formatting a document (no VGA activity, includes kernel calls).

The three DOS applications shown are described as:
• Excel: Microsoft Excel (DOS version) recalculating a spreadsheet.
• Turbo C: Borland's Turbo C compiler compiling a large C program.
• AutoCAD: AutoDesk's AutoCAD program computing and displaying a drawing (the reason for its low hit rate).

Note that most DOS applications typically use the external bus much less than UNIX or Windows applications. On the other hand, heavy-duty DOS applications such as AutoCAD place a demand on the external bus similar to that of UNIX and Windows applications. Also, recall that the bus utilization numbers shown assume a zero wait-state memory system; more realistic external cache/memory systems with a finite number of wait states will experience a higher bus utilization.
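The way a write-through L1 cache skews the external bus mix toward writes can be checked with a little arithmetic. The following is a minimal sketch, not part of the simulator used for this note: it takes the internal cycle mix and L1 hit rates quoted for the integer SPEC trace in Figure 2.4, lets only read and prefetch misses reach the bus, and passes every write through. The function name and dictionary keys are illustrative assumptions.

```python
def external_bus_mix(internal_mix, hit_rate):
    """Return each cycle type's share of external bus cycles, in percent.

    internal_mix: percent of internal cycles by type.
    hit_rate: L1 hit rate (percent) for reads and prefetches. Writes are
    not listed because a write-through cache sends every write to the bus.
    """
    external = {}
    for kind, share in internal_mix.items():
        if kind == "writes":
            external[kind] = share  # write-through: all writes go out
        else:
            # only L1 misses generate an external cycle
            external[kind] = share * (100.0 - hit_rate[kind]) / 100.0
    total = sum(external.values())
    return {kind: 100.0 * count / total for kind, count in external.items()}


# Internal mix and L1 hit rates quoted for the integer SPEC trace (Figure 2.4).
mix = external_bus_mix(
    {"prefetches": 49.5, "reads": 33.5, "writes": 17.0},
    {"prefetches": 93.6, "reads": 93.0},
)
print({kind: round(share, 1) for kind, share in mix.items()})
```

With these inputs the write share comes out at about three quarters of all external cycles, in line with the 75.5% reported for SPEC1, even though writes are only 17% of the internal mix.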
Finally, note that since the L1 hit rate for the DOS applications is high, the external bus mix consists mostly of writes.

2.5 Wait States Explained

Now that we have considered the characteristics of L1 hit rates, bus cycle mix and utilization, we can examine the impact that the cache-memory system has on overall Intel486 DX2 CPU performance. To examine these effects, traces were captured from three applications (one from each operating environment) and were used in a CPU-cache-memory simulator to measure the CPU performance under different conditions. The traces used were SPEC1 (UNIX), Pagemaker (Windows) and Turbo C (DOS). The simulator is an accurate and convenient means of comparing performance while varying different parameters independently. This allows us to develop some heuristic rules to guide system design. Before continuing, the following convention is defined to denote memory performance:

Notation Convention: A memory system's performance is abbreviated as:
Lead-off clocks - Burst 2 - Burst 3 - Burst 4, Write clocks
For example, 3-1-2-1, 3 denotes a burst read with a three-clock lead-off followed by transfers of one, two and one clocks, and a three-clock write.
(Timing diagram: External Clock, ADS# and BRDY# for the burst read and the write.)
A zero-wait-state case corresponds to a 2-1-1-1, 2 memory system.
Figure 2.6. Memory Performance Notation

2.5.1 THE IDEAL ZERO WAIT STATE MEMORY SYSTEM

As a starting point, the ideal zero-wait-state memory system was characterized for three applications. For I/O cycles, it was assumed that a constant 8 wait states were required. Since the I/O instructions were a small portion of the instruction traces used, any inaccuracy due to this assumption will be insignificant.

The total execution times reported are as follows:

Table 2.4.
Number of Internal Clocks (millions) Needed to Complete Application Trace

                    SPEC1     Pagemaker   Turbo C
Intel486 DX CPU     10.76     26.68       13.34
Intel486 DX2 CPU    12.14     30.06       13.90

As an example, we can now compare the actual time required to complete the applications on a 66MHz Intel486 DX2 CPU and on a 33MHz Intel486 DX CPU. The number of clock cycles is multiplied by 15ns for the Intel486 DX2 CPU and by 30ns for the Intel486 DX CPU.

Table 2.5. Total Execution Time in Seconds

                        SPEC1     Pagemaker   Turbo C
Intel486 DX CPU         323 ms    800 ms      400 ms
Intel486 DX2 CPU        182 ms    451 ms      209 ms
Performance Increase    +77%      +77%        +91.4%

Note that although the Intel486 DX2 CPU's internal clock rate is twice that of the corresponding Intel486 DX CPU, the relative improvement is less than 100%. This is due to cache-miss reads and write cycles, which run at the external bus speed. Note that the improvement is much greater for Turbo C, which has a lower cache miss rate and low bus utilization.

2.5.2 ADDING WAIT STATES

As wait states are added to the ideal zero wait state memory system, performance degrades. Three memory parameters are of interest in characterizing the memory performance. They are:

• Number of wait states added to the first ready of a read (a.k.a. the lead-off cycle), e.g. one wait state with a zero wait-state burst = 3-1-1-1
• Number of wait states during the remainder of the burst cycle, e.g. a zero wait-state lead-off with a one wait-state burst = 2-2-2-2
• Number of wait states on a write cycle, e.g. a one wait-state write takes three clocks.

To examine the impact of adding wait states to each of the parameters above, three series of simulations are done in which wait states are added to each memory parameter separately while the other memory parameters are held constant:

                     Zero wait-states   1 wait-state   2 wait-states   3 wait-states
• Lead-off Series:   2-1-1-1,2          3-1-1-1,2      4-1-1-1,2       5-1-1-1,2 ...
• Burst Series:      2-1-1-1,2          2-2-2-2,2      2-3-3-3,2       2-4-4-4,2
• Write Series:      2-1-1-1,2          2-1-1-1,3      2-1-1-1,4       2-1-1-1,5 ...

As the number of wait states increases, the number of clocks needed to complete the application increases (and the CPU performance decreases). This series of measurements is used to separate out the dependency of the CPU performance on the different memory parameters. This information is useful for subsequent external cache/memory design trade-offs.

The number of clocks needed to complete the application relative to the zero wait-state case is referred to as the relative total execution time. This metric is used in the following graphs instead of its reciprocal, CPU performance, so that any inherent linear relationship between wait states and execution time can be recognized more easily. To translate the total execution time back to CPU performance, the following table is provided for convenience (100% refers to the zero wait-state case):

Table 2.6. CPU Performance versus Total Execution Time

Total Execution Time    CPU Performance

Figures 2.7 and 2.8 show the total execution time as wait states are added for the SPEC1 trace described earlier. As would be expected, as wait states are added, the relative total execution time for the Intel486 DX2 CPU increases faster than that for the Intel486 DX CPU. (However, note that the absolute total execution time for an Intel486 DX2 CPU will never be greater than that for an Intel486 DX CPU of the same external bus speed.) Also note that the order of importance of the memory parameters changes between the Intel486 DX and the Intel486 DX2 CPU. For the Intel486 DX CPU, burst performance is the most important, with read lead-off performance being slightly more important than writes. This conclusion was presented in Chapter 4 of the original Intel486 DX Microprocessor Hardware Reference Manual (Order Number 240552-001).
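The memory performance notation and the three wait-state series lend themselves to a short worked example. The sketch below is only an illustration under stated assumptions: the function names are invented here, the factor of two between bus clocks and core clocks is the clock-doubling ratio of the Intel486 DX2 CPU, and the performance conversion is simply the reciprocal relationship described above.

```python
ZERO_WAIT_READ = [2, 1, 1, 1]  # the 2-1-1-1, 2 zero wait-state system
ZERO_WAIT_WRITE = 2


def parse_notation(text):
    """Split 'lead-burst2-burst3-burst4, write' into clock counts."""
    read_part, write_part = text.split(",")
    read_clocks = [int(n) for n in read_part.strip().split("-")]
    return read_clocks, int(write_part)


def wait_states(text):
    """Wait states added to each transfer relative to 2-1-1-1, 2."""
    read_clocks, write_clocks = parse_notation(text)
    read_waits = [c - z for c, z in zip(read_clocks, ZERO_WAIT_READ)]
    return read_waits, write_clocks - ZERO_WAIT_WRITE


def cycle_clocks(text, core_multiplier=2):
    """Bus clocks and core clocks for one burst read and one write."""
    read_clocks, write_clocks = parse_notation(text)
    burst = sum(read_clocks)
    return {
        "burst_read_bus": burst,
        "write_bus": write_clocks,
        "burst_read_core": burst * core_multiplier,
        "write_core": write_clocks * core_multiplier,
    }


def performance_percent(relative_time_percent):
    """CPU performance as the reciprocal of relative total execution time."""
    return 100.0 * 100.0 / relative_time_percent


print(wait_states("3-1-2-1, 3"))
print(cycle_clocks("3-1-2-1, 3"))
print(performance_percent(125.0))
```

For the 3-1-2-1, 3 example, the burst read occupies seven bus clocks and the write three; seen from a clock-doubled core, these become fourteen and six clocks respectively.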
For the Intel486 DX2 CPU, however, while burst performance is still the most important, write performance becomes more important than read lead-off performance.

(Figures 2.7 through 2.11: plots of relative total execution time, 95% to 135%, versus number of wait states, 0 to 4.)

Figure 2.12. Intel486 DX CPU Performance Degradation as Wait States are Added - for the Pagemaker Application Trace (Windows)

Finally, the results for the Turbo C application under DOS are shown in Figures 2.13 and 2.14. The rate of performance degradation for the Intel486 DX2 CPU with the Turbo C application is lower than in the UNIX and Windows examples. This is due to the lower external bus utilization of the application. However, the degradation is still about twice what it is for the Intel486 DX CPU. Note that the write importance is about equal to the burst importance in this case. This can be attributed to the greater percentage of writes in the bus cycle mix for this application.

(Plot: relative total execution time, 95% to 135%, versus number of wait states, 0 to 4.)
Figure 2.13.
Intel486 DX2 CPU Performance Degradation as Wait States are Added - for the Turbo C Application Trace (DOS)

(Plot: relative total execution time, 95% to 135%, versus number of wait states, 0 to 4; curves for Read Lead-Off, Write Cycle and Bursts.)
Figure 2.14. Intel486 DX CPU Performance Degradation as Wait States are Added - for the Turbo C Application Trace (DOS)

2.5.3 WAIT STATES AND CPU STALLS

As the effective number of wait states increases, the CPU will stall program execution differently for each of the three memory parameters described above. The stall conditions for each memory parameter are elaborated below.

The main reason why the relative performance of the Intel486 DX2 CPU degrades faster than that of the Intel486 DX CPU is that every external wait state causes two internal clock delays. In fact, a zero wait state cycle on the external bus of the Intel486 DX2 CPU is already a two wait state cycle as experienced by the 2X-clock internal core, as shown in Fig. 2.15. The imaginary 2X-clock versions of the signals ADS# and BRDY# illustrate what the cycle might have looked like if the internal and external bus frequencies were equal. Extending this fact, a one wait-state cycle for the Intel486 DX2 CPU is actually equivalent to a four wait-state cycle for the 2X-clock internal CPU core.

The internal cycle start indication may also incur a synchronization penalty of one internal clock cycle if it becomes active in the wrong phase of the external clock (also shown in Figure 2.15).

2.5.3.1 Delay Till First Ready of a Read

Wait states incurred on the first ready of an external read (the lead-off cycle) affect both data reads and code prefetches. For data reads, the CPU's execution is stalled under most conditions; no other operation can happen in parallel until the first ready of the line fill is received. For code prefetches, execution is stalled if the processor is fetching code as a result of a code branch (which flushes the prefetch buffers).

For most applications, the read lead-off delay increases the execution time the least compared to the other parameters. This is because write cycles usually make up the dominant share of the bus cycles. However, there are exceptions; for example, with the GCC trace, only 57.4% of the bus cycles were writes.

2.5.3.2 Wait States on Bursts

Adding wait states to the burst cycle increases the execution time the most. The burst is usually the result of a cache line fill or a code prefetch. Adding wait states to this parameter ties up the bus for the longest periods of time compared to adding the same number of wait states to the other parameters. As a result, all subsequent external bus requests are stalled as the CPU waits for the burst cycle to complete. These include stalls while the CPU waits to do a read cycle. A longer burst cycle also reduces the rate at which the internal write buffers can be emptied, since the write buffers must also wait for the external bus to free up. This causes stalls as described below for write cycles.

(Timing diagram: external clock (T1, T2), internal clock (T1, T2, T2, T2), internal cycle start indication, ADS# and BRDY# with their core versions, for a zero wait state external cycle, including the phase alignment case.)
Figure 2.15.
A Zero-Wait State Write for the Intel486™ DX2 CPU

Finally, longer code prefetch bursts will slow down CPU execution if the prefetch was a result of the prefetch queue being flushed. This is especially so if the instruction required extends beyond the first dword of the burst, so that the CPU must wait for subsequent dwords before execution can start.

2.5.3.3 Write Wait States

There are three conditions under which a longer write cycle will stall CPU execution as additional wait states are added. These conditions are:

1. The write buffers are full and cannot accept any more writes.
2. A read cannot bypass the write buffers and must wait for them to be flushed.
3. A read bypasses the write buffers but must wait for an existing write cycle to complete.

Before these effects are elaborated, it is worthwhile to reexamine the operation of the internal write buffers. The Intel486 DX2 CPU uses the same four-deep write buffers as the Intel486 DX CPU. The write buffers can accept data writes from the execution core as fast as one per clock. Once a write request is buffered, the internal unit that generated the request is free to continue processing. When all write buffers are full, any subsequent write transfer will stall inside the processor until a write buffer becomes available.

The bus interface unit can reorder pending reads in front of buffered writes. This is done because pending reads can prevent an internal unit from continuing, whereas buffered writes need not have a detrimental effect on processing speed. Writes are propagated to the external bus in the same first-in-first-out order in which they are received from the internal unit. However, a subsequently generated read request (data or instruction) may be reordered in front of buffered writes. As a protection against reading invalid data (reading stale data from a location in main memory when the location has been modified in the write buffers), this reordering of reads will only occur if all buffered writes are internal cache hits. Because an external read will only be generated for a cache miss, and will only be reordered in front of buffered writes if all such writes are internal cache hits, any read generated on the external bus will never read a location that is about to be written by a buffered write. This reordering can only happen once for a given set of buffered writes, because the data returned by the read cycle could otherwise replace data about to be written from the write buffers.

The first condition that causes CPU stalls is when the write buffers are full. Write wait states decrease the rate at which the write buffers can be emptied. Since the Intel486 DX2 CPU runs at a 2X internal frequency, the likelihood of filling up the write buffers increases compared to the Intel486 DX CPU, as shown in Figure 2.16.

(Diagram: posted writes entering the write buffers.)
Figure 2.16. The Intel486 DX2 CPU's Write Buffers are More Heavily Used than the Intel486 DX CPU's

The second situation that degrades performance arises with reads that cannot bypass the write buffers, either because the buffered writes were cache misses or because a read reordering has already occurred. These reads are stalled until the write buffers are emptied. The more wait states required for writes on the external bus, the longer these stalls last.

Finally, reads that can bypass the write buffers may be stalled by a write already in progress on the external bus. This condition is illustrated in Fig. 2.17 for both the Intel486 DX and Intel486 DX2 CPUs. Note that in this example, although the Intel486 DX2 and Intel486 DX CPUs take the same amount of time to complete the instruction stream, the Intel486 DX2 CPU is stalled longer (waiting for the write to complete) relative to its own internal 2X clock.
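The read reordering rule described above can be captured in a toy model. This is a hedged sketch rather than a description of the actual bus interface unit: the class and method names are invented, and it assumes that "a given set of buffered writes" ends when the buffers drain completely, which the text does not state explicitly.

```python
from dataclasses import dataclass, field


@dataclass
class WriteBuffers:
    """Toy model of the four-deep internal write buffers."""
    depth: int = 4
    pending: list = field(default_factory=list)  # True = write hit the L1 cache
    reordered_once: bool = False

    def post_write(self, hit_l1):
        """Buffer a write; return False if the core must stall (buffers full)."""
        if len(self.pending) >= self.depth:
            return False
        self.pending.append(hit_l1)
        return True

    def issue_read(self):
        """Return True if an external read may go ahead of the buffered writes,
        False if it must wait for the buffers to be flushed."""
        if not self.pending:
            return True  # nothing buffered, nothing to wait for
        if all(self.pending) and not self.reordered_once:
            self.reordered_once = True  # only one reordering per set of writes
            return True
        return False

    def retire_write(self):
        """The oldest write completes on the external bus (FIFO order)."""
        self.pending.pop(0)
        if not self.pending:
            self.reordered_once = False  # assumption: a drained buffer ends the set


buffers = WriteBuffers()
buffers.post_write(hit_l1=True)
buffers.post_write(hit_l1=True)
first_read = buffers.issue_read()   # both buffered writes were L1 hits
second_read = buffers.issue_read()  # a reordering has already been used
print(first_read, second_read)
```

The model covers the first two stall conditions (buffers full, read unable to bypass); the third, a read waiting behind a write already in progress on the bus, depends on bus timing and is left out.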
Of the three conditions described above, stalls on write buffers full and stalls because of reads on busy writes dominate the increase in execution time as wait states are added to write cycles, as shown in Table 2.7 for the SPEC1 trace. (The results shown in Figure 2.7 assume that the read and burst cycles complete in zero wait states.) These effects are discussed again later, when the addition of external write buffers is considered.

(Diagram: internal execution streams of the Intel486™ DX and Intel486™ DX2 CPUs against the external bus. Each stream executes WRITE, INSTRUCTION X, INSTRUCTION Y and READ (CACHE MISS); the write is posted and runs on the external bus between ADS# and BRDY#. The external read bypasses any writes that may have been posted from previous instructions, but the core waits, marked (WAIT) or (BUSY), until the bus frees up and the read is (DONE).)
Figure 2.17. Reads Stalled as a Result of a Write Already In Progress

Table 2.7. Percentage of Total Execution Time Stalled under the Three Different Write Stall Conditions

                                          Intel486 DX CPU         Intel486 DX2 CPU
Write wait states                         0      1      2         0      1      2
Stalls on write buffers full              0.0%   0.1%   0.6%      1.0%   3.1%   6.5%
Reads cannot overtake writes              0.1%   0.1%   0.3%      0.3%   0.5%   0.5%
Stalls because of reads on busy writes    0.6%   1.3%   2.2%      2.4%   4.8%   7.5%

3.0 Memory Design Optimization

Some Intel486 DX2 CPU-based designs will include a memory system without an external cache. This section covers the design of such a cacheless memory system. Different memory architectures are discussed, and the benefits of improving write performance through the addition of external write buffers are also considered (see Figure 3.1). Main memory performance is also important for designs with an external cache, especially for applications with low external cache hit rates.
It is recommended that the performance impacts of design choices in a cacheless memory design are understood even if you have already specified an external cache in your design.

Figure 3.1. The Two Cacheless Intel486 DX2 CPU-Based Systems Considered

As discussed in the previous sections, the Intel486 DX2 Microprocessor requires a fast memory system for optimum performance. Memory systems that may have been adequate for Intel486 DX CPU designs running DOS applications may be suboptimal for the Intel486 DX2 CPU, especially running today's more demanding operating systems and applications.

Main memory page-mode operation and interleaving techniques are important for Intel486 DX2 CPU performance. These are commonly used in existing, well-designed Intel486 DX CPU memory systems. However, some systems still use non-interleaved memory designs borrowed from Intel386 DX systems. These will be less than optimal for a high performance Intel486 DX2 CPU workstation design.

3.1 Page Mode DRAM

Page-mode main memory controllers can be implemented in several fashions. Typical memory systems utilize paging for all accesses - during the beginning of a read, during read bursts and during write cycles; i.e. the RAS# line is held active after all accesses and only returned inactive during a page miss. Alternatively, paging may be used only for the burst portion of a read cycle; the RAS# line always returns inactive after the read or write cycle has been completed. This method is more commonly used in conjunction with a write-back external cache as discussed in Section 4.4.

A paged memory system also allows for faster back-to-back write cycles. As was true for the Intel486 DX CPU, the Intel486 DX2 CPU generates writes in strings of two, about 60%-70% of the time, and writes in strings of three about 40%-50% of the time. This bus characteristic accounts for a large page hit rate for writes; therefore, it is faster to perform the back-to-back write cycles in a fast page mode rather than performing a full RAS#-CAS# cycle for each write.

At this point, it is worthwhile to examine the page hit/miss ratio for the three applications considered in the previous section. These results are shown in Table 3.1 and assume no external cache and a page size of 8192 bytes.

Table 3.1. Page Hit Ratios

                   SPEC1    Pagemaker   Turbo C
Read Page Hit      31.2%     26.4%       25.4%
Read Page Miss     68.8%     73.2%       74.6%
Write Page Hit     65.5%     68.8%       68.9%
Write Page Miss    34.5%     31.2%       31.1%

Note the low page hit ratio for CPU reads. This is due to the internal cache of the Intel486 DX2 CPU that filters read requests and tends to make the cache miss reads more randomly distributed throughout main memory.

For read burst cycles, the 16-byte linefill of data or code will always lie within a DRAM page, thereby allowing the data or code to be strobed out of memory with a series of back-to-back CAS# pulses. Paging allows for a much faster burst cycle compared to the case where a full RAS#-CAS# cycle is required for each dword. A page mode burst read access to a single bank of memory is shown in Fig. 3.2.

Figure 3.2. A Page Mode Burst Read (Page Hit on Lead-Off Cycle)

3.2 Interleaving

Interleaving involves the use of more than one bank of memory; different banks are controlled separately. As an access is occurring, the other banks are being readied for the next access.

Interleaving can be implemented in several ways. Horizontally interleaved banks generate accesses for consecutive locations in memory. For example, one bank can be designated for the odd dwords and another for the even dwords. Vertically interleaved banks separate large contiguous regions of memory between banks; i.e. multiple DRAM pages are open. Memory controllers often combine both methods.

Horizontal interleaving can be combined with paging to generate very quick burst reads. Two 32-bit banks can generate a zero wait-state burst as detailed in the Intel Applications Note AP447 "A Memory Subsystem for the Intel486™ DX Family of Microprocessors including Second Level Cache." Fig. 3.3 illustrates a burst read cycle from a paged-interleaved memory system. The signals CAS0# and CAS1# drive each of the two 32-bit banks of memory in this example.

Figure 3.3. Burst Read Cycle from a Paged-Interleaved Memory System

Some implementations of a two-bank interleaved memory may be limited to a 1-2-1 burst cycle for the last three dwords. This is mainly limited by the amount of time it takes to invert the A3 address line between the second and third dwords.

3.3 Memory Read Performance Considerations

Seven memory systems with different read performance parameters are examined; write performance is kept constant during these simulations. The memory parameters are as follows:

Table 3.2. Memory Systems used for the No-Cache System Test

            Read Page Hit   Read Page Miss   Write Page Hit   Write Page Miss
System A      4-3-3-3          8-3-3-3             3                6
System B      4-2-2-2          8-2-2-2             3                6
System C      4-2-2-2          7-2-2-2             3                6
System D      3-2-2-2          8-2-2-2             3                6
System E      3-1-2-1          7-1-2-1             3                6
System F      3-1-2-1          6-1-2-1             3                6
System G      3-1-1-1          6-1-1-1             3                6

Systems A through D represent the performance of some typical page mode memory controllers while the performance of systems E through G would require a paged-interleaved memory controller.

The results for the Intel486 DX2 CPU with these memory systems are shown in Figure 3.4 through Figure 3.6 for the SPEC1, Pagemaker and Turbo C traces used previously. The graphs show the total execution time relative to the ideal zero-wait state memory system.

Figure 3.4. Total Intel486 DX2 CPU Execution Time versus Memory Read Performance - for SPEC1 (UNIX)

Figure 3.5. Total Intel486 DX2 CPU Execution Time versus Memory Read Performance - for Pagemaker (Windows)

Figure 3.6. Total Intel486 DX2 CPU Execution Time versus Memory Read Performance - for Turbo C (DOS)

The interesting points to note here are:

• As expected, the DOS application suffers the least from slow memory performance.
• Burst performance is very important. Note the improvement from system A to B, system D to E and even from system F to G (where one clock was removed from the third burst).
• Since the read page hit ratio is lower than 50%, improving the read page miss lead-off cycle is more important than the read page hit lead-off cycle. Note the improvement from system B to C versus the improvement from system B to D.

3.4 Memory Write Performance Considerations

There are several methods of improving the write performance of the memory system. These methods are first described; the benefits of the different methods are discussed later.
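As a baseline before the write-side methods are described, the page hit ratios of Table 3.1 can be combined with the cycle counts of Table 3.2 to estimate an average cost per memory cycle. The sketch below is first-order arithmetic only, assuming the SPEC1 ratios and ignoring every stall and overlap effect the simulations capture; the function names are hypothetical.

```python
def burst_clocks(pattern: str) -> int:
    """Total clocks of a burst written as 'lead-off-x-y-z', e.g. '4-2-2-2'."""
    return sum(int(n) for n in pattern.split("-"))


def expected_clocks(hit_ratio: float, hit_clocks: int, miss_clocks: int) -> float:
    """Average clocks per cycle, weighting hit and miss cost by the page hit ratio."""
    return hit_ratio * hit_clocks + (1.0 - hit_ratio) * miss_clocks


READ_HIT_RATIO = 0.312   # SPEC1 read page hit ratio (Table 3.1)
WRITE_HIT_RATIO = 0.655  # SPEC1 write page hit ratio (Table 3.1)

# (read page hit, read page miss) burst patterns taken from Table 3.2
SYSTEMS = {
    "B": ("4-2-2-2", "8-2-2-2"),
    "C": ("4-2-2-2", "7-2-2-2"),
    "D": ("3-2-2-2", "8-2-2-2"),
}

read_cost = {
    name: expected_clocks(READ_HIT_RATIO, burst_clocks(hit), burst_clocks(miss))
    for name, (hit, miss) in SYSTEMS.items()
}
write_cost = expected_clocks(WRITE_HIT_RATIO, 3, 6)  # 3-clock hit, 6-clock miss

for name, clocks in read_cost.items():
    print(f"System {name}: {clocks:.2f} clocks per read burst")
print(f"Baseline write: {write_cost:.2f} clocks per write")
```

Under these assumptions, removing one clock from the page miss lead-off (system B to C) saves about 0.69 clocks per read burst on average, against about 0.31 clocks for the page hit lead-off (system B to D), which is consistent with the third observation in Section 3.3. The baseline write costs roughly four clocks on average, and that is the figure the methods described next try to reduce.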
The most common method for improving write performance is to employ page mode accesses to DRAM. As shown in Table 3.1, the page hit ratios for write cycles favor the use of page mode accesses. An example of a DRAM write cycle is shown in Fig. 3.7. On page hits, back-to-back three clock write cycles can be maintained. A page miss write would of course take an additional number of clocks to allow for the RAS# precharge time.

Figure 3.7. Page Mode DRAM Allows for Fast Back-to-back Writes (each write: determine page hit/miss and allow for address setup time, then strobe data into DRAM)

Memory write performance can be improved further by two related methods: write buffering and pipelining. If one external write buffer is added to the memory system shown in Fig. 3.7, the write cycles in Fig. 3.8 may be observed.

In the example shown, the one level of write buffering allows the first ready signal to be returned one clock earlier. The first write cycle finishes in two clocks (zero wait states); however, if the CPU puts out many back-to-back writes (as is typical), the memory system will still be limited to a throughput of three clock writes subsequent to the first write cycle. If the CPU write is not followed immediately by any bus traffic, the one write buffer does relieve the CPU quickly, especially if the write was a page miss.

More than one level of write buffering is sometimes employed. This would allow multiple writes to be accepted at zero wait states before wait states of the main memory system affect the CPU bus.
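The effect of buffer depth on a string of back-to-back page hit writes can be illustrated with a small queue model. This is a deliberately simplified sketch rather than the controller shown in the figures: it assumes a two-clock minimum bus cycle, a three-clock memory write, that memory starts on a buffered write only once the write has been accepted, and that a buffer entry is released when memory finishes the write. The function name and parameters are hypothetical.

```python
from typing import List


def write_ready_clocks(n_writes: int, depth: int,
                       mem_clocks: int = 3, bus_clocks: int = 2) -> List[int]:
    """Clock at which each back-to-back CPU write sees its ready signal.

    depth is the number of external write buffers (0 means unbuffered).
    """
    ready: List[int] = []     # ready time of each write on the CPU bus
    mem_done: List[int] = []  # time at which memory finishes each write
    start = 0                 # clock at which the current CPU write begins
    for i in range(n_writes):
        if depth == 0:
            # Unbuffered: the CPU waits for the whole memory write.
            accept = start + mem_clocks
            finish = accept
        else:
            # A buffer entry is free once the write `depth` places earlier
            # has been completed by memory.
            slot_free = mem_done[i - depth] if i >= depth else 0
            accept = max(start + bus_clocks, slot_free)
            mem_start = max(accept, mem_done[-1]) if mem_done else accept
            finish = mem_start + mem_clocks
        ready.append(accept)
        mem_done.append(finish)
        start = accept        # the next write starts as soon as this one is accepted
    return ready


print(write_ready_clocks(4, depth=0))  # [3, 6, 9, 12]
print(write_ready_clocks(4, depth=1))  # [2, 5, 8, 11]
print(write_ready_clocks(6, depth=2))  # [2, 4, 6, 8, 11, 14]
```

With one buffer, only the first write completes in two clocks and every later write is paced at the three-clock memory rate. With two buffers, a few more writes are accepted at zero wait states before the memory rate shows through on the CPU bus.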
To get the maximum benefit from multiple write buffering, CPU reads that occur when more than one write is pending in the external write buffers should be allowed to bypass the writes and be executed as soon as the existing memory write is complete. This is of course similar to the internal write buffers of the Intel486 CPU. If the read cycle's address corresponds to an address already in the write buffers, the read must wait until the corresponding write completes so that the read does not fetch stale data from memory. In other words, care must be taken to ensure data consistency when using external write buffers.

Write pipelining extends the use of buffering by overlapping the memory controller operations during the write cycle. An example is shown in Fig. 3.9 below. The data phase of the last write cycle (CAS# pulse) is overlapped in time with the address phase of the next memory write cycle (page hit/miss decoding, etc.).

Figure 3.8. Adding One Write Buffer to the Memory System

Figure 3.9. Pipelining the Writes to Memory

With pipelining, it is possible to achieve a throughput of many two-clock back-to-back page hit writes.
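The throughput gain from overlapping the two phases can be checked with a short calculation. The sketch below assumes a two-clock address phase (page hit/miss decode and address setup) and a one-clock data phase (the CAS# strobe); these phase lengths are inferred from the three-clock page hit write and are not stated explicitly in the text. The function name is hypothetical.

```python
def total_write_clocks(n_writes: int, address_clocks: int = 2,
                       data_clocks: int = 1, pipelined: bool = False) -> int:
    """Clocks needed by memory to complete n back-to-back page hit writes."""
    if n_writes <= 0:
        return 0
    if not pipelined:
        # Each write runs its address phase and data phase before the next starts.
        return n_writes * (address_clocks + data_clocks)
    # Pipelined: the data phase of one write overlaps the address phase of the
    # next, so a new write can start every max(address, data) clocks.
    period = max(address_clocks, data_clocks)
    return address_clocks + (n_writes - 1) * period + data_clocks


print(total_write_clocks(4))                  # 12
print(total_write_clocks(4, pipelined=True))  # 9
```

In steady state the pipelined case starts a new write every two clocks, which corresponds to the two-clock back-to-back page hit writes mentioned above. Only the final data phase adds a clock at the end of the string.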
(An example of write pipelining can be found in Intel Applications Note AP447.) Note that pipelining may affect a subsequent read cycle; if the CPU read occurs immediately after the write, the beginning of the read will be delayed until the DRAM write cycle has completed. This is especially true for page miss writes where the write may take several clocks to complete. Note also that pipelined memory write systems can be combined with write buffering; this helps for the case where many back-to-back page miss writes occur.

Note that write buffering and pipelining both require a data path device between the CPU and main memory (see Fig. 3.10). This is needed to capture the data from the CPU before RDY# or BRDY# is returned, after which point the data will become invalid.

Figure 3.10. A Data Latch is Required for Write Buffering or Pipelining

To understand the performance benefits of the various methods described, systems B and F from the earlier simulations for read performance are repeated with different write performance parameters (refer to Table 3.3). Systems B and F were chosen as typical representations of paged and paged-interleaved memory controllers respectively.

Table 3.3. Memory Systems with Different Write Performances

             Read Pg Hit   Read Pg Miss   Write Pg Hit   Write Pg Miss   Write Method
System B1     4-2-2-2       8-2-2-2            3              6          Normal
System B2     4-2-2-2       8-2-2-2            3              6          One buffer
System B3     4-2-2-2       8-2-2-2            2              6          Pipelined
System B4     4-2-2-2       8-2-2-2            2              5          Pipelined
System B5     4-2-2-2       8-2-2-2            2              5          Pipelined with two buffers
System B6     4-2-2-2       8-2-2-2            2              5          Pipelined with four buffers
System F1     3-1-2-1       6-1-2-1            3              6          Normal
System F2     3-1-2-1       6-1-2-1            3              6          One buffer
System F3     3-1-2-1       6-1-2-1            2              6          Pipelined
System F4     3-1-2-1       6-1-2-1            2              5          Pipelined
System F5     3-1-2-1       6-1-2-1            2              5          Pipelined with two buffers
System F6     3-1-2-1       6-1-2-1            2              5          Pipelined with four buffers

The memory systems above were simulated again for the Intel486 DX2 CPU with the three applications.
The results are shown in Fig. 3.11 through Fig. 3.13 below:

Figure 3.11. Total Execution Time versus Write Performance - for SPEC1 (UNIX)

Figure 3.12. Total Execution Time versus Write Performance - for Pagemaker (Windows)

Figure 3.13. Total Execution Time versus Write Performance - for Turbo C (DOS)

From the results shown, the following significantly improved CPU performance:

• Reducing the number of clocks for page hit writes (system B2 to B3 and F2 to F3)
• Reducing the number of clocks for page miss writes (system B3 to B4 and F3 to F4)

The following caused marginal improvement in the execution time:

• Adding one level of buffering (from memory systems B1 to B2 and F1 to F2)
• Adding more than two write buffers (from B4 to B5 to B6 and F4 to F5 to F6) resulting in only a 1-2% improvement.

These results may be somewhat surprising considering the earlier graph showing performance degradation as the number of write wait states increases (Fig. 3.7). One would expect that adding write buffers would compensate for the slower memory write system more than they do.
In order to understand the results, consider the statistics in Table 3.4 for processor execution stalls as a result of write activity as described in Section 2.5.3. The statistics are shown for the SPEC1 trace.

Note that the Stalls Because of Reads on Busy Writes dominate the increase in execution time for memory system F1 compared to the zero wait state case. Adding one write buffer (from system F1 to F2) - in an attempt to improve performance - decreases the percentage of stalls on a full write buffer from 5.3% to 4.5%. However, while the number of Stalls Because of Reads on Busy Writes did decrease, the wait states were simply transferred to stalls while waiting for the first ready of a read and no net improvement is observed. One example of this situation is illustrated in Fig. 3.14.

Table 3.4. Stall Statistics for the Write Buffers for the SPEC1 (UNIX) Trace

Percentage of Total Execution Time Stalled:

                                          Zero Wait     Memory       System F1 plus
                                          State Case    System F1    one write buffer
On Write Buffers Full                        0.3%          5.3%           4.5%
Because of Reads on Busy Writes              8.8%          8.0%           4.9%
Because a Read Cannot Overtake a Write       1.0%          0.5%           0.4%
Waiting for First Ready                      2.4%         17.1%          20.9%

Figure 3.14. Adding Write Buffers Does Not Improve Stalls Because of Reads on Busy Writes

In the example shown, without external memory write buffers, the cache miss read shown stalls for four clock cycles while waiting for the write cycle to complete. With memory write buffers, the CPU need not wait to start the read cycle since the write completed in zero wait states. However, since main memory is still occupied with the original write cycle, the read is still delayed externally while the write completes. The net effect is that the read stalls caused by write traffic do not decrease; the write traffic has simply been transferred from the CPU bus to the memory bus where it has to contend with the next read cycle.

Note that there will be instances where the addition of external write buffers to a cacheless memory system does benefit a sequence of bus cycles. This would be the case for applications with very low external read traffic and large amounts of write traffic. In this case, the write buffers do benefit the heavy write traffic while the reads on busy writes will be a lower percentage of total stalls.

3.5 Viability of Intel486 DX2 System without an External Cache

As shown in this section, the CPU performance of a cacheless, main-memory-only Intel486 DX2 CPU-based system will range from good to fair depending on the application. The correct cost-performance point will dictate the viability of such a product. For the Windows and UNIX applications tested, with a good memory design, the Intel486 DX2 CPU will get to about 120% of the execution time of the ideal zero wait state case with the examples shown. The reciprocal of total execution time is CPU performance, which works out to 83% of maximum in this case. Of course, other system design factors will come into play, such as refresh requirements, other bus master memory requirements, etc.
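The conversion from relative execution time to relative performance used above is a simple reciprocal. As a quick check of the 83% figure (the function name is illustrative):

```python
def relative_performance(relative_execution_time: float) -> float:
    """CPU performance as a fraction of the zero wait state ideal.

    relative_execution_time is total execution time divided by the
    zero wait state execution time (1.20 means 120%).
    """
    if relative_execution_time <= 0:
        raise ValueError("relative execution time must be positive")
    return 1.0 / relative_execution_time


print(f"{relative_performance(1.20):.1%}")  # 83.3%
```

The same function can be applied to any of the relative execution times read from the graphs in this note.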
More exotic memory architectures may improve the performance of the cacheless Intel486 DX2 CPU system design over what has been discussed here. However, the next section will address the more straightforward method of increasing CPU performance further: adding an external cache.

4.0 CACHE DESIGN OPTIMIZATION

An external cache will supplement the on-chip 8K cache of the Intel486 DX2 CPU. The requirement for an external cache is more important for the Intel486 DX2 CPU than it was for the Intel486 DX CPU for all the reasons discussed in the previous sections. Many cache architectures have been implemented with the Intel486 DX CPU. Caches differ depending on their size, associativity, serial vs. parallel implementations, write-through vs. write-back policies, etc. This section will focus on optimizing the performance of a uniprocessing system, i.e. the CPU is the major consumer of main memory bandwidth. The architectures discussed are shown in Figure 4.1.

Figure 4.1. Different Cache Architectures Discussed

4.1 Overall Effect of an External Cache on CPU Performance

The addition of an external write-through cache reduces the performance degradation caused by main-memory wait states for the lead-off cycle of a read and for wait states during the remainder of a burst. The impact of these wait states was discussed in Section 2.5.3.

The same memory systems tested in the previous section are tested again with a 128K 2-way associative write-through parallel cache. This will yield the improvement achieved by the decrease in the effective number of read and burst wait states. The results are shown in Figures 4.2 through 4.4 for the three applications tested earlier.

Figure 4.2. Adding an External Cache Decreases Execution Time - for SPEC1 (UNIX)

Figure 4.3. Adding an External Cache Decreases Execution Time - for Pagemaker (Windows)

Figure 4.4. Adding an External Cache Decreases Execution Time - for Turbo C (DOS)

For the UNIX and Windows applications, the addition of the external cache improved the execution time by 15%-35% depending on the memory design. The cache used in this case - a 128K two-way set associative cache - does an excellent job of buffering the CPU performance from the memory system performance. However, note that even with the external cache, the execution time is still 12%-18% above the zero wait state case for these two applications. This is due to the write performance of the memory system since the cache policy in this example is write-through. Further improvement on the write performance is investigated later in this section.

For the DOS application, the addition of the cache brings the performance of the Intel486 DX2 within 5% of the ideal zero wait state case. This is of course due to the lower miss rate of the CPU's internal cache and the application's low bus utilization.

4.2 Effect of Cache Size and Associativity

A 128K two-way-associative, write-through, parallel, external cache was used in the previous section. As the size and associativity of the cache are varied, the CPU performance varies. This is shown in Fig. 4.5 for the memory systems B and F as used earlier (see Table 4.1). Both one-way (direct mapped) and two-way set associative caches are tested with the SPEC1 application trace.

Table 4.1. Memory Systems for the Write-Through Cache Test

            Read Page Hit   Read Page Miss   Write Page Hit   Write Page Miss
System B      4-2-2-2          8-2-2-2             3                6
System F      3-1-2-1          6-1-2-1             3                6

Figure 4.5. Total Execution Time for the Intel486 DX2 CPU as a Function of Cache Size and Associativity - for SPEC1

Fig. 4.6 illustrates the external cache hit rates for CPU read cycles. The hit rates are directly related to the total execution time; higher hit rates result in shorter execution times.

Figure 4.6. L2 Read Hit Rate as a Function of Cache Size and Associativity (UNIX)

4.3 Improving the Performance of a Write-Through Cache

With a write-through cache, good memory write performance is necessary to achieve the best possible performance with the Intel486 DX2 microprocessor. All of the methods for improving the write performance for a cacheless system, discussed in Section 3.4, also apply for the write-through cache-based system.

4.3.1 MEMORY WRITE PIPELINING

The previous results for a write-through cache assumed a non-pipelined memory system with a page-hit write performance of three clocks and a page-miss performance of six. The most effective method of increasing the memory write performance further is the use of memory write pipelining. The write performance can be improved so that continuous back-to-back page-hit write cycles can complete in zero wait-states. Pipelining can also reduce the number of clocks required for a page-miss write cycle.
As the write performance improves using this technique, the performance of the write-through cache system can come close to that of the ideal zero wait state system. These results are shown in Section 4.3.3 to follow.

4.3.2 EXTERNAL WRITE BUFFERS

Adding one or more write buffers to an external write-through cache-based system improves performance by a larger amount compared to the cacheless case. Figure 4.7 illustrates why. The addition of external write buffers allows the memory write cycle to be "hidden" from the CPU bus if the next CPU cycle happens to be an external cache hit read. And since the external cache read hit ratio is high, most of the delays which were present in a cacheless system under these circumstances are removed. In essence, the on-chip cache/write-buffers have been duplicated externally to provide a multiple level architecture (see Fig. 4.8).

Figure 4.7. Adding External Write Buffers to an External Cache Reduces Execution Time (upper diagram: without memory write buffers; lower diagram: with memory write buffers)

Figure 4.8. A Hierarchy of Caches and Write Buffers (Intel486 DX2 CPU at 66 MHz with CPU core, on-chip cache and write buffers, followed by external cache, external write buffers and DRAM)

4.3.3 PERFORMANCE WITH AN EXTERNAL WRITE-THROUGH CACHE

To quantify the benefits of improving the write performance, the systems in Table 4.2 were tested. Fig. 4.9 shows the results of the memory systems tested with a 128K, two-way associative, write-through, parallel cache and the Intel486 DX2 CPU using the SPEC1 application trace.

Table 4.2. Memory Systems Used for Write-Through Cache Test

             Read Pg Hit   Read Pg Miss   Write Pg Hit   Write Pg Miss   Write Method
System B1     4-2-2-2       8-2-2-2            3              6          Normal
System B2     4-2-2-2       8-2-2-2            3              6          One buffer
System B3     4-2-2-2       8-2-2-2            2              6          Pipelined
System B4     4-2-2-2       8-2-2-2            2              5          Pipelined
System F1     3-1-2-1       6-1-2-1            3              6          Normal
System F2     3-1-2-1       6-1-2-1            3              6          One buffer
System F3     3-1-2-1       6-1-2-1            2              6          Pipelined
System F4     3-1-2-1       6-1-2-1            2              5          Pipelined

Figure 4.9. Improving the Write Performance Benefits a Write-Through Cache - for SPEC1

As the write performance increases, the CPU performance approaches that of the zero wait state case. The improvement from systems B1 to B2 and from F1 to F2 illustrates the benefit of write buffering with an external cache. The improvement from systems B2 to B3 and from F2 to F3 reflects the benefit of memory write pipelining. Finally, the improvement from systems B3 to B4 and from F3 to F4 shows how improving the page-miss write performance also increases CPU performance.

4.4 Write-Back Caches

If correctly implemented, a write-back external cache can provide good performance for a uniprocessing Intel486 DX2 CPU based system. Serial write-back caches have typically been used to reduce bus utilization for multiprocessing systems. The design complexity of a write-back cache controller is typically an order of magnitude higher than for a write-through cache controller.
However, correct implementation is absolutely necessary if significant performance gains are to be realized with the Intel486 DX2 CPU.

A write-back cache is different from a write-through cache in that it allows cache write hits to modify the cache line without updating main memory. The cache has tags that include a bit called the modified (dirty) bit. This bit is set if the cache location has been written with new information and therefore contains information that is more recent than the corresponding information in main memory. If a subsequent read miss occurs and the line being fetched needs to fill the cache location that is currently being occupied by the modified line, the cache controller must then write the modified cache line back to main memory; hence coherency is maintained.

If a CPU write is not a cache hit, the cache controller has the option of allowing the write to propagate through to memory or to fetch the cache line from memory to be merged with the new write data. The cache line fill in the second option is called a write-allocation. In the following discussions, it is assumed that no write-allocations are being performed.

4.4.1 MAIN MEMORY CONTROLLER CONSIDERATIONS

The addition of an external write-back cache changes the characteristics of the main memory bus traffic. Since the cache effectively filters all CPU requests, the cycles that do propagate to main memory tend to be more distributed in their locations. This decrease in temporal and spatial locality will reduce the DRAM page hit rate as shown in Table 4.3 for a 128K, two-way associative, write-back cache with the SPEC1 application trace. Compare these results to the prior results in Table 3.1 for a cacheless system.

Table 4.3. Page Hit Ratios for a Write-Back Cache - for SPEC1

MEMORY CYCLES (100%)     SPEC1     PGMK     TURBOC
Reads:   Page Hits       17.1%     25.6%    24.7%
         Page Misses     13.8%     13.9%    12.9%
Writes:  Page Hits       58.9%     55.8%    40.8%
         Page Misses     10.2%      4.7%    21.6%

Therefore, it is less beneficial with a write-back cache to implement a page-mode main memory controller. Of course, page mode DRAM accesses within the burst cycle are still important to retrieve the four dwords of a cache line quickly. This is also true for the write-back cycle where four dwords of the cache line must be written to memory. Memory controllers should be designed to support a burst write cycle instead of having to write each dword separately.

4.4.2 WRITE-BACK CYCLE

The write-back cycle is the sequence where a cache line fill from main memory has to displace a modified line that was already in the cache. The method in which the modified line is written back to main memory has an impact on overall CPU performance.

Before analyzing the write-back cycle, consider first the architectures shown in Fig. 4.10. In the simplest implementation, a write-back cache will share the data bus with the CPU and main memory as shown in configuration X. If this is the case, then during a write-back cycle, the modified line must be written back to main memory before the cache linefill can commence. This has a detrimental effect on performance since the CPU must wait while the write-back occurs. This sequence is shown in Figure 4.11.

With a data path device between the CPU-Cache bus and main memory as shown in configuration Y, the cache controller is able to defer the write-back of the modified data till after the linefill has completed. The CPU can continue execution after the linefill as long as subsequent cycles are all cache hits.

In configuration Z, a wider cache bus exists between the SRAM and the data path devices.
This allows the modified data to be transferred more quickly from the SRAM to the data path device, thereby allowing the cache line fill to commence even sooner.

Figure 4.10. Different Architectures Will Affect CPU Performance with a Write-Back Cache
[Block diagrams of configurations X, Y and Z; labels include the 32-bit CPU Data Bus and the CPU/Memory Data Bus.]

Figure 4.11. Different Implementations of the Write-Back Cycle
[Timing diagrams showing ADS#, BRDY#, CPU bus and memory bus activity for three cases. X: Delayed Line Fill (write back modified line, then cache line fill from DRAM). Y: Concurrent Write Back (transfer modified line to buffers, cache line fill, then write back modified line while other cache hit cycles proceed). Z: Concurrent Write Back with Wide SRAM Bus.]

The results are shown in Figure 4.12 for the Intel486 DX2 CPU running the SPEC1 trace. The following systems are used to demonstrate Intel486 DX2 microprocessor performance with different cache sizes and associativities.

Table 4.4. Memory Systems Used for Write-Back Cache Test

            Reads      Writes                 Write-Back Method (described above)
System A    5-1-1-1    4-1-1-1 (burst)        Concurrent Write Back
System B    6-3-3-3    4-4-4-4 (non-burst)    Concurrent Write Back
System C    6-3-3-3    4-4-4-4 (non-burst)    Delayed Line Fill

Figure 4.12. Intel486™ DX2 CPU Total Execution Time with Different Cache Size, Associativity, Memory Speed and Write-Back Method - for SPEC1
[Line plot with one curve each for Systems A, B and C. Horizontal axis: Cache Size/Associativity (64/1, 64/2, 128/1, 128/2, 256/1, 256/2). Vertical axis: total execution time, 100.00% to 114.00%.]

The addition of a write-back cache does an excellent job of decoupling the CPU performance from the main memory performance, as shown with memory systems A and B. However, note that memory system C (with the delayed line fill) performs poorly - even worse than a good write-through cache - unless a significant amount of cache memory is added to reduce the miss rate.

5.0 CONCLUSION

This document has shown that good memory performance is especially important for the Intel486 DX2 microprocessor. Business workstation designs will require excellent CPU performance and will consequently have to incorporate well-designed, high-performance cache and memory systems.

In optimizing memory performance, an external cache is essential for hiding slow main memory access times. Write-through external caches offer good performance if coupled with good memory write performance. Write-back external caches can also offer excellent performance if designed correctly. Parallel write-back caches that cannot defer the write-back cycle until after a cache line fill will perform worse than a good write-through cache design.

AP-485 APPLICATION NOTE

Intel Processor Identification with the CPUID Instruction

December 1994
Order Number: 241618-003

CONTENTS

1.0 INTRODUCTION
    1.1 Update Support
2.0 DETECTING THE CPUID INSTRUCTION
3.0 OUTPUTS OF THE CPUID INSTRUCTION
    3.1 Vendor-ID String
    3.2 Processor Signature
    3.3 Feature Flags
4.0 USAGE GUIDELINES
5.0 BIOS RECOGNITION FOR INTEL OVERDRIVE™ PROCESSORS
    Example 1
    Example 2
6.0 PROPER IDENTIFICATION SEQUENCE
7.0 USAGE PROGRAM EXAMPLE

Examples
    Example 1. Processor Identification Extraction Procedure
    Example 2. Processor Identification Procedure in Assembly Language
    Example 3. Processor Identification Procedure in the C Language

Figures
    Figure 1. CPUID Instruction Outputs
    Figure 2. Processor Signature Format on Intel386™ Processors
    Figure 3. Flow of Processor get_cpu_type Procedure
    Figure 4. Flow of Processor Identification Extraction Procedures

Tables
    Table 1. Effects of EAX Contents on CPUID Instruction Output
    Table 2. Processor Type
    Table 3. Intel486™ and Pentium™ Processor Signatures
    Table 4. Intel386™ Processor Signatures
    Table 5. Feature Flag Values

1.0 INTRODUCTION

As the Intel Architecture evolves, with the addition of new generations and models of processors (8086, 8088, Intel 286, Intel386™, Intel486™, and Pentium™ processors), it is essential that Intel provide an increasingly sophisticated means with which software can identify the features available on each processor. This identification mechanism has evolved in conjunction with the Intel Architecture as follows:

• Originally, Intel published code sequences that could detect minor implementation differences to identify processor generations.

• Later, with the advent of the Intel386 processor, Intel implemented processor signature identification, which provided the processor family, model, and stepping numbers to software at reset.

• As the Intel Architecture evolved, Intel extended the processor signature identification into the CPUID instruction. The CPUID instruction not only provides the processor signature, but also provides information about the features supported by and implemented on the Intel processor.

The evolution of processor identification was necessary because, as the Intel Architecture proliferates, the computing market must be able to tune processor functionality across processor generations and models that have differing sets of features. Anticipating that this trend will continue with future processor generations, the Intel Architecture implementation of the CPUID instruction is extensible.

This Application Note explains how to use the CPUID instruction in software applications, BIOS implementations, and tools. By taking advantage of the CPUID instruction, software developers can create software applications and tools that can execute compatibly across the widest range of Intel processor generations and models, past, present, and future.

1.1 Update Support

New Intel processor signature and feature bits information can be obtained from the user's manual, programmer's reference manual or appropriate documentation for a processor. In addition, Intel can provide you with updated versions of the programming examples included in this application note; contact your Intel representative for more information.

2.0 DETECTING THE CPUID INSTRUCTION

Intel has provided a straightforward method for detecting whether the CPUID instruction is available. This method uses the ID flag in bit 21 of the EFLAGS register. If software can change the value of this flag, the CPUID instruction is available. The program examples at the end of this Application Note show how to use the PUSHFD and POPFD instructions to change the value of the ID flag.

3.0 OUTPUTS OF THE CPUID INSTRUCTION

Figure 1 summarizes the outputs of the CPUID instruction. The CPUID instruction can be executed multiple times, each time with a different parameter value in the EAX register. The output depends on the value in the EAX register, as specified in Table 1.
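The ID-flag test described in Section 2.0 can be modeled in portable C. The following is only a sketch under stated assumptions, not the procedure from the program examples: access to EFLAGS is abstracted behind a hypothetical callback (`eflags_roundtrip_fn`) that stands in for a PUSHFD/POPFD sequence, and the two `sim_` functions are invented stand-ins for processors with and without the CPUID instruction.

```c
#include <stdint.h>

#define EFLAGS_ID_BIT 0x00200000u   /* ID flag: bit 21 of EFLAGS */

/* Hypothetical hook (an assumption for this sketch): load a value into
 * EFLAGS and return what reads back. On real hardware this would be a
 * PUSHFD/POPFD sequence like the one in the assembly example. */
typedef uint32_t (*eflags_roundtrip_fn)(uint32_t value);

/* Returns 1 if software can change the ID flag, which means the CPUID
 * instruction is available; returns 0 otherwise. */
int cpuid_is_available(uint32_t original_eflags, eflags_roundtrip_fn roundtrip)
{
    /* Try to flip only the ID bit and see whether the change sticks. */
    uint32_t readback = roundtrip(original_eflags ^ EFLAGS_ID_BIT);
    int toggled = ((readback ^ original_eflags) & EFLAGS_ID_BIT) != 0;

    (void)roundtrip(original_eflags);   /* put the original flags back */
    return toggled;
}

/* Simulated processor without CPUID: the ID bit always reads as 0. */
uint32_t sim_eflags_no_cpuid(uint32_t value)
{
    return value & ~EFLAGS_ID_BIT;
}

/* Simulated processor with CPUID: the ID bit keeps what was written. */
uint32_t sim_eflags_with_cpuid(uint32_t value)
{
    return value;
}
```

Calling `cpuid_is_available(flags, sim_eflags_with_cpuid)` yields 1 and the same call with `sim_eflags_no_cpuid` yields 0. Only after this test succeeds should software execute CPUID, starting with a parameter value of 0 in EAX.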
To determine the highest acceptable value in the EAX register, the program should set the EAX register parameter value to 0. In this case, the CPUID instruction returns the highest value that can be recognized in the EAX register. CPUID instruction execution should always use a parameter value that is less than or equal to this highest returned value. Currently, the highest value recognized by the CPUID instruction is 1. Future processors might recognize higher values.

3.1 Vendor-ID String

If the EAX register contains a value of 0, the vendor identification string is returned in the EBX, EDX, and ECX registers. These registers contain the ASCII string GenuineIntel.

While any imitator of the Intel Architecture can provide the CPUID instruction, no imitator can legitimately claim that its part is a genuine Intel part. Therefore, the presence of the GenuineIntel string is an assurance that the CPUID instruction and the processor signature are implemented as described in this document.

Figure 1. CPUID Instruction Outputs
[Register diagram. Output if EAX = 0: EAX holds the highest input value (integer); EBX, EDX and ECX hold the vendor ID as an ASCII string (with hexadecimal encoding). Output if EAX = 1: EAX holds the processor signature (processor type, family, model, stepping), which is also placed in EDX at reset; EDX holds the feature flags as a bit array (refer to Table 5). EBX and ECX are Intel reserved. Do not use.]

Table 1. Effects of EAX Contents on CPUID Instruction Output

Parameter                    Outputs of CPUID
EAX = 0                      EAX - Highest value recognized
                             EBX:EDX:ECX - Vendor identification string
EAX = 1                      EAX - Processor signature
                             EDX - Feature flags
                             EBX:ECX - Intel reserved (Do not use.)
1 < EAX <= highest value     Currently undefined
EAX > highest value          EAX:EBX:ECX:EDX - Undefined (Do not use.)

Table 2. Processor Type

Bit Position    Value    Description
13,12           00       Original OEM Processor
                01       OverDrive™ Processor
                10       Dual Processor(1)
                11       Intel reserved (Do not use.)

NOTE:
1. Not applicable to Intel386 and Intel486 processors.

3.2 Processor Signature

Beginning with the Intel386 processor family, the processor signature has been available at reset. With processors that implement the CPUID instruction, the processor signature is available both upon reset and upon execution of the CPUID instruction. Figure 1 shows the format of the signature for the Intel486 and Pentium processor families. Table 3 shows the values that are currently defined. (The high-order 18 bits are undefined and reserved.)

The processor type, specified in bits 12 and 13, indicates whether the processor is an original OEM processor, an OverDrive™ processor, or a dual processor (capable of being used in a dual processor system). Table 2 shows the processor type values that can be returned in bits 12 and 13 of the EAX register.

Older versions of Intel486 SX, Intel486 DX and IntelDX2 processors do not support the CPUID instruction. Therefore, the processor signature is only available upon reset for these processors. Refer to the programming examples at the end of this Application Note to determine which processors support the CPUID instruction.

On Intel386 processors, the format of the processor signature is somewhat different, as Figure 2 shows. Table 4 gives the current values.

Table 3. Intel486™ and Pentium™ Processor Signatures

Family  Model           Stepping(1)  Description
0100    0000 and 0001   xxxx         Intel486 DX Processors
0100    0010            xxxx         Intel486 SX Processors
0100    0011            xxxx         Intel487™ Processors(2)
0100    0011            xxxx         IntelDX2™ and IntelDX2 OverDrive™ Processors
0100    0100            xxxx         Intel486 SL Processor(2)
0100    0101            xxxx         IntelSX2™ Processors
0100    0111            xxxx         Write-Back Enhanced IntelDX2 Processors
0100    1000            xxxx         IntelDX4™ and IntelDX4 OverDrive Processors
0101    0001            xxxx         Pentium™ Processors (510\60, 567\66)
0101    0010            xxxx         Pentium Processors (735\90, 815\100)
0101    0011            xxxx         Pentium OverDrive Processors
0101    0101            xxxx         Reserved for Pentium OverDrive Processor for IntelDX4 Processor
0101    0010            xxxx         Reserved for Pentium OverDrive Processor for Pentium Processor (510\60, 567\66)
0101    0100            xxxx         Reserved for Pentium OverDrive Processor for Pentium Processor (735\90, 815\100)

NOTES:
1. Intel releases information about stepping numbers as needed.
2. This processor does not implement the CPUID instruction.

Figure 2. Processor Signature Format on Intel386™ Processors
[Register diagram: at reset, EDX holds the MODEL, FAMILY, MAJOR STEPPING and MINOR STEPPING fields.]

Table 4. Intel386™ Processor Signatures

Model   Family  Major Stepping  Minor Stepping(1)  Description
0000    0011    0000            xxxx               Intel386™ DX Processor
0010    0011    0000            xxxx               Intel386 SX Processor
0010    0011    0000            xxxx               Intel386 CX Processor
0010    0011    0000            xxxx               Intel386 EX Processor
0100    0011    0000 and 0001   xxxx               Intel386 SL Processor
0000    0011    0100            xxxx               RapidCAD™ Processor

NOTE:
1. Intel releases information about minor stepping numbers as needed.

3.3 Feature Flags

When a value of 1 is placed in the EAX register, the CPUID instruction loads the EDX register with the feature flags. The feature flags indicate which features the processor supports.
A value of 1 in a feature flag can indicate that a feature is either supported or not supported, depending on the implementation of the CPUID instruction for a specific processor. Table 5 lists the currently defined feature flag values. For future processors, refer to the programmer's reference manual, user's manual, or the appropriate documentation for the latest feature flag values.

Developers should use the feature flags in applications to determine which processor features are supported. By using the CPUID feature flags to predetermine processor features, software can detect and avoid incompatibilities that could result if the features are not present.

Table 5. Feature Flag Values

Bit        Name   Description When Flag = 1 / Comments
0          FPU    Floating-Point Unit On-Chip
                  The processor contains an FPU that supports the Intel387 floating-point instruction set.
1          VME    Virtual Mode Extension
                  The processor supports extensions to virtual-8086 mode.
2(1)              (See note)
3          PSE    Page Size Extension
                  The processor supports 4-Mbyte pages.
4-6(1)            (See note)
7          MCE    Machine Check Exception
                  Exception 18 is defined for Pentium processor style machine checks, including CR4.MCE for controlling the feature. This feature does not define the model-specific implementation of the machine-check error logging, reporting and processor shutdowns. Machine-check exception handlers may have to depend on processor version to do model-specific processing of the exception or test for the presence of the standard machine-check feature.
8          CX8    CMPXCHG8B
                  The 8-byte (64-bit) compare and exchange instruction is supported (implicitly locked and atomic).
9          APIC   On-Chip APIC
                  Indicates that an integrated APIC is present and hardware enabled. (Software disabling does not affect this bit.)
10-31(1)          (See note)

NOTE:
1. Some non-essential information regarding Intel486 and Pentium processors is considered Intel confidential and proprietary and is not documented in this publication. This information is provided in the Supplement to the Pentium™ Processor User's Manual and is available with the appropriate non-disclosure agreements in place. Contact Intel Corporation for details.

4.0 USAGE GUIDELINES

This document presents Intel-recommended feature-detection methods. Software should not try to identify features by exploiting programming tricks, undocumented features, or otherwise deviating from the guidelines presented in this Application Note.

The following is a list of guidelines that can help programmers maintain the widest range of compatibility for their software.

• Do not depend on the absence of an invalid opcode trap on the CPUID opcode to detect CPUID. Do not depend on the absence of an invalid opcode trap on the PUSHFD opcode to detect a 32-bit processor. Test the ID flag, as described in Section 2.0 and shown in Section 6.0.

• Do not assume that a given family or model has any specific feature. For example, do not assume that, because the family value is 5 (Pentium processor), there must be a floating-point unit on-chip. Use the feature flags for this determination.

• Do not assume that the features in the OverDrive processors are the same as those in the OEM version of the processor. Internal caches and instruction execution might vary.

• Do not use undocumented features of a processor to identify steppings or features. For example, the Intel386 processor A-step had bit instructions that were withdrawn with B-step. Some software attempted to execute these instructions and depended on the invalid-opcode exception as a signal that it was not running on the A-step part. This software failed to work correctly when the Intel486 processor used the same opcodes for different instructions. That software should have used the stepping information in the processor signature.
• Do not assume that a value of 1 in a feature flag indicates that a given feature is present, even though that is the case in the first models of the Pentium processor in which the CPUID instruction is implemented. For some feature flags that might be defined in the future, a value of 1 can indicate that the corresponding feature is not present.

• Programmers should test feature flags individually and not make assumptions about undefined bits. It would be a mistake, for example, to test the FPU bit by comparing the feature register to a binary 1 with a compare instruction.

• Do not assume that the clock of a given family or model runs at a specific frequency and do not write clock-dependent code, such as timing loops. For instance, an OverDrive processor could operate at a higher internal frequency and still report the same family and/or model. Instead, use the system's timers to measure elapsed time.

• Processor model-specific registers may differ among processors, including in various models of the Pentium processor. Do not use these registers unless identified for the installed processor.

5.0 BIOS RECOGNITION FOR INTEL OVERDRIVE™ PROCESSORS

A system's BIOS will typically identify the processor in the system and initialize the hardware accordingly. In many cases, the BIOS identifies the processor by reading the processor signature, comparing it to known signatures, and, upon finding a match, executing the corresponding hardware initialization code.

The Pentium OverDrive processor is designed to be an upgrade to any Intel486 family processor. Because there are significant operational differences between these two processor families, processor misidentification can cause system failures or diminished performance. Major differences between the Intel486 processor and the Pentium OverDrive processor include the type of on-chip cache supported (write-back or write-through), cache organization and cache size. The OverDrive processor also has an enhanced floating point unit and System Management Mode (SMM) that may not exist in the OEM processor. Inability to recognize these features causes problems like those described below.

In many BIOS implementations, the BIOS reads the processor signature at reset and compares it to known values. If the OverDrive processor's signature is not among the known values, a match will not occur and the OverDrive processor will not be identified. Often the BIOS will drop out of the search and initialize the hardware based on a default case, such as initializing the chipset for an Intel486 SX processor. Following are two common examples of system failures and how to avoid them.

Example 1

If the system's hardware is configured to enable the write-back cache of the Pentium OverDrive processor but the BIOS fails to detect the Pentium OverDrive processor signature, the BIOS may incorrectly cause the chipset to support a write-through processor cache. This results in a data incoherency problem with the bus masters. When a bus master accesses a memory location that is also in the processor's cache in a modified state, the processor will alert the chipset to allow it to update this data in memory. But the chipset is not programmed for such an event, and the bus master instead receives stale data. This usually results in a system failure.

Example 2

If the BIOS does not recognize the OverDrive processor's signature and defaults to an Intel486 SX processor, the BIOS can incorrectly program the chipset to ignore, or improperly route, the assertion of the floating point error signaled by the processor. The result is that floating point errors will be improperly handled by the Pentium OverDrive processor. The BIOS may also completely disable math exception handling in the OverDrive processor. This can cause installation errors in applications that require hardware support for floating point instructions.
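Both failures above start with a signature lookup that silently falls back to a default. The following C sketch shows one way to decode a signature and make the "unknown" case explicit. It is illustrative only: the structure and function names (`cpu_signature`, `decode_signature`, `lookup_cpu_name`, `has_feature`) are hypothetical, the field positions assume the Intel486/Pentium signature layout (type in bits 13-12, family in bits 11-8, model in bits 7-4, stepping in bits 3-0), and the table deliberately lists only a few Intel486 family entries from Table 3.

```c
#include <stddef.h>
#include <stdint.h>

/* Fields of the Intel486/Pentium processor signature (hypothetical type). */
struct cpu_signature {
    unsigned type;      /* bits 13-12: 00 OEM, 01 OverDrive, 10 dual */
    unsigned family;    /* bits 11-8 */
    unsigned model;     /* bits 7-4  */
    unsigned stepping;  /* bits 3-0  */
};

struct cpu_signature decode_signature(uint32_t sig)
{
    struct cpu_signature s;
    s.stepping = sig & 0xFu;
    s.model    = (sig >> 4) & 0xFu;
    s.family   = (sig >> 8) & 0xFu;
    s.type     = (sig >> 12) & 0x3u;
    return s;
}

struct known_cpu {
    unsigned family;
    unsigned model;
    const char *name;
};

/* Partial table for illustration only. Model 3 is shared by Intel487 and
 * IntelDX2 parts in Table 3; the IntelDX2 name is used here. */
static const struct known_cpu known_cpus[] = {
    { 4, 0, "Intel486 DX" },
    { 4, 1, "Intel486 DX" },
    { 4, 2, "Intel486 SX" },
    { 4, 3, "IntelDX2" },
    { 4, 5, "IntelSX2" },
    { 4, 7, "Write-Back Enhanced IntelDX2" },
    { 4, 8, "IntelDX4" },
};

/* Returns the matching name, or NULL when the signature is not in the
 * table. The caller has to handle NULL explicitly instead of treating
 * the part as an Intel486 SX. */
const char *lookup_cpu_name(const struct cpu_signature *s)
{
    size_t i;
    for (i = 0; i < sizeof known_cpus / sizeof known_cpus[0]; i++) {
        if (known_cpus[i].family == s->family &&
            known_cpus[i].model == s->model)
            return known_cpus[i].name;
    }
    return NULL;
}

#define FEATURE_FPU 0x0001u   /* feature flag bit 0: on-chip FPU */

/* Tests one feature flag by masking, never by comparing the whole word. */
int has_feature(uint32_t feature_flags, uint32_t mask)
{
    return (feature_flags & mask) != 0;
}
```

With this partial table, a signature of 1531h (type 01, family 5, model 3) decodes correctly but returns NULL from `lookup_cpu_name`, which is exactly the situation in which a BIOS should avoid assuming Intel486 SX behavior and should instead consult the type field and the feature flags.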
Hence, when programming or modifying a BIOS, be aware of the impact of future OverDrive processors. Intel recommends that you include processor signatures for the OverDrive processors in BIOS identification routines to eliminate diminished performance or system failures. The recommendations in this application note can help a BIOS maintain compatibility across a wide range of processor generations and models.

6.0 PROPER IDENTIFICATION SEQUENCE

The cpuid3a.asm program example demonstrates the correct use of the CPUID instruction. (See Example 1.) It also shows how to identify earlier processor generations that do not implement the processor signature or CPUID instruction. This program example contains the following two procedures:

• get_cpu_type identifies the processor type. Figure 3 illustrates the flow of this procedure.

• get_fpu_type determines the type of floating-point unit (FPU) or math coprocessor (MCP).

This procedure has been tested with 8086, 80286, Intel386, Intel486, and Pentium processors. This program example is written in assembly language and is suitable for inclusion in a run-time library, or as system calls in operating systems.

Figure 3. Flow of Processor get_cpu_type Procedure
[Flowchart: a sequence of yes/no processor checks. When the CPUID instruction is present, cpuid_flag = 1. CPUID is executed with an input of 0 to get the vendor ID string and the input values for EAX. If the highest input value is at least 1, CPUID is executed with an input of 1 in EAX to obtain model, stepping, family, and features, which are saved in cpu_type, stepping, model, and feature_flags.]

7.0 USAGE PROGRAM EXAMPLE

The cpuid3b.asm and cpuid3b.c program examples demonstrate applications that call the get_cpu_type and get_fpu_type procedures and interpret the returned information. The results, which are displayed on the monitor, identify the installed processor and features. The cpuid3b.asm example is written in assembly language and demonstrates an application that displays the returned information in the DOS environment. The cpuid3b.c example is written in the C language. (See Examples 2 and 3.) Figure 4 presents an overview of the relationship between the three program examples.

Figure 4. Flow of Processor Identification Extraction Procedures
[Flowchart: Main (part of cpuid3b.c and cpuid3b.asm) calls get_cpu_type* (part of cpuid3a.asm), followed by Print. *See Figure 3.]

Example 1. Processor Identification Extraction Procedure

;       Filename: cpuid3a.asm
;       Copyright 1993, 1994 by Intel Corp.
;
;       This program has been developed by Intel Corporation.  You
;       have Intel's permission to incorporate this source code into
;       your product, royalty free.  Intel has intellectual property
;       rights which it may assert if another manufacturer's processor
;       mis-identifies itself as being "GenuineIntel" when the CPUID
;       instruction is executed.
;
;       Intel specifically disclaims all warranties, express or
;       implied, and all liability, including consequential and other
;       indirect damages, for the use of this code, including
;       liability for infringement of any proprietary rights, and
;       including the warranties of merchantability and fitness for a
;       particular purpose.  Intel does not assume any responsibility
;       for any errors which may appear in this code nor any
;       responsibility to update it.
;
;       This code contains two procedures:
;       _get_cpu_type:  Identifies processor type in _cpu_type:
;                       0=8086/8088 processor
;                       2=Intel 286 processor
;                       3=Intel386(TM) family processor
;                       4=Intel486(TM) family processor
;                       5=Pentium(TM) family processor
;
;       _get_fpu_type:  Identifies FPU type in _fpu_type:
;                       0=FPU not present
;                       1=FPU present
;                       2=287 present (only if _cpu_type=3)
;                       3=387 present (only if _cpu_type=3)
;
;       This program has been tested with the MASM assembler.
;       This code correctly detects the current Intel 8086/8088,
;       80286, 80386, 80486, and Pentium(tm) processors in the
;       real-address mode.
;
;       To assemble this code with TASM, add the JUMPS directive.
;       jumps                           ; Uncomment this line for TASM

        TITLE   cpuid3a.asm
        DOSSEG
        .model  small

CPU_ID  MACRO
        db      0fh                     ; Hardcoded CPUID instruction
        db      0a2h
ENDM

        .data
        public  _cpu_type
        public  _fpu_type
        public  _cpuid_flag
        public  _intel_CPU
        public  _vendor_id
        public  _cpu_signature
        public  _features_ecx
        public  _features_edx
        public  _features_ebx
_cpu_type       db      0
_fpu_type       db      0
_cpuid_flag     db      0
_intel_CPU      db      0
_vendor_id      db      "____________"
intel_id        db      "GenuineIntel"
_cpu_signature  dd      0
_features_ecx   dd      0
_features_edx   dd      0
_features_ebx   dd      0
fp_status       dw      0

        .code
        .8086

;*********************************************************************

        public  _get_cpu_type
_get_cpu_type   proc

;       This procedure determines the type of processor in a system
;       and sets the _cpu_type variable with the appropriate
;       value.  If the CPUID instruction is available, it is used
;       to determine more specific details about the processor.
;       All registers are used by this procedure, none are preserved.
;       To avoid AC faults, the AM bit in CR0 must not be set.

;       Intel 8086 processor check
;       Bits 12-15 of the FLAGS register are always set on the
;       8086 processor.

check_8086:
        pushf                           ; push original FLAGS
        pop     ax                      ; get original FLAGS
        mov     cx, ax                  ; save original FLAGS
        and     ax, 0fffh               ; clear bits 12-15 in FLAGS
        push    ax                      ; save new FLAGS value on stack
        popf                            ; replace current FLAGS value
        pushf                           ; get new FLAGS
        pop     ax                      ; store new FLAGS in AX
        and     ax, 0f000h              ; if bits 12-15 are set, then
        cmp     ax, 0f000h              ;   processor is an 8086/8088
        mov     _cpu_type, 0            ; turn on 8086/8088 flag
        je      end_cpu_type            ; jump if processor is 8086/8088

;       Intel 286 processor check
;       Bits 12-15 of the FLAGS register are always clear on the
;       Intel 286 processor in real-address mode.

        .286
check_80286:
        or      cx, 0f000h              ; try to set bits 12-15
        push    cx                      ; save new FLAGS value on stack
        popf                            ; replace current FLAGS value
        pushf                           ; get new FLAGS
        pop     ax                      ; store new FLAGS in AX
        and     ax, 0f000h              ; if bits 12-15 are clear
        mov     _cpu_type, 2            ; processor=80286, turn on 80286 flag
        jz      end_cpu_type            ; if no bits set, processor is 80286

;       Intel386 processor check
;       The AC bit, bit #18, is a new bit introduced in the EFLAGS
;       register on the Intel486 processor to generate alignment
;       faults.
;       This bit cannot be set on the Intel386 processor.

        .386                            ; it is safe to use 386 instructions
check_80386:
        pushfd                          ; push original EFLAGS
        pop     eax                     ; get original EFLAGS
        mov     ecx, eax                ; save original EFLAGS
        xor     eax, 40000h             ; flip AC bit in EFLAGS
        push    eax                     ; save new EFLAGS value on stack
        popfd                           ; replace current EFLAGS value
        pushfd                          ; get new EFLAGS
        pop     eax                     ; store new EFLAGS in EAX
        xor     eax, ecx                ; can't toggle AC bit, processor=80386
        mov     _cpu_type, 3            ; turn on 80386 processor flag
        jz      end_cpu_type            ; jump if 80386 processor

        push    ecx
        popfd                           ; restore AC bit in EFLAGS first

;       Intel486 processor check
;       Checking for ability to set/clear ID flag (Bit 21) in EFLAGS
;       which indicates the presence of a processor with the CPUID
;       instruction.

        .486
check_80486:
        mov     _cpu_type, 4            ; turn on 80486 processor flag
        mov     eax, ecx                ; get original EFLAGS
        xor     eax, 200000h            ; flip ID bit in EFLAGS
        push    eax                     ; save new EFLAGS value on stack
        popfd                           ; replace current EFLAGS value
        pushfd                          ; get new EFLAGS
        pop     eax                     ; store new EFLAGS in EAX
        xor     eax, ecx                ; can't toggle ID bit,
        je      end_cpu_type            ;   processor=80486

;       Execute CPUID instruction to determine vendor, family,
;       model, stepping and features.  For the purpose of this
;       code, only the initial set of CPUID information is saved.

        mov     _cpuid_flag, 1          ; flag indicating use of CPUID inst.
        push    ebx                     ; save registers
        push    esi
        push    edi
        mov     eax, 0                  ; set up for CPUID instruction
        CPU_ID                          ; get and save vendor ID

        mov     dword ptr _vendor_id, ebx
        mov     dword ptr _vendor_id[+4], edx
        mov     dword ptr _vendor_id[+8], ecx

        mov     si, ds
        mov     es, si

        mov     si, offset _vendor_id
        mov     di, offset intel_id
        mov     cx, 12                  ; should be length intel_id
        cld                             ; clear direction flag (compare forward)
        repe    cmpsb                   ; compare vendor ID to "GenuineIntel"
        jne     end_cpuid_type          ; if not equal, not an Intel processor

        mov     _intel_CPU, 1           ; indicate an Intel processor
        cmp     eax, 1                  ; make sure 1 is valid input for CPUID
        jl      end_cpuid_type          ; if not, jump to end
        mov     eax, 1
        CPU_ID                          ; get family/model/stepping/features
        mov     _cpu_signature, eax
        mov     _features_ebx, ebx
        mov     _features_edx, edx
        mov     _features_ecx, ecx

        shr     eax, 8                  ; isolate family
        and     eax, 0fh
        mov     _cpu_type, al           ; set _cpu_type with family

end_cpuid_type:
        pop     edi                     ; restore registers
        pop     esi
        pop     ebx

        .8086
end_cpu_type:
        ret
_get_cpu_type   endp

;*********************************************************************

        public  _get_fpu_type
_get_fpu_type   proc

;       This procedure determines the type of FPU in a system
;       and sets the _fpu_type variable with the appropriate value.
;       All registers are used by this procedure, none are preserved.

;       Coprocessor check
;       The algorithm is to determine whether the floating-point
;       status and control words are present.  If not, no
;       coprocessor exists.  If the status and control words can
;       be saved, the correct coprocessor is then determined
;       depending on the processor type.  The Intel386 processor can
;       work with either an Intel287 NDP or an Intel387 NDP.
;       The infinity of the coprocessor must be checked to determine
;       the correct coprocessor type.

        fninit                          ; reset FP status word
        mov     fp_status, 5a5ah        ; initialize temp word to non-zero
        fnstsw  fp_status               ; save FP status word
        mov     ax, fp_status           ; check FP status word
        cmp     al, 0                   ; was correct status written
        mov     _fpu_type, 0            ; no FPU present
        jne     end_fpu_type

check_control_word:
        fnstcw  fp_status               ; save FP control word
        mov     ax, fp_status           ; check FP control word
        and     ax, 103fh               ; selected parts to examine
        cmp     ax, 3fh                 ; was control word correct
        mov     _fpu_type, 0
        jne     end_fpu_type            ; incorrect control word, no FPU
        mov     _fpu_type, 1

;       80287/80387 check for the Intel386 processor

check_infinity:
        cmp     _cpu_type, 3
        jne     end_fpu_type
        fld1                            ; must use default control from FNINIT
        fldz                            ; form infinity
        fdiv                            ; 8087/Intel287 NDP say +inf = -inf
        fld     st                      ; form negative infinity
        fchs                            ; Intel387 NDP says +inf <> -inf
        fcompp                          ; see if they are the same
        fstsw   fp_status               ; look at status from FCOMPP
        mov     ax, fp_status
        mov     _fpu_type, 2            ; store Intel287 NDP for FPU type
        sahf                            ; see if infinities matched
        jz      end_fpu_type            ; jump if 8087 or Intel287 is present
        mov     _fpu_type, 3            ; store Intel387 NDP for FPU type
end_fpu_type:
        ret
_get_fpu_type   endp

        end

Example 2. Processor Identification Procedure in Assembly Language

;       Filename: cpuid3b.asm
;       Copyright 1993, 1994 by Intel Corp.
;
;       This program has been developed by Intel Corporation.  You
;       have Intel's permission to incorporate this source code into
;       your product, royalty free.  Intel has intellectual property
;       rights which it may assert if another manufacturer's processor
;       mis-identifies itself as being "GenuineIntel" when the CPUID
;       instruction is executed.
;
;
;       Intel specifically disclaims all warranties, express or implied,
;       and all liability, including consequential and other indirect
;       damages, for the use of this code, including liability for
;       infringement of any proprietary rights, and including the
;       warranties of merchantability and fitness for a particular
;       purpose. Intel does not assume any responsibility for any
;       errors which may appear in this code nor any responsibility to
;       update it.
;
;       This program contains three parts:
;       Part 1: Identifies processor type in the variable _cpu_type:
;
;       Part 2: Identifies FPU type in the variable _fpu_type:
;
;       Part 3: Prints out the appropriate message. This part is
;               specific to the DOS environment and uses the DOS
;               system calls to print out the messages.
;
;       This program has been tested with the MASM assembler. If this
;       code is assembled with no options specified and linked with the
;       cpuid3a.asm module, it correctly identifies the current Intel
;       8086/8088, 80286, 80386, 80486, and Pentium(tm) processors in
;       the real-address mode.
;
;       To assemble this code with TASM, add the JUMPS directive.
;       jumps                           ; Uncomment this line for TASM

        TITLE   cpuid3b.asm

DOSSEG
        .model  small
        .stack  100h

        .data
        extrn   _cpu_type: byte
        extrn   _fpu_type: byte
        extrn   _cpuid_flag: byte
        extrn   _intel_CPU: byte
        extrn   _vendor_id: byte
        extrn   _cpu_signature: dword
        extrn   _features_ecx: dword
        extrn   _features_edx: dword
        extrn   _features_ebx: dword

;       The purpose of this code is to identify the processor and
;       coprocessor that is currently in the system. The program first
;       determines the processor type. Then it determines whether a
;       coprocessor exists in the system. If a coprocessor or integrated
;       coprocessor exists, the program identifies the coprocessor type.
;       The program then prints the processor and floating point
;       processors present and type.

        .code
        .8086
start:  mov     ax, @data
        mov     ds, ax                  ; set segment register
        mov     es, ax                  ; set segment register
        and     sp, not 3               ; align stack to avoid AC fault
        call    _get_cpu_type           ; determine processor type
        call    _get_fpu_type
        call    print
        mov     ax, 4c00h               ; terminate program
        int     21h

;*********************************************************************

        extrn   _get_cpu_type: proc

;*********************************************************************

        extrn   _get_fpu_type: proc

;*********************************************************************

FPU_FLAG        equ     0001h
VME_FLAG        equ     0002h
PSE_FLAG        equ     0008h
MCE_FLAG        equ     0080h
CMPXCHG8B_FLAG  equ     0100h
APIC_FLAG       equ     0200h

        .data
id_msg          db      "This system has a$"
cp_error        db      "n unknown processor$"
cp_8086         db      "n 8086/8088 processor$"
cp_286          db      "n 80286 processor$"
cp_386          db      "n 80386 processor$"

cp_486          db      "n 80486DX, 80486DX2 processor or"
                db      " 80487SX math coprocessor$"
cp_486sx        db      "n 80486SX processor$"

fp_8087         db      " and an 8087 math coprocessor$"
fp_287          db      " and an 80287 math coprocessor$"
fp_387          db      " and an 80387 math coprocessor$"

intel486_msg    db      " Genuine Intel486(TM) processor$"
intel486dx_msg  db      " Genuine Intel486(TM) DX processor$"
intel486sx_msg  db      " Genuine Intel486(TM) SX processor$"
inteldx2_msg    db      " Genuine IntelDX2(TM) processor$"
intelsx2_msg    db      " Genuine IntelSX2(TM) processor$"
inteldx4_msg    db      " Genuine IntelDX4(TM) processor$"
inteldx2wb_msg  db      " Genuine Write-Back Enhanced"
                db      " IntelDX2(TM) processor$"
pentium_msg     db      " Genuine Intel Pentium(TM) processor$"
unknown_msg     db      "n unknown Genuine Intel processor$"

; The following 16 entries must stay intact as an array
intel_486_0     dw      offset intel486dx_msg
intel_486_1     dw      offset intel486dx_msg
intel_486_2     dw      offset intel486sx_msg
intel_486_3     dw      offset inteldx2_msg
intel_486_4     dw      offset intel486_msg
intel_486_5     dw      offset intelsx2_msg
intel_486_6     dw      offset intel486_msg
intel_486_7     dw      offset inteldx2wb_msg
intel_486_8     dw      offset inteldx4_msg
intel_486_9     dw      offset intel486_msg
intel_486_a     dw      offset intel486_msg
intel_486_b     dw      offset intel486_msg
intel_486_c     dw      offset intel486_msg
intel_486_d     dw      offset intel486_msg
intel_486_e     dw      offset intel486_msg
intel_486_f     dw      offset intel486_msg
; end of array

family_msg      db      13,10,"Processor Family:  $"
model_msg       db      13,10,"Model:             $"
stepping_msg    db      13,10,"Stepping:          $"
cr_lf           db      13,10,"$"
turbo_msg       db      13,10,"The processor is an OverDrive(TM)"
                db      " processor$"
dp_msg          db      13,10,"The processor is the upgrade processor"
                db      " in a dual processor system$"
fpu_msg         db      13,10,"The processor contains an on-chip FPU$"
mce_msg         db      13,10,"The processor supports Machine Check"
                db      " Exceptions$"
cmp_msg         db      13,10,"The processor supports the CMPXCHG8B"
                db      " instruction$"
vme_msg         db      13,10,"The processor supports Virtual Mode"
                db      " Extensions$"
pse_msg         db      13,10,"The processor supports Page Size"
                db      " Extensions$"
apic_msg        db      13,10,"The processor contains an on-chip"
                db      " APIC$"

not_intel       db      "t least an 80486 processor."
                db      13,10,"It does not contain a Genuine Intel"
                db      " part and as a result, the",13,10,"CPUID"
                db      " detection information cannot be determined"
                db      " at this time.$"

ASC_MSG MACRO   msg
        LOCAL   ascii_done              ; local label
        add     al, 30h
        cmp     al, 39h                 ; is it 0-9?
        jle     ascii_done
        add     al, 07h
ascii_done:
        mov     byte ptr msg[20], al
        mov     dx, offset msg
        mov     ah, 9h
        int     21h
ENDM

        .code
        .8086
print   proc

;       This procedure prints the appropriate cpuid string and
;       numeric processor presence status. If the CPUID instruction
;       was used, this procedure prints out the CPUID info.
;       All registers are used by this procedure, none are preserved.
        mov     dx, offset id_msg       ; print initial message
        mov     ah, 9h
        int     21h

        cmp     _cpuid_flag, 1          ; if set to 1, processor
                                        ;   supports CPUID instruction
        je      print_cpuid_data        ; print detailed CPUID info

print_86:
        cmp     _cpu_type, 0
        jne     print_286
        mov     dx, offset cp_8086
        mov     ah, 9h
        int     21h
        cmp     _fpu_type, 0
        je      end_print
        mov     dx, offset fp_8087
        mov     ah, 9h
        int     21h
        jmp     end_print

print_286:
        cmp     _cpu_type, 2
        jne     print_386
        mov     dx, offset cp_286
        mov     ah, 9h
        int     21h
        cmp     _fpu_type, 0
        je      end_print

print_287:
        mov     dx, offset fp_287
        mov     ah, 9h
        int     21h
        jmp     end_print

print_386:
        cmp     _cpu_type, 3
        jne     print_486
        mov     dx, offset cp_386
        mov     ah, 9h
        int     21h
        cmp     _fpu_type, 0
        je      end_print
        cmp     _fpu_type, 2
        je      print_287
        mov     dx, offset fp_387
        mov     ah, 9h
        int     21h
        jmp     end_print

print_486:
        cmp     _cpu_type, 4
        jne     print_unknown           ; Intel processors will have
        mov     dx, offset cp_486sx     ;   CPUID instruction
        cmp     _fpu_type, 0
        je      print_486sx
        mov     dx, offset cp_486

print_486sx:
        mov     ah, 9h
        int     21h
        jmp     end_print

print_unknown:
        mov     dx, offset cp_error
        jmp     print_486sx

print_cpuid_data:
        .486
        cmp     _intel_CPU, 1           ; check for genuine Intel
        jne     not_GenuineIntel        ;   processor

print_486_type:
        cmp     _cpu_type, 4            ; if 4, print 80486 processor
        jne     print_pentium_type
        mov     ax, word ptr _cpu_signature
        shr     ax, 4
        and     eax, 0fh                ; isolate model
        mov     dx, intel_486_0[eax*2]
        jmp     print_common

print_pentium_type:
        cmp     _cpu_type, 5            ; if 5, print Pentium processor
        jne     print_unknown_type
        mov     dx, offset pentium_msg
        jmp     print_common

print_unknown_type:
        mov     dx, offset unknown_msg  ; if neither, print unknown

print_common:
        mov     ah, 9h
        int     21h

; print family, model, and stepping

print_family:
        mov     al, _cpu_type
        ASC_MSG family_msg              ; print family msg

print_model:
        mov     ax, word ptr _cpu_signature
        shr     ax, 4
        and     al, 0fh
        ASC_MSG model_msg               ; print model msg

print_stepping:
        mov     ax, word ptr _cpu_signature
        and     al, 0fh
        ASC_MSG stepping_msg            ; print stepping msg

print_upgrade:
        mov     ax, word ptr _cpu_signature
        test    ax, 1000h               ; check for turbo upgrade
        jz      check_dp
        mov     dx, offset turbo_msg
        mov     ah, 9h
        int     21h
        jmp     print_features

check_dp:
        test    ax, 2000h               ; check for dual processor
        jz      print_features
        mov     dx, offset dp_msg
        mov     ah, 9h
        int     21h

print_features:
        mov     ax, word ptr _features_edx
        and     ax, FPU_FLAG            ; check for FPU
        jz      check_MCE
        mov     dx, offset fpu_msg
        mov     ah, 9h
        int     21h

check_MCE:
        mov     ax, word ptr _features_edx
        and     ax, MCE_FLAG            ; check for MCE
        jz      check_CMPXCHG8B
        mov     dx, offset mce_msg
        mov     ah, 9h
        int     21h

check_CMPXCHG8B:
        mov     ax, word ptr _features_edx
        and     ax, CMPXCHG8B_FLAG      ; check for CMPXCHG8B
        jz      check_VME
        mov     dx, offset cmp_msg
        mov     ah, 9h
        int     21h

check_VME:
        mov     ax, word ptr _features_edx
        and     ax, VME_FLAG            ; check for VME
        jz      check_PSE
        mov     dx, offset vme_msg
        mov     ah, 9h
        int     21h

check_PSE:
        mov     ax, word ptr _features_edx
        and     ax, PSE_FLAG            ; check for PSE
        jz      check_APIC
        mov     dx, offset pse_msg
        mov     ah, 9h
        int     21h

check_APIC:
        mov     ax, word ptr _features_edx
        and     ax, APIC_FLAG           ; check for APIC
        jz      end_print
        mov     dx, offset apic_msg
        mov     ah, 9h
        int     21h
        jmp     end_print

not_GenuineIntel:
        mov     dx, offset not_intel
        mov     ah, 9h
        int     21h

end_print:
        mov     dx, offset cr_lf
        mov     ah, 9h
        int     21h
        ret
print   endp

        end     start

Example 3. Processor Identification Procedure in the C Language

/* Filename: cpuid3b.c                                            */
/* Copyright 1994 by Intel Corp.                                  */
/*                                                                */
/* This program has been developed by Intel Corporation. You      */
/* have Intel's permission to incorporate this source code into   */
/* your product, royalty free. Intel has intellectual property    */
/* rights which it may assert if another manufacturer's processor */
/* mis-identifies itself as being "GenuineIntel" when the CPUID   */
/* instruction is executed.                                       */
/*                                                                */
/* Intel specifically disclaims all warranties, express or        */
/* implied, and all liability, including consequential and other  */
/* indirect damages, for the use of this code, including          */
/* liability for infringement of any proprietary rights, and      */
/* including the warranties of merchantability and fitness for a  */
/* particular purpose. Intel does not assume any responsibility   */
/* for any errors which may appear in this code nor any           */
/* responsibility to update it.                                   */
/*                                                                */
/* This program contains three parts:                             */
/* Part 1: Identifies CPU type in the variable _cpu_type:         */
/*                                                                */
/* Part 2: Identifies FPU type in the variable _fpu_type:         */
/*                                                                */
/* Part 3: Prints out the appropriate message.                    */
/*                                                                */
/* This program has been tested with the Microsoft C compiler.    */
/* If this code is compiled with no options specified and linked  */
/* with the cpuid3a.asm module, it correctly identifies the       */
/* current Intel 8086/8088, 80286, 80386, 80486, and Pentium(tm)  */
/* processors in the real-address mode.                           */

#include <stdio.h>

#define FPU_FLAG        0x0001
#define VME_FLAG        0x0002
#define PSE_FLAG        0x0008
#define MCE_FLAG        0x0080
#define CMPXCHG8B_FLAG  0x0100
#define APIC_FLAG       0x0200

extern char cpu_type;
extern char fpu_type;
extern char cpuid_flag;
extern char intel_CPU;
extern char vendor_id[12];
extern long cpu_signature;
extern long features_ecx;
extern long features_edx;
extern long features_ebx;

/* Implemented in the cpuid3a.asm module. */
extern void get_cpu_type(void);
extern void get_fpu_type(void);

void print(void);

int main(void)
{
    get_cpu_type();
    get_fpu_type();
    print();
    return 0;
}

void print(void)
{
    printf("This system has a");
    if (cpuid_flag == 0) {
        /* CPUID not available: use the detected processor class */
        switch (cpu_type) {
        case 0:
            printf("n 8086/8088 processor");
            if (fpu_type)
                printf(" and an 8087 math coprocessor");
            break;
        case 2:
            printf("n 80286 processor");
            if (fpu_type)
                printf(" and an 80287 math coprocessor");
            break;
        case 3:
            printf("n 80386 processor");
            if (fpu_type == 2)
                printf(" and an 80287 math coprocessor");
            else if (fpu_type)
                printf(" and an 80387 math coprocessor");
            break;
        case 4:
            if (fpu_type)
                printf("n 80486DX, 80486DX2 processor or "
                       "80487SX math coprocessor");
            else
                printf("n 80486SX processor");
            break;
        default:
            printf("n unknown processor");
        }
    } else {
        /* using cpuid instruction */
        if (intel_CPU) {
            if (cpu_type == 4) {
                /* the model field selects the Intel486 family member */
                switch ((cpu_signature >> 4) & 0xf) {
                case 0:
                case 1:
                    printf(" Genuine Intel486(TM) DX processor");
                    break;
                case 2:
                    printf(" Genuine Intel486(TM) SX processor");
                    break;
                case 3:
                    printf(" Genuine IntelDX2(TM) processor");
                    break;
                case 4:
                    printf(" Genuine Intel486(TM) processor");
                    break;
                case 5:
                    printf(" Genuine IntelSX2(TM) processor");
                    break;
                case 7:
                    printf(" Genuine Write-Back Enhanced "
                           "IntelDX2(TM) processor");
                    break;
                case 8:
                    printf(" Genuine IntelDX4(TM) processor");
                    break;
                default:
                    printf(" Genuine Intel486(TM) processor");
                }
            } else if (cpu_type == 5)
                printf(" Genuine Intel Pentium(TM) processor");
            else
                printf("n unknown Genuine Intel processor");

            /* the casts keep the argument width in step with %X */
            printf("\nProcessor Family: %X", cpu_type);
            printf("\nModel: %X", (int)((cpu_signature >> 4) & 0xf));
            printf("\nStepping: %X\n", (int)(cpu_signature & 0xf));

            if (cpu_signature & 0x1000)
                printf("\nThe processor is an OverDrive(TM) upgrade "
                       "processor");
            else if (cpu_signature & 0x2000)
                printf("\nThe processor is the upgrade processor "
                       "in a dual processor system");
            if (features_edx & FPU_FLAG)
                printf("\nThe processor contains an on-chip FPU");
            if (features_edx & MCE_FLAG)
                printf("\nThe processor supports Machine Check "
                       "Exceptions");
            if (features_edx & CMPXCHG8B_FLAG)
                printf("\nThe processor supports the CMPXCHG8B "
                       "instruction");
            if (features_edx & VME_FLAG)
                printf("\nThe processor supports Virtual Mode "
                       "Extensions");
            if (features_edx & PSE_FLAG)
                printf("\nThe processor supports Page Size "
                       "Extensions");
            if (features_edx & APIC_FLAG)
                printf("\nThe processor contains an on-chip APIC");
        } else {
            printf("t least an 80486 processor.\nIt does not "
                   "contain a Genuine Intel part and as a result, the\n"
                   "CPUID detection information cannot be determined "
                   "at this time.");
        }
    }
    printf("\n");
}

Revision History

Revision  Date   Description
-001      05/93  Original Issue.
-002      10/93  Modified Table 2. Intel486 and Pentium Processor Signatures.
-003      09/94  Updated to accommodate new processor versions. Program examples modified for ease of use, section added discussing BIOS recognition for OverDrive processors, and feature flag information updated.

AP-496
APPLICATION NOTE

Migrating from the Intel486™ SL Microprocessor to the SL Enhanced Intel486™ Microprocessor

DESMOND YUEN
MCG TECHNICAL MARKETING
October 1993

CONTENTS

INTRODUCTION
1.0 COMPARISON OF THE SL ENHANCED INTEL486™ CPU AND INTEL486 SL CPU
2.0 SYSTEM MANAGEMENT MODE IMPLEMENTATION
    2.1 System Management Interrupt
        2.1.1 General Design Considerations
            2.1.1.1 I/O Trapping
            2.1.1.2 Back-to-Back SMIs
    2.2 SMI Active (SMIACT#)
        2.2.1 General Design Considerations
    2.3 SMRAM Interfacing
        2.3.1 SMRAM Initialization
        2.3.2 General Design Considerations
            2.3.2.1 Accessing SMRAM
            2.3.2.2 Cache Coherency
            2.3.2.3 External Write Buffers
            2.3.2.4 A20M# Pin
    2.4 SMM Environment Initialization
3.0 POWER MANAGEMENT
    3.1 STPCLK# Interrupt
    3.2 Global Standby Implementation
        3.2.1 Suspend Implementation
            3.2.1.1 Dynamic Clock Switching
            3.2.1.2 Power Consumption
        3.2.2 General Design Considerations
4.0 RESET IMPLEMENTATION
    4.1 General Design Considerations
CONCLUSION
References

INTRODUCTION

Since the introduction of the Intel386™ SL microprocessor and the subsequent introduction of the Intel486™ SL microprocessor, the SL Architecture has become the de facto standard for mobile computers. Given the industry acceptance of the SL Architecture, Intel is extending the SL Architecture to the SL Enhanced Intel486 microprocessor family. This application note describes how the same features of the SL Architecture can be implemented on the SL Enhanced Intel486 CPUs.

Although this application note is written for people with experience designing Intel486 SL CPU-based mobile computers, the information provided in this document will also be useful for anyone interested in learning more about the SL Enhanced Intel486 CPUs.

The first section of this document highlights the differences between the Intel486 SL CPU and the SL Enhanced Intel486 CPU. Section two describes the architectural differences in System Management Mode (SMM). Section three discusses power management features of the Intel486 SL CPU and the SL Enhanced Intel486 CPU. Section four explains reset implementation of the Intel486 SL CPU and the SL Enhanced Intel486 CPU.
1.0 COMPARISON OF THE SL ENHANCED Intel486™ CPU AND Intel486 SL CPU

The SL Enhanced Intel486 CPU supports many of the features available in the SL Architecture. The major difference between the SL Enhanced Intel486 CPU and the Intel486 SL CPU is the level of integration. The Intel486 SL CPU is a highly integrated CPU with memory controller, ISA/PI-bus controller, and power management built into it. The SL Enhanced Intel486 CPU has retained all the SMM and power management features from the SL Architecture. Features not supported by the SL Enhanced Intel486 CPU can easily be implemented by external hardware. Table 1 highlights the differences between the SL Enhanced Intel486 CPU and the Intel486 SL CPU.

Table 1. Feature Comparison of the SL Enhanced Intel486 CPU and the Intel486 SL CPU

Features                 | SL Enhanced Intel486 CPU                     | Intel486 SL CPU
System Management Mode   | Yes                                          | Yes
SMBASE Relocation        | Yes                                          | No
Stop Clock               | Yes                                          | Yes
Upgrade Power Down Mode  | Yes                                          | No
Package Options          | 168 lead PGA, 196 lead PQFP, 208 lead SQFP   | 196 lead PQFP, 208 lead SQFP
3.3V Operation           | Yes                                          | Yes
Clocking Options         | 1X clock input or 2X clock input             | 2X clock input
CPU Frequency            | Intel486 SX CPU: 25 MHz, 33 MHz              | 25 MHz, 33 MHz
                         | Intel486 DX CPU: 33 MHz, 50 MHz              |
                         | Intel486 DX2 CPU: 40 MHz, 50 MHz, 66 MHz     |

2.0 SYSTEM MANAGEMENT MODE IMPLEMENTATION

System Management Mode (SMM), first introduced in the SL Architecture for notebook computers, provides a unique environment for software to perform power management functions much more efficiently. Since then, SMM has found its way into many new applications.

The SMM hardware interface on the SL Enhanced Intel486 CPU is similar to that of the Intel486 SL CPU except for the handshaking protocol. The SL Enhanced CPUs handshake through the SMI# and SMIACT# signals (see Figure 1), and the Intel486 SL CPUs handshake through the SMI# and SMRAMCS# signals.

Figure 1. Basic SMI# Hardware Interface (SMI# runs from the system logic to the CPU; SMIACT# runs from the CPU to the system logic)

2.1 System Management Interrupt

The system interrupts the normal program execution and invokes SMM by generating a System Management Interrupt (SMI#) to the CPU. On the Intel486 SL CPU, the SMI# input is held low as long as the CPU is in SMM. With the SL Enhanced Intel486 CPU, the SMI# input only needs to remain active for a single clock, provided the SMI# setup and hold times, t20 and t21, are met. SMI# will also work correctly if it is held active for an arbitrary number of clocks.

2.1.1 GENERAL DESIGN CONSIDERATIONS

For Intel486 SL processor-based systems, the 82360SL I/O generates the SMI request. For any SMI to be recognized by the SL Enhanced Intel486 CPU, the system logic must ensure all of the required timings are met. The following sections discuss the timing requirements that must be observed by the SMI generation logic interfacing to the SL Enhanced Intel486 CPU.

2.1.1.1 I/O Trapping

I/O trapping has proven to be very useful in the SL Architecture for device power management. Trapping the last I/O access prior to entering SMM prevents the CPU from accessing a powered-down device. With the exception of the SMFILO (System Management FILO), the SL Enhanced CPU supports the same I/O Instruction Restart option under SMM featured in the Intel486 SL CPU. The I/O Instruction Restart feature of the SL Enhanced Intel486 CPU is used the same way as the I/O Instruction Restart feature of the Intel486 SL CPU.

When the I/O Instruction Restart option is enabled (by setting offset 0FF00H in the SMRAM to 0FFH), the RSM instruction microcode modifies the restored EIP to point to the instruction immediately preceding the SMI# request, so that the I/O instruction can be re-executed. For the CPU to trap the last I/O access correctly, the external hardware must ensure the SMI# signal is asserted at least three CPU clock periods prior to asserting the RDY# signal (see Figure 2).
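The three-clock requirement above lends itself to a simple check in a simulation testbench for the SMI generation logic. The following is a minimal sketch in C, not part of the original application note; the function name and the idea of recording the CPU clock number at which each signal is asserted are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimum lead of SMI# over RDY#, in CPU clocks, as stated in the text. */
#define SMI_TO_RDY_MIN_CLOCKS 3u

/*
 * Hypothetical testbench helper (not from the application note).
 * smi_clk and rdy_clk are the CPU clock numbers at which SMI# and RDY#
 * were asserted for the I/O cycle being trapped. Returns true when SMI#
 * leads RDY# by at least three CPU clocks.
 */
bool io_trap_timing_ok(uint32_t smi_clk, uint32_t rdy_clk)
{
    if (smi_clk > rdy_clk)
        return false;               /* SMI# arrived after RDY# */
    return (rdy_clk - smi_clk) >= SMI_TO_RDY_MIN_CLOCKS;
}
```

A testbench could call this once per trapped I/O cycle and flag any cycle for which it returns false.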
Figure 2. SMI# Timing when Servicing an I/O Trap (signals shown: CLK, CLK2, SMI#, RDY#)
NOTE: A: Setup time for recognition on I/O instruction boundary

2.1.1.2 Back-to-Back SMIs

For back-to-back SMIs, the SMI# input must be held inactive for at least four clocks after it is de-asserted to reset the edge-triggered SMI detection logic. Otherwise, the second SMI# request may not be recognized (see Figure 3).

Figure 3. Back-To-Back SMI# Timing (signals shown: CLK, CLK2, SMI#)

2.2 SMI Active (SMIACT#)

A new pin called SMIACT# (SMI ACTive), which indicates that the CPU is operating in SMM, has been added to the SL Enhanced Intel486 CPU. The CPU asserts SMIACT# in response to an SMI interrupt request on the SMI# pin. SMIACT# is driven active by the CPU before accessing the SMRAM. SMIACT# remains active until the last access to SMRAM when the CPU restores (reads) its state from SMRAM. After the RSM instruction is executed, the CPU de-asserts the SMIACT# signal. The SMIACT# signal is equivalent to the SMRAMCS# signal on the Intel486 SL microprocessor except that SMIACT# on the Intel486 SL CPU is active all the time and cannot be used as a chip select signal for external SMRAM.

2.2.1 GENERAL DESIGN CONSIDERATIONS

As previously mentioned, one of the many uses of the SMIACT# output is to enable SMRAM when the CPU is operating in SMM. Most importantly, the SMIACT# output is used by the system logic to maintain system integrity while the CPU is in SMM.

On the Intel486 SL microprocessor, the SMRAM is enabled automatically whenever the CPU is switched into SMM. A similar mechanism can also be implemented by using the SMIACT# signal. Whenever the SMIACT# signal is active, the SMRAM will be enabled by the system logic. If part of the system memory is overlaid by the SMRAM while the CPU is in SMM, the system logic should ensure that only the CPU and SMI handler have access to the SMRAM area. Accesses to the overlaid addresses by a bus master or DMA controller when SMIACT# is active should be re-directed to the system memory underneath the SMRAM and not the SMRAM itself.

While inside SMM, the CPU should be protected from system activities such as CPU RESET, interrupt requests, and NMI. The SMIACT# signal can be used by the system logic to block off these system activities while the CPU is in SMM.

2.3 SMRAM Interfacing

SMRAM resides in a unique address space so that the software operating under SMM is transparent to the normal address space. On the Intel486 SL microprocessor, the size of SMRAM can be either 32 Kbytes or 64 Kbytes. Depending on the size of the SMRAM, the SMRAM area can be located in either 38000H-3FFFFH or 30000H-3FFFFH. On the SL Enhanced Intel486 CPU, the size of the SMRAM can be between 32 Kbytes and 4 Gbytes. The location of the SMRAM is determined by the SMBASE (SMRAM BASE ADDRESS) register, and defaults to SMBASE + 8000H, which is 38000H after CPU RESET. The first SMI after a CPU RESET always begins executing instructions at 38000H.

2.3.1 SMRAM INITIALIZATION

The SL Enhanced Intel486 CPU family provides a new control register, SMBASE, for changing the SMRAM base address (see Figure 4). The SMRAM base address can be changed after CPU RESET by invoking a dummy SMI call to change the SMBASE register.

Figure 4. SMBASE Register (bits 31-0: SMRAM Base Address; register offset 7EF8H)

In the SL Enhanced Intel486 CPU, a new slot is added to the CPU dump area inside the SMRAM at offset 7EF8H for changing the SMRAM base address. During the execution of the RSM instruction, if the relocation bit is set, the CPU will read this slot and initialize the CPU to use the new SMBASE during the next SMI. From then on, the CPU will do its context save to the new SMRAM area pointed to by the SMBASE, store the current SMBASE in the SMM Base slot (offset 7EF8H), and then execute the new jump vector based on the current SMBASE.
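The address arithmetic behind the default SMI entry point and the SMBASE slot can be made concrete with a short example. This is a minimal sketch in C rather than code from the application note; the function and macro names are hypothetical, and it assumes that the 7EF8H slot offset is measured from SMBASE + 8000H (which agrees with the 0FEF8H offset from the SMRAM base quoted elsewhere in this note).

```c
#include <stdint.h>

/* SMBASE after CPU RESET, implied by the 38000H entry point in the text. */
#define SMBASE_DEFAULT     UINT32_C(0x30000)
/* The SMI handler entry point is SMBASE + 8000H. */
#define SMI_ENTRY_OFFSET   UINT32_C(0x8000)
/* Assumption: the SMBASE slot offset 7EF8H is relative to SMBASE + 8000H. */
#define SMBASE_SLOT_OFFSET UINT32_C(0x7EF8)

/* Hypothetical helper: address of the first instruction run after an SMI. */
uint32_t smi_entry_address(uint32_t smbase)
{
    return smbase + SMI_ENTRY_OFFSET;
}

/* Hypothetical helper: address of the SMBASE slot in the CPU dump area. */
uint32_t smbase_slot_address(uint32_t smbase)
{
    return smi_entry_address(smbase) + SMBASE_SLOT_OFFSET;
}
```

With the default SMBASE, the entry point works out to 38000H and the slot to 3FEF8H; after relocation, both simply follow the new SMBASE.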
SMBASE must start at a 32K aligned boundary. Programming the SMBASE register to values that are not 32K aligned will cause the CPU to enter the shutdown state when executing the RSM instruction.

After the SMRAM base address is changed, the new starting address for the SMI jump vector is calculated by adding 8000H to the new SMRAM base address. The starting address for the CPU state dump area will be remapped to the new SMRAM base address plus 0FFFFH.

A new bit (bit 17) has been added to the SMM Revision Identifier on the SL Enhanced Intel486 CPU to indicate whether the processor supports relocation of the SMRAM base address. With the SL Enhanced Intel486 CPU, the SMBASE relocation bit is always set to one to indicate the processor supports SMBASE relocation.

2.3.2 GENERAL DESIGN CONSIDERATIONS

Since the memory controller is embedded inside the Intel486 SL CPU, many design issues with the SMRAM interface are handled internally by the CPU. For the SL Enhanced Intel486 CPU, these issues must be handled by the external system logic interfacing to the CPU.

2.3.2.1 Accessing SMRAM

Before the CPU can execute code inside SMM, the SMRAM must be loaded with valid SMM code. If the SMRAM is not initialized with code prior to entering SMM, executing invalid code out of the SMRAM can place the CPU in an unknown state. Thus, the external memory controller must provide a mechanism to bring the SMRAM into system address space without invoking SMM. This will allow software such as BIOS to load the SMM code into SMRAM.

The Intel486 SL CPU provides a hardware mechanism to access memory overlaid by SMRAM. Although system logic is not required to provide a mechanism to access memory located underneath SMRAM, it may be much easier to implement a suspend state (0-volt suspend or 5-volt suspend) if such a mechanism is provided.

2.3.2.2 Cache Coherency

Since the Intel486 SL CPU does not support a second level cache, cache coherency with SMRAM is handled completely inside the CPU.
The CPU's internal cache is automatically emptied before entering SMM and after exiting SMM. The SL Enhanced Intel486 CPU does not flush its cache before entering SMM or after leaving SMM.

Cache flushing is not required if the SMRAM is located in a non-cacheable area in the memory address space or in an external address space which is not visible to the system. If the SMRAM is located in a cacheable area that overlays system memory, both the CPU internal cache and any second level caches must be flushed before entering SMM. If SMRAM is cacheable, the CPU internal cache and any second level caches must also be flushed when exiting SMM.

The following steps must be taken by the system logic to maintain cache coherency when SMM overlays normal system memory:

1. Before entering SMM, the FLUSH# pin should be asserted when SMIACT# is driven active to empty the CPU cache.
2. The KEN# pin must be driven inactive to stop accesses to the SMRAM area from filling in the cache line if SMRAM is not cacheable.
3. Upon leaving SMM, if SMRAM is cacheable, the CPU cache is emptied by asserting the FLUSH# pin within one CPU CLK after the SMIACT# pin is de-asserted.

It is the responsibility of the system logic to ensure that the setup and hold times for the FLUSH# and SMIACT# signals are met.

2.3.2.3 External Write Buffers

Like the Intel486 SL processor, the SL Enhanced Intel486 CPU empties its internal write buffers before entering SMM to prevent data in the write buffers from being written to SMRAM space. If a system supports a second level cache, the second level write buffers must also be emptied before the CPU enters SMM. It is possible that the CPU is in SMM before the second level write buffers are completely emptied by the memory controller. In case the second level write buffer is not completely emptied, the SMIACT# signal can be used to direct the memory write cycles to either SMM space or memory space.
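The three cache-coherency steps listed in Section 2.3.2.2 amount to a small decision table keyed on whether the SMRAM is cacheable. The sketch below, in C, is one way to capture that table for the case where SMRAM overlays normal system memory; it is an illustration rather than code from the application note, and the structure and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical summary of what the system logic must do around SMM. */
struct smm_cache_plan {
    bool flush_on_entry;  /* step 1: assert FLUSH# when SMIACT# goes active */
    bool ken_inactive;    /* step 2: hold KEN# inactive for SMRAM accesses  */
    bool flush_on_exit;   /* step 3: assert FLUSH# after SMIACT# goes away  */
};

/*
 * Assumes SMRAM overlays normal system memory, which is the case the
 * numbered steps in the text address. smram_cacheable says whether the
 * SMRAM area is treated as cacheable.
 */
struct smm_cache_plan smm_cache_plan_for_overlay(bool smram_cacheable)
{
    struct smm_cache_plan plan;

    plan.flush_on_entry = true;              /* always flush on entry      */
    plan.ken_inactive   = !smram_cacheable;  /* block line fills if needed */
    plan.flush_on_exit  = smram_cacheable;   /* flush SMM lines on exit    */
    return plan;
}
```

For non-cacheable SMRAM this yields a flush on entry plus KEN# held inactive; for cacheable SMRAM it yields a flush on both entry and exit.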
2.3.2.4 A20M# Pin

The A20M# pin on the Intel486 CPU is provided to emulate the address wraparound at the 1 Mbyte boundary which occurs on the 8086 microprocessor (see Figure 5). The SMRAM space on the Intel486 SL CPU is always below the 1 Mbyte memory address space. Memory above 1 Mbyte can be accessed through either the ISAWINDOW register or the MCWINDOW register. The A20M# signal is automatically driven low whenever the CPU is in SMM. When A20M# is active, all external bus cycles will drive A20 low, and all internal cache accesses will be performed with A20 low.

The SL Enhanced Intel486 CPU does not provide any memory mapping mechanism to access memory above 1 Mbyte. To access memory above 1 Mbyte inside SMM, the software has to disable A20M# manually through the keyboard controller. Also, if the SMRAM is located above 1 Mbyte and A20M# is not de-asserted (that is, A20 is not enabled) before entering SMM, the system will crash. In this case, the CPU will attempt to access SMRAM at the relocated address with A20 low, and will thus fetch invalid code. For these reasons, A20M# should be driven inactive prior to entering SMM and remain inactive as long as the CPU is in SMM. This can be accomplished by blocking the assertion of A20M# whenever SMIACT# is active. The state of A20M# should be saved upon entry to SMM and restored to its original state after leaving SMM.

2.4 SMM Environment Initialization

When the CPU is running in SMM, the processor is in a pseudo "real mode" environment, but without the 64 Kbyte limit. After the SMRAM base address register has been relocated, the CPU segment registers will have the values shown in Table 2 when an SMI# occurs. The CS selector register still contains the value 3000H, not the value corresponding to the new SMBASE. The remaining segment registers are still initialized to zero.
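The effect of A20M# described in Section 2.3.2.4 is simply that address bit 20 is forced low on every access while the pin is active. The following C sketch, which is illustrative only and uses a hypothetical function name, shows why an SMRAM relocated above 1 Mbyte cannot be reached while A20M# is asserted.

```c
#include <stdbool.h>
#include <stdint.h>

#define A20_BIT (UINT32_C(1) << 20)   /* address line A20 */

/*
 * Hypothetical model of the address seen on the bus: when A20M# is
 * active, A20 is driven low regardless of the address the CPU computed.
 */
uint32_t bus_address(uint32_t cpu_address, bool a20m_active)
{
    return a20m_active ? (cpu_address & ~A20_BIT) : cpu_address;
}
```

For example, with SMBASE relocated to 100000H the SMI entry point is 108000H; with A20M# active the fetch goes to 008000H instead, which is ordinary system memory rather than SMRAM.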
If the SMRAM base address has not been relocated, the segment registers can be initialized in the same way as with the Intel486 SL processor, i.e., using the CS register, which defaults to 3000H. Otherwise, the segment registers must be initialized correctly to point to the new SMRAM memory space.

Figure 5. A20M# Interface Logic (blocks shown: A20 gate, control logic, CPU)

Normally, the data segment registers are initialized to point to the SMRAM base address. Upon entering SMM, the CS BASE segment register is initialized to point to the SMRAM base address. The location of the SMRAM base address can be determined by reading the SMBASE register in the SMRAM at offset 0FEF8H (the location of the SMRAM base address can also be stored by the BIOS in another memory location such as CMOS RAM, from which it can be retrieved by the SMM program).

The SMBASE contains a 32-bit address and has to be shifted to the right by four bits to generate a 16-bit segment address before it can be placed in the data segment selector registers. The CS selector register cannot be initialized by writing directly to it. It has to be initialized by executing a far jump instruction to an address within the SMRAM to force the CS selector register to point to the SMRAM base address.

When the CPU is in SMM, the operand size and the address size are still 16 bits but there are no limits to segment size. The physical address of an instruction is obtained by adding the value in the CS segment base register to the value in the EIP register, rather than the IP register. To access data anywhere within the four Gbyte logical address space, operand-size override (opcode 66H) and address-size override (opcode 67H) prefixes can be used as needed. Alternatively, SMRAM data located above 1 Mbyte can also be accessed by using 32-bit displacement registers.

Table 2. Register Values after SMI#

Segment Register | Selector | Base   | Limit
CS               | 3000H    | SMBASE | 4 Gbytes
DS               | 0H       | 0H     | 4 Gbytes
ES               | 0H       | 0H     | 4 Gbytes
FS               | 0H       | 0H     | 4 Gbytes
GS               | 0H       | 0H     | 4 Gbytes
SS               | 0H       | 0H     | 4 Gbytes

3.0 POWER MANAGEMENT

One of the most important power management features on the Intel486 SL processor is CPU clock control. The clock control scheme on the SL Enhanced Intel486 CPU is similar to the Intel486 SL CPU, with the Intel486 SL CPU being driven by a 2X clock, and the SL Enhanced Intel486 CPU available with both the 1X and 2X clocking options.

The 2X clock input is twice the internal frequency of the CPU, whereas the 1X clock frequency is the same as the internal frequency of the CPU. With the 1X clock, the two internal clock phases, "phase one" and "phase two", are generated by an internal Phase Lock Loop (PLL). The CPU clock input for a 1X clock cannot be changed dynamically because the PLL requires a constant frequency CLK input (to within 0.1%).

3.1 STPCLK# Interrupt

As with the Intel486 SL CPU, the SL Enhanced Intel486 CPU provides an interrupt mechanism, STPCLK#, which allows system hardware to control the power consumption of the CPU by stopping the internal clock to the CPU. Unlike the normal interrupts, INTR and NMI, the STPCLK# interrupt does not initiate interrupt acknowledge cycles or interrupt table reads.

The Stop Clock feature on the SL Enhanced Intel486 CPU has been improved, allowing the input to STPCLK# to be driven asynchronously as well as synchronously. The major difference between asynchronous and synchronous control is that the STPCLK# interrupt latency is much shorter with asynchronous control.

With the Intel486 SL CPU, the STPCLK# input is controlled asynchronously through software. STPCLK# is asserted after doing a dummy I/O read to the STPCLK register in the 82360SL or executing an HLT instruction. The STPCLK# signal will remain asserted until a system event wakes up the CPU. If the STPCLK# input is driven asynchronously, both setup and hold times t20 and t21 must be met for the STPCLK# interrupt request to be recognized.
After a STPCLK # interrupt request is recognized by the CPU, the processor will stop execution on the next instruction boundary, stop the pre-fetch unit, and then empty all internal pipelines and the write buffers. Finally, a special Stop Grant bus cycle is generated. The pin state during a Stop Grant cycle is shown in Table 3. The interrupt acknowledge cycle is terminated when the system logic returns RDY # or BRDY #. At this point the CPU is in the Stop Grant state and the internal clock is stopped. The Stop Grant cycle is similar to the HALT cycle except that the address bus has the value 04H instead of DOH. I AP-496 Table 3. Pin State During Stop Grant Cycle Signals State MIIO# 0 D/C# 0 W/R# 1 Address Bus 0000 0010H (A4 = 1) BE3#-BEO# 1011 (same as HALT) Data bus Floated Using the STPCLK # input, the SL Enhanced Intel486 CPU can be put into low power states similar to the Global Standby and Suspend states as with the Intel486 SL microprocessor. 3.2 Global Standby Implementation In an Intel486 SL processor·based system, the 82360SL puts the CPU in a low power standby state (CPU Icc - 20 rnA-50 rnA) when the system is in Global Stand· by. A similar state called Stop Grant State is provided by the SL Enhanced Intel486 CPU. The Stop Grant state can be entered by simply asserting the external STPCLK # interrupt pin. Once the STPCLK # inter· rupt is acknowledged by the CPU (i.e., after the Stop. Grant cycle is placed on the bus), the CPU is in the Stop Grant state. While in the Stop Grant state, the CPU still responds to RESET or SRESET and requests a cache invalidation (i.e., HOLD, AHOLD, BOFF# and EADS#). How· ever, the CPU does not recognize any other inputs while in the Stop Grant state. Input signals to the CPU will not be recognized until one CPU clock cycle after STPCLK # is deasserted. Stop ~1 Resta To emulate Global Standby. 
the stop clock control logic must be able to de-assert the STPCLK# signal whenever there is system activity (i.e., INTR, IRQ, NMI, and SMI#). Typically, the stop clock de-assertion logic is implemented by logic which latches the incoming interrupt requests from the system. The CPU returns to its normal state within 10-20 CPU clock cycles after exiting the Stop Grant state.

As mentioned before, the CPU does not recognize any interrupt request while the STPCLK# input is active. To prevent the interrupt request from getting lost, the interrupt request logic must ensure the interrupt signal is held active for at least one CPU clock after the STPCLK# input is de-asserted.

3.2.1 SUSPEND IMPLEMENTATION

From the Stop Grant state, the CPU can go into a lower power state similar to the suspend state offered in the Intel486 SL processor. After the CPU is in the Stop Grant state, the CPU can enter the lowest power Stop Clock state (~100 µA-200 µA) by stopping the CPU clock input. The CPU clock input can be driven to either logic high or logic low during the Stop Clock state. The CPU will not generate any acknowledge cycle when entering the Stop Clock state.

For a 2X clock input, the clock input to CLK2 can be stopped on either a logic high or logic low independent of the clock phase. The CPU will go back to the Stop Grant state as soon as the CPU clock is re-started. Upon exit from the Stop Clock state, the CPU clock input must be re-started in the same state in which it was stopped (see Figure 6).

Figure 6. CLK2 Phase Coherence in CLK2 Stop and Restart

For a CPU with a 1X clock input, the CPU clock can be stopped in the same manner as a CPU with a 2X clock input. Because of the phase lock loop, the CPU will not return to the Stop Grant state right after the CPU clock input has been re-started.
To allow time for the PLL to stabilize, the CPU clock input must be held at a constant frequency for a period of time equal to the PLL startup latency (as specified in the data book) before the CPU will return to the Stop Grant state.

As long as the CPU clock input is stopped, the system logic must keep all the CPU input signals in the same state they were in before the clock was stopped. Any change in state on an input signal (except for INTR) before the CPU has returned to the Stop Grant state will result in unpredictable behavior. The CPU will not be able to recognize any interrupt request while the CPU clock is stopped.

3.2.1.1 Dynamic Clock Switching

For a CPU with a 1X clock input, the CPU clock cannot be changed "on-the-fly". For power management as well as implementation of features such as deturbo mode, it is advantageous to run the CPU clock at a lower frequency. This can be accomplished by putting the CPU into the Stop Clock state and changing the CPU clock to a lower speed. After the CPU clock input frequency is changed, the clock control logic must ensure that the clock input has been running at a constant frequency for the time period necessary for the PLL to stabilize before de-asserting the STPCLK# signal. The lowest CPU clock rate for a 1X part is 8 MHz (see Figure 7).

3.2.1.2 Power Consumption

The Stop Grant and Stop Clock states are designed to save power. While the processor is in the Stop Grant state, each input/output signal on the CPU either remains in the state it was in on entering the Stop Grant state, is floated (data and parity signals), or is driven to a different state. If some of the signals are driven improperly, the system can end up consuming more power. To achieve the lowest power consumption, all possible current leakage must be eliminated. The system logic should never drive input signals with pull-up resistors LOW or input signals with pull-down resistors HIGH.
While in the Stop Grant state, the pull-up resistors on STPCLK# and UP# are disabled internally. The system must continue to drive these inputs to the state they were in immediately before the CPU entered the Stop Grant state. For minimum CPU power consumption, all other input pins should be driven to their inactive level while the CPU is in the Stop Grant state.

Figure 7. Clock Switching on 1X Clock Input

3.2.2 GENERAL DESIGN CONSIDERATIONS

The STPCLK# input is an asynchronous signal. The system can crash if the interface to STPCLK# is not designed properly. Special care must be taken to ensure that all the timing requirements are met and the proper protocol is used. Listed below are some considerations for designing the STPCLK# interface.

• The CPU cannot empty the write buffer during an HLDA cycle. Therefore, the CPU will not acknowledge any STPCLK# request during an HLDA cycle.
• After STPCLK# is asserted, the CPU does not generate a Stop Grant cycle until it completes the current instruction. The latency between a STPCLK# request and the Stop Grant bus cycle depends on the current instruction, the amount of data in the CPU write buffers, and the system memory performance.
• The CPU will not enter the Stop Grant state until either RDY# or BRDY# has been returned.
• In response to HOLD being driven active during the Stop Grant state (when the CLK input is running), the CPU will generate HLDA and three-state all output and input/output signals that are three-stated during the HOLD/HLDA state. After HOLD is de-asserted, all signals will return to the state they were in prior to the HOLD/HLDA sequence.
• When the CPU enters the Stop Grant state, the internal pull-up resistor is disabled so that the CPU power consumption is reduced. The STPCLK# input must be driven high (not floated) in order to exit the Stop Grant state.
• It is the responsibility of the system designer to ensure that the CPU is in the correct state prior to asserting cache invalidation or interrupt signals to the CPU.

4.0 RESET IMPLEMENTATION

On a standard PC, the CPU can be reset by either hardware or software. On the SL Enhanced Intel486 CPU, asserting the RESET input to the CPU will also set the SMBASE register to the default value of 30000H. In other words, the SMRAM base address will reset to 30000H whenever the operating system asserts the CPU RESET signal.

For some older software, a CPU RESET is generated by the software to return the CPU to real mode from protected mode. Normally, this is not a problem if SMBASE relocation is not used. If the SMRAM base address has been relocated, the CPU could then execute invalid SMM code from the default address.

The SRESET pin has the same functions as RESET except that it does not reset the SMBASE register. For a system which uses SMBASE relocation, the logic which generates the software CPU RESET must be tied to the SRESET input and not the RESET input on the CPU. All hardware resets should be implemented through the RESET pin (see Figure 8), and all software resets should be implemented through the SRESET pin.

While inside SMM, the CPU should be protected from being reset by a software CPU RESET. SRESET should be blocked whenever SMIACT# is active. Any request for a CPU RESET when the CPU is in SMM should be latched so it can be serviced after the CPU exits SMM. To ensure the execution of the RSM instruction does not get interrupted by SRESET, SRESET must be blocked until at least 20 CPU clock cycles after SMIACT# has been driven inactive.

Figure 8. SRESET Interface Logic

4.1 General Design Considerations

The system designer should consider the following restrictions while implementing the CPU Reset logic:

• For SRESET to be recognized by the CPU, the SRESET input must be driven active (high) for at least 15 CPU clock cycles.
• SRESET is not intended to be used for flushing the on-chip cache. For compatibility with future generation Intel CPUs, the SRESET input pin should not be used for flushing the on-chip cache.
• Hardware resets should not be blocked when the CPU is in SMM so that the system can recover from a fatal system failure.

CONCLUSION

While taking advantage of the benefits of the SL Architecture, the SL Enhanced Intel486 CPU family provides a whole new world of opportunity for system designers to develop innovative, energy-efficient mobile and desktop designs. The SL Enhanced Intel486 microprocessor family combines power management, compatibility and performance, allowing system designers to build a wide variety of machines to meet the diverse needs of a broad range of users.

References

Intel's SL Architecture: Designing Portable Applications, 1993, McGraw-Hill.
Intel486 Microprocessor Family Data Book Addendum: SL Enhanced Intel486 Microprocessor Family, 1993, Intel Corporation.

AP-497 APPLICATION NOTE

Managing Power with the SL Enhanced Intel486™ Microprocessor

CHENG XIE
MCG TECHNICAL MARKETING
October 1993

CONTENTS

1.0 INTRODUCTION
2.0 POWER MANAGEMENT FEATURES
2.1 System Management Mode
2.2 Flexible Clock Control Options
2.3 Different Low Power States
2.3.1 Power States in 1X Mode
2.3.2 Transition of Power States and Latency Associated with 1X Mode
2.3.3 Power States in 2X Mode
2.3.4 Transition of Power States and Latency Associated with 2X Mode
3.0 IMPLEMENTATION OF POWER MANAGEMENT FEATURES
3.1 CPU Power Control
3.2 Controlling Power of Peripheral Components
3.3 Suspend and Resume
4.0 SUMMARY

1.0 INTRODUCTION

Intel's System Management Mode (SMM), introduced as part of the Intel SL technology, has become an industry standard for portable computing. Through the utilization of SMM, system designers have a new method of adding software controlled features that operate transparently to the operating system and applications software. In portable computer systems, SMM is often used to implement power control on various system components to reduce power consumption. Flexible clock control has also become essential to the design of power-saving computers. The new SL Enhanced Intel486™ microprocessor family incorporates all of the best power management features which first appeared in Intel SL technology, bringing Intel486 CPU performance to portable computers and energy-efficient desktop systems.

The purpose of this application note is to provide system designers with a better understanding of the theory and implementation of power management with the new SL Enhanced Intel486 CPUs.

2.0 POWER MANAGEMENT FEATURES

• System Management Mode: This mode is composed of a special purpose interrupt that serves as the hardware interface and a secure memory address space that stores processor status and special software routines. It can be used to implement power management for portable systems.
• Flexible Clocking Options: The clock input to the CPU can be either a 1X clock or a 2X clock. For the 1X clock option, the internal clock of the CPU runs at the same speed as the input clock (CLK input). In the 2X clock case, the clock input (CLK2 input) needs to be twice the frequency of the internal clock.
• Different Low Power States: Different low power processor states are available for various operating conditions. This feature reduces processor power consumption without sacrificing performance.
• Low Voltage Power Supply Option: The SL Enhanced Intel486 CPUs can be powered by either a 5V or a 3.3V supply, with the 3.3V supply providing a 50% power savings.

2.1 System Management Mode

The System Management Interrupt (SMI#) input pin on the processor provides the hardware interface for the computer system to invoke SMM. An exclusive memory address space, SMRAM, is only available for the CPU to access while in SMM. The size of SMRAM can be between 32 Kbytes and 4 Gbytes. It is used to store processor state and SMI handlers. SMI handlers are special software routines that can be designed to control the power states of system components. Intel's SMM has a special instruction, RSM, that is responsible for exiting SMM. When executed, the RSM instruction restores the processor state from SMRAM and returns control to the application that was interrupted by SMI#.

The servicing of SMI# is different from that of a regular interrupt. The system invokes SMM by generating an SMI# to the SL Enhanced Intel486 CPU. Normal program execution will be interrupted as a result and the CPU will respond to the interrupt by asserting SMIACT#. This signal is used by the system to enable SMRAM space. The CPU will then save its state into the SMRAM area, starting at address location 3FFFFH and proceeding downward in a stack-like fashion. After completing a state save, the CPU will be in a pseudo real-mode processor environment. The microcode will then direct the CPU to begin executing instructions at the absolute address of 38000H (SMRAM), at which point it will begin executing SMI handlers. SMI handlers can perform various system management functions, including system power control. The last instruction in an SMI handler is always the RSM instruction. This instruction will restore the CPU state from SMRAM, de-assert SMIACT# and return control of the system to the interrupted program.

The generation of an SMI# to the CPU can be initiated
by hardware or software for the purpose of power management. The actual implementation depends on the specific chipset used and how the system is designed. Hardware can generate SMI# by pulling the SMI# pin low directly or through other chipset pins, for example battery level control. When the charge level of the main battery falls below a certain limit, the chipset monitoring the battery level can interrupt normal program operation by pulling the SMI# pin low. While in SMM, the CPU can execute certain power-down SMI handlers to put the entire computer system in a low-power mode that can operate out of a different battery source. This will enable the system to maintain its current status while allowing the main battery to be changed.

SMI# can also be initiated through software. Different chipsets have various ways of interfacing to the CPU in this respect. Most of them have dedicated timers to detect an idle device. Once these timers are enabled, the timeout will automatically generate an SMI#.

Exiting SMM is accomplished by the execution of the RSM instruction. Besides restoring the CPU state, the RSM instruction can also perform three other functions. The first is called "Auto HALT Restart". The System Management Interrupt request can interrupt the HALT instruction. By setting the appropriate control slot in the SMRAM space, the RSM instruction can return control to the HALT instruction or the instruction immediately following the HALT instruction. The second special feature is "I/O Trap Restart". If an SMI# interrupt is generated on an I/O access, the RSM instruction will re-execute that I/O instruction if its I/O Trap Restart slot in the SMRAM is set. The third function is the relocation of SMBASE, the starting address of SMRAM. This provides system designers with the flexibility of placing SMRAM space into an area in which its integrity is best ensured.
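The address arithmetic behind SMBASE relocation can be sketched as follows. This is a minimal illustration in Python, not a description of the CPU microcode. It assumes the offsets implied by the default values quoted in this note (SMBASE of 30000H, handler entry at 38000H, state save starting at 3FFFFH and growing downward) also apply after relocation, and it applies this note's requirement that a relocated SMRAM area be 32 Kbyte aligned. The function and constant names are hypothetical.

```python
DEFAULT_SMBASE = 0x30000        # SMBASE value after RESET, per this note
ENTRY_OFFSET = 0x8000           # assumption: 38000H - 30000H
STATE_SAVE_TOP_OFFSET = 0xFFFF  # assumption: 3FFFFH - 30000H
ALIGNMENT = 32 * 1024           # relocation granularity stated in this note


def smram_layout(smbase=DEFAULT_SMBASE):
    """Return the handler entry point and top of the state save area."""
    if smbase % ALIGNMENT != 0:
        raise ValueError("SMBASE must be 32 Kbyte aligned")
    return {
        "entry": smbase + ENTRY_OFFSET,
        "state_save_top": smbase + STATE_SAVE_TOP_OFFSET,
    }


if __name__ == "__main__":
    layout = smram_layout()
    print(f"entry={layout['entry']:05X}H top={layout['state_save_top']:05X}H")
```

With the default SMBASE the sketch reproduces the 38000H entry point and the 3FFFFH state save start; passing a relocated base shifts both by the same amount.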
The starting address is controlled by the SMBASE register, which has a power-on reset value of 30000H. The default SMRAM area starts at 38000H and ends at 3FFFFH. The SMRAM can be relocated to any 32K aligned area, either overlaid on top of the normal system address space or placed in a distinct address space.

2.2 Flexible Clock Control Options

The standard Intel486 SX and Intel486 DX CPUs are driven by a 1X clock, as opposed to the Intel386 CPUs which use a 2X clock input. The SL Enhanced Intel486 CPUs are available with either the 1X or the 2X clocking option.

The 1X clock allows simpler system design by cutting in half the clock speed required in the external system. The 1X clock relies on an internal Phase Lock Loop (PLL) to generate the two internal clock phases, "phase one" (ph1) and "phase two" (ph2). The rising edge of the CLK input corresponds to the start of ph1. All external timings are specified with respect to the rising edge of the CLK input. The PLL requires a constant frequency CLK input (to within 0.1%), and therefore the CLK input cannot be changed dynamically.

The 1X CLK input option is essential for those processors with an on-chip clock doubler. The 1X CLK input provides the fundamental timing reference for the bus interface unit. The CLK input is doubled internally so that the CPU core will operate at twice the CLK input frequency, and hence twice the bus frequency. The internal clock doubler speeds up all operations that use the internal cache and/or are not blocked by external bus accesses. This mode also uses the PLL and therefore the CPU CLK input must be maintained at a constant frequency.

The SL Enhanced Intel486 CPUs also offer a 2X clock option for systems that rely on dynamic frequency scaling for CPU power management. The frequency of the CLK2 input is twice the internal frequency of the CPU. The internal clock is comprised of two phases, "PH1" and "PH2". Each CLK2 period is a phase of the internal clock.
All timings are referenced to the rising edge of the PH1 of the CLK2 input. It is therefore important to synchronize the external circuitry with the PH1 of the CLK2 input. Because the 2X clock option does not rely on the PLL to generate the internal phase clocks, the frequency of the CLK2 input can be changed dynamically or "on-the-fly".

2.3 Different Low Power States

2.3.1 POWER STATES IN 1X MODE

Several low power modes are available on the 1X microprocessors. These modes make it possible for a power-sensitive computer system to optimize power conservation. Some of the CPU power controls are realized through a specially provided interrupt mechanism. Each of the following low power states is described in detail.

• Auto Idle Power Down State
This low power state is available in normal operation for the DX2 CPUs. DX2 CPUs have an internal clock doubler which doubles the 1X clock input and therefore enables the CPU core to operate at twice the speed of the input clock. When the SL Enhanced Intel486 CPU is known to be truly idle and waiting for a ready from a memory or an I/O bus cycle read or write, the DX2 CPU will reduce its core clock rate to 1X from the doubled DX2 clock. In this state, the CPU only consumes half of the normal power. More importantly, this function is transparent to software and external hardware and therefore does not cause any performance degradation.

• Stop Grant State
The Stop Grant state is initiated by simply asserting the external STPCLK# interrupt pin. The Stop Grant state is used to transition to the Stop Clock state. The CPU enters the Stop Grant state through the following steps: When the CPU recognizes the STPCLK# request, it will stop the execution of the normal program on the next instruction boundary, stop the pre-fetch unit, empty all internal pipelines and write buffers, generate a Stop Grant bus cycle and then stop the internal clock. This state is exited when the STPCLK# pin is pulled high.
The rising edge of STPCLK# will tell the CPU to return control to the interrupted program and start to execute the instruction following the last executed instruction of the interrupted program.

• Stop Clock State
The Stop Clock state is entered from the Stop Grant state by completely stopping the clock input to the PLL. In this state, the CPU consumes only ~100 µA-200 µA of current.

• Auto HALT Power Down State
When the HALT instruction is executed, the CPU will issue a normal HALT bus cycle. The SL Enhanced Intel486 CPU will automatically stop the internal CPU clock, therefore causing the CPU to enter a low power state with a current of ~20 mA-55 mA.

• Stop Clock Snoop State
Cache snooping is necessary during the Stop Grant and Auto HALT Power Down states in order to maintain memory coherency. When the system issues a request for cache snooping, the CPU will transparently enter the Stop Clock Snoop state and will power up for 1 full core clock to complete the cache snoop cycle. It will then re-freeze the clock to the CPU core and return to either the Stop Grant or the Auto HALT Power Down state.

2.3.2 Transition of Power States and Latency Associated with 1X Mode

It is important to understand how the different power states are interrelated and how one transitions to another. The latency is different for different state transitions. Figure 1 depicts which transitions are allowed. We shall describe how the transitions are made and how much latency is associated with each transition.

1. The Auto Idle Power Down state is entered whenever the CPU is detected idle and waiting for a RDY# from either a memory or I/O read/write. This state only applies to SL Enhanced Intel486 DX2 CPUs. Both the internal CPU core clock frequency and the supply current drop to half of their values at the doubled frequency. There is no latency associated with this transition. The CPU will go back to normal operation when a RDY# is detected.
2.
The Stop Grant state is entered when the STPCLK# interrupt is asserted to the CPU by the system. In this state, the clock output of the PLL (or the clock input to the internal core) is stopped. The speed of the clock input to the PLL can be maintained or changed. If the clock frequency is changed, the CPU requires the clock to be held at a constant frequency for a minimum of 1 ms before de-asserting STPCLK#. The 1 ms time period is necessary so that the PLL can stabilize the input clock period. De-asserting STPCLK# returns the CPU to normal operation. The CPU will also return to its normal state when RESET or SRESET is asserted.
3. The Stop Clock state can only be entered from the Stop Grant state when the clock input to the PLL is stopped. The CPU must go through the Stop Grant state to return to normal operation. Before the CPU can return to the Stop Grant state, the clock input has to stabilize for the 1 ms required by the PLL.
4. The Auto HALT Power Down state is entered when the HALT instruction is executed. The clock input to the internal CPU core is automatically stopped upon the execution of the HALT instruction. The clock input to the PLL cannot be changed during this state. This state is exited back to normal operation whenever any of the following events happen: INTR, NMI, SMI#, RESET or SRESET. There is no latency associated with this state transition.

Figure 1. Power State Transitions for 1X CPUs

5. When the CPU is in the Auto HALT Power Down state, the system can generate STPCLK# to the CPU to bring the CPU into the Stop Grant state. When the system de-asserts the STPCLK# request, the CPU will return to the Auto HALT Power Down state. There is no latency associated with this transition. HALT bus cycles will be launched whenever this transition occurs.
6, 7. Cache snooping can be performed when the CPU is either in the Stop Grant state or in the Auto HALT Power Down state. Cache snoop cycles begin when the CPU receives an EADS# from the system.
The CPU will only wake up for 1 complete core clock to perform cache invalidation and then re-freeze the clock, i.e., return either to the Auto HALT Power Down state or to the Stop Grant state.

2.3.3 POWER STATES IN 2X MODE

There are five operating modes for the 2X SL Enhanced Intel486 CPU. The CPU power controls are realized through a specially provided interrupt mechanism, STPCLK#. In the following, we shall describe each of the power states in detail.

• Normal State
2X CPUs do not have a PLL, and therefore do not require a stabilized clock period. This means the frequency of the input clock (CLK2) can be changed dynamically. Depending on the level of activity, CPUs do not always have to operate at full speed. Reducing the CPU speed saves power.

• Stop Grant State
The Stop Grant state is initiated by simply asserting the external STPCLK# interrupt pin. The Stop Grant state is used to transition to the Stop Clock state. The CPU enters the Stop Grant state through the following steps: when the CPU recognizes the STPCLK# request, it will stop the execution of the normal program on the next instruction boundary, stop the pre-fetch unit, empty all internal pipelines and write buffers, generate a Stop Grant bus cycle and then stop the internal clock. This state is exited when the STPCLK# pin is pulled high. The rising edge of STPCLK# will tell the CPU to return control to the interrupted program and start to execute the instruction following the interrupted instruction.

• Stop Clock State
The Stop Clock state is entered from the Stop Grant state by completely stopping the clock input (CLK2). In this state, the CPU consumes only ~100 µA-200 µA of current.

• HALT State
When the HALT instruction is executed, the CPU will issue a normal HALT bus cycle. For 2X CPUs, there is no power savings in this state.

• Stop Clock Snoop State
Cache snooping is necessary during the Stop Grant state in order to maintain memory coherency. When the system issues a request for cache snooping, the CPU will transparently enter the Stop Clock Snoop state and will power up for 1 full core clock to complete the cache snoop cycle.

2.3.4 TRANSITION OF POWER STATES AND LATENCY ASSOCIATED WITH 2X MODE

Figure 2 shows the state transitions of a 2X microprocessor. We shall describe how the transitions are made and how much latency is associated with each transition.

Figure 2. Power State Transitions for 2X CPUs

1. The Stop Grant state is entered when the STPCLK# interrupt is asserted to the CPU by the system. In this state, the speed of the external clock input (CLK2) can be maintained or changed. There is no latency associated with changing the CLK2 frequency. De-asserting STPCLK# returns the CPU to normal operation. The CPU will also return to its normal state when RESET or SRESET is asserted.
2. The Stop Clock state is entered from the Stop Grant state when the clock input (CLK2) is stopped. The CPU must go through the Stop Grant state to return to normal operation. There is no additional delay associated with returning to normal operation.
3. The HALT state is entered when the HALT instruction is executed. The HALT state consumes the same power as the normal state. The HALT state is exited back to normal operation whenever any of the following events happen: INTR, NMI, SMI#, RESET or SRESET. There is no latency associated with this state transition.
4. When the CPU is in the HALT state, the system can generate STPCLK# to the CPU to bring the CPU into the Stop Grant state. When the system de-asserts the STPCLK# request, the CPU will return to the HALT state. There is no latency associated with this transition. HALT bus cycles will be launched whenever this transition occurs.
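The four numbered transitions for 2X CPUs can be captured in a small state machine, which may be useful when checking stop clock control logic against the rules in this section. The following Python sketch is a simplified model only. The class name and event names are hypothetical, the transparent Stop Clock Snoop state is omitted, RESET and SRESET are folded into one event, and events that this note does not list for a given state are simply ignored, which is a modeling choice rather than documented CPU behavior.

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    HALT = "halt"
    STOP_GRANT = "stop_grant"
    STOP_CLOCK = "stop_clock"


class PowerModel2X:
    """Toy model of the 2X CPU power state transitions (items 1-4)."""

    def __init__(self):
        self.state = State.NORMAL
        self._resume = State.NORMAL  # state to return to after Stop Grant

    def event(self, name):
        s = self.state
        if name == "RESET" and s in (State.HALT, State.STOP_GRANT):
            self.state = State.NORMAL            # RESET or SRESET
        elif name == "STPCLK_ASSERT" and s in (State.NORMAL, State.HALT):
            self._resume = s                     # items 1 and 4
            self.state = State.STOP_GRANT
        elif name == "STPCLK_DEASSERT" and s is State.STOP_GRANT:
            self.state = self._resume            # back to NORMAL or HALT
        elif name == "CLK2_STOP" and s is State.STOP_GRANT:
            self.state = State.STOP_CLOCK        # item 2
        elif name == "CLK2_RESTART" and s is State.STOP_CLOCK:
            self.state = State.STOP_GRANT        # must pass through Stop Grant
        elif name == "HLT" and s is State.NORMAL:
            self.state = State.HALT              # item 3
        elif name in ("INTR", "NMI", "SMI") and s is State.HALT:
            self.state = State.NORMAL
        return self.state
```

For example, asserting STPCLK# from the HALT state and then de-asserting it leaves the model in HALT rather than NORMAL, matching item 4, and de-asserting STPCLK# while CLK2 is stopped has no effect until the clock is restarted.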
3.0 IMPLEMENTATION OF POWER MANAGEMENT FEATURES

The means of implementing power management features depend on the specific chipset used. Most chipsets have both hardware and software power management. There are always a number of dedicated or user-definable timers that monitor the activity of certain devices, such as the CPU or peripheral components. In a software approach, the timeout of these timers can trigger a System Management Interrupt. Upon the detection of an SMI, the CPU will execute the power management SMI handlers in the BIOS, which exercise device power control through detailed programming. For a hardware-based approach, the timers can automatically be enabled to perform power controls. If the status of the specific device or the entire system needs to be saved before changing its power state, the software approach must be used. In other words, if the original status of a device or the entire system is required to return the system to its operational state before the power state change, SMM must be invoked and power control will be accomplished by the SMI handlers.

This section summarizes several power control schemes that are common conceptually to all major chipsets and explains how they interact with the power management features offered by the SL Enhanced Intel486 CPUs.

3.1 CPU Power Control

Controlling SL Enhanced Intel486 CPU power is achieved by reducing the CPU clock speed, stopping the CPU clock or shutting off the CPU power. All chipsets have the option of pre-programming the CPU speed regardless of the level of system activity; the CPU clock speed can be divided down by a factor of 2 to 64. Once the CPU is selected to run in a reduced speed mode and the CPU clock division is selected, the CPU will always run at a divided speed until the CPU is switched into some other mode of operation. Speed reduction is done by the BIOS through programming certain register bits immediately after booting.
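As a quick worked example of the clock division described above, the sketch below computes the resulting CPU clock for a chosen divisor. It is illustrative only: the function name is hypothetical, the divisor is assumed to be any integer from 2 to 64 (individual chipsets may support only some of these values), and the optional lower bound is a parameter the caller supplies, for instance a minimum clock rate taken from the CPU data book.

```python
def divided_clock_mhz(input_mhz, divisor, min_mhz=None):
    """Return the CPU clock in MHz after dividing the input clock.

    divisor: assumed integer in the 2..64 range quoted in this note.
    min_mhz: optional lower limit for the part, if one applies.
    """
    if not 2 <= divisor <= 64:
        raise ValueError("divisor must be between 2 and 64")
    result = input_mhz / divisor
    if min_mhz is not None and result < min_mhz:
        raise ValueError("divided clock is below the minimum clock rate")
    return result


if __name__ == "__main__":
    for d in (2, 4, 8):
        print(f"25 MHz / {d} = {divided_clock_mhz(25, d)} MHz")
```

Passing a minimum rate lets clock control firmware reject a divisor that would take the part below its specified operating range before the divider is programmed.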
The speed of the CPU can also be changed dynamically depending on the level of system activity. Most chipsets provide mechanisms for monitoring the level of system activity involving the CPU by detecting the toggling of certain CPU signal lines. The timers associated with these monitoring devices are responsible for determining when to reduce the CPU speed by timing out a programmable time period. The chipset reduces the speed of the CPU clock by dividing its clock input to the CPU after STPCLK# is asserted. For 2X SL Enhanced Intel486 CPUs, the speed can be changed on-the-fly. For 1X SL Enhanced Intel486 CPUs, the clock input has to be stabilized for 1 ms before de-asserting STPCLK#. Stopping the CPU clock is accomplished in the same fashion. The CPU clock control can be achieved through either a hardware or a software approach. If the status of the CPU needs to be saved in order to return the system to the original state, CPU clock control should be done through the SMI handlers in SMM. Shutting off the power to the CPU can only be done in suspend mode.

3.2 Controlling Power of Peripheral Components

The power control of peripheral components is accomplished through idle detectors and SMI generators. Idle detectors monitor access to the following devices: Keyboard, Video RAM, Floppy Disk Drive, Hard Disk Drive, Serial Port and Parallel Port. Idle detectors can also monitor access to a programmable address range. Some chipsets even provide additional pins to monitor other user-definable miscellaneous activities. Each idle detector has an associated timer with a programmable idle time period, and the timers activate power control pins that are directly tied to the controlled devices. When a timer times out the programmed period, it will activate the power control pin to shut down the power to the specific device.
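The behavior of a single idle detector and its timer can be sketched in a few lines. This Python model is a hypothetical illustration, not the interface of any particular chipset: the class, method and attribute names are invented for the example, time is counted in abstract timer ticks, and the model covers only the power-down direction described above (how a device is powered back up is chipset specific and is not modeled).

```python
class IdleDetector:
    """Toy model of one idle detector with a programmable timeout."""

    def __init__(self, timeout_ticks):
        if timeout_ticks <= 0:
            raise ValueError("timeout must be a positive number of ticks")
        self.timeout_ticks = timeout_ticks
        self.idle_ticks = 0
        self.power_down = False  # models the power control pin

    def access(self):
        """A monitored access to the device restarts the idle period."""
        self.idle_ticks = 0

    def tick(self):
        """Advance one timer period; assert power_down on timeout."""
        if not self.power_down:
            self.idle_ticks += 1
            if self.idle_ticks >= self.timeout_ticks:
                self.power_down = True
        return self.power_down


if __name__ == "__main__":
    hdd = IdleDetector(timeout_ticks=3)
    hdd.tick()
    hdd.access()  # activity restarts the count
    print([hdd.tick() for _ in range(3)])
```

In this model an access arriving before the timeout postpones the power-down by a full idle period, which is the property that keeps an actively used device powered.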
For address range idle detectors, when there is no access to the monitored address range for a programmed timeout period, the timers will either reduce the clock speed or shut off the power of the device that has the same address bits. The idle detectors can operate outside of SMM and can be independent of the CPU state.

SMI event generators generate SMI requests through a number of dedicated or user-defined events. Depending on the specific chipset used, these events can be the activity timeout of individual devices or a collection of devices. Upon the timeout, an SMI request will be generated to the CPU, which in turn invokes SMM. The SMI handlers in the SMRAM contain the software routine that controls the power state of the device(s) initiating the SMI request. By executing this routine, the CPU will access the power management control registers associated with the devices. After proper programming, these registers will activate the proper power control pins to shut down the power to the proper device(s). Controlling the power of peripheral devices through SMI handlers offers complete flexibility to manage the power either individually or collectively. This is very important for optimum power conservation.

3.3 Suspend and Resume

Suspend state is the deepest level of power conservation. There are two types of suspend: normal suspend and 0-volt suspend. In normal suspend, only the CPU, chipset and memory sub-system are powered. System status is saved into SMRAM. The rest of the system is shut down. DRAM is refreshed with a very slow clock (64 kHz or 32 kHz). In 0V suspend, the entire system, including the CPU, is shut down except the part of the system logic that is responsible for resume. The resume logic is always powered by an RTC battery. The system status is saved onto hard disk. Suspend is normally triggered by the suspend timer, and the timeout of the suspend timer is also programmable.
Because of the amount of BIOS support required by Suspend, SMM must be invoked. Hardware alone cannot accomplish the task.

4.0 SUMMARY

SL Enhanced Intel486 CPUs provide the best power management features that Intel SL technology offers. Intel's System Management Mode has become an industry standard for power-saving computing. Through Intel's SMM, the implementation of power management is very flexible, enabling the optimization of power conservation for different system designs. The various CPU clock control options available on SL Enhanced Intel486 processors provide the basis for runtime power management with no performance impact. All major chipsets support the Intel power management scheme with easy-to-design software and hardware interfaces.

AP-498
APPLICATION NOTE

Thermal Design for High Performance Notebooks

VLADIMIR ALEKSIEV
CHIA-PIN CHIU
WENDY LUI
ED WILSON
DAVID YUAN

MCG TECHNICAL MARKETING
MCG PACKAGING
THERMAL/MECHANICAL TOOLS AND ANALYSIS GROUP

October 1993

Order Number: 241812-001

CONTENTS

1.0 INTRODUCTION
2.0 THERMAL BACKGROUND
  2.1 Heat Transfer
    2.1.1 Conduction
    2.1.2 Convection
    2.1.3 Radiation
  2.2 Thermal Impedance
3.0 POWER MODELING
  3.1 Power Consumption Model
  3.2 Empirical Data
  3.3 Typical System Power Consumption Profiles
4.0 θJA BASED ON EXTERNAL TA
  4.1 Measurements from Commercial Notebooks
  4.2 θJA and θJC Measurements in an Actual Test Environment
  4.3 System Impact on CPU θJA
5.0 CALCULATING THERMAL HEADROOM
  5.1 Using Thermal Headroom Graphs
  5.2 Example of Graph Use
  5.3 Adjusting Thermal Headroom for Board Power
  5.4 Experimental Measurements are Essential
  5.5 Secondary Effects
6.0 IMPROVING THERMAL HEADROOM
  6.1 Improving Thermal Convection, Conduction and Radiation
    6.1.1 Black Paint
    6.1.2 Copper Foil
    6.1.3 Perfluorocarbon Fluid and Silicone Elastomers
  6.2 Optimizing System Layout
  6.3 Effective System Power Management
7.0 CONCLUSION
APPENDIX A
APPENDIX B
APPENDIX C

1.0 INTRODUCTION

Today's market for notebook computers demands desktop performance in smaller and smaller form factors. Along with the higher performance comes greater power consumption, which adds unique challenges for the mobile operating environment (battery operating time, thermal management, physical dimensions, etc.). This application note gives a basic description of the thermal forces at work in mobile applications, with mathematical models that can be used to project system thermal parameters and aid the designer in worst-case design.

The first section begins with a review of the basic thermal definitions which apply to notebook designs. Thermal data from a notebook experiment is presented to show a relationship between the temperature outside the notebook and the CPU case temperature. A model is given that helps the designer ensure the CPU thermal operating specifications are met. Next, several power consumption profiles are provided, starting with the assumed worst-case model, along with more conservative power profiles based on the degree of power management implementation.
Based on these models, designers may forecast the amount of additional thermal margin their applications need. With these thermal models, power consumption profiles, and test measurements, designers will have the necessary tools and techniques to design for the worst case, and will understand that by applying simple design enhancements, they can improve the quality of their designs. The designer should ensure the measured CPU case temperature (TCASE) complies with the TCASE specifications published in the SL Enhanced Intel486 Microprocessor Data Sheet Addendum.

2.0 THERMAL BACKGROUND

2.1 Heat Transfer

Designing high performance CPU notebook systems requires some knowledge of the three processes by which heat is transferred from one point to another, namely conduction, convection, and radiation, which are described in the following three sections. This knowledge will help the designer understand the subsequent methods of heat transfer and their value in maintaining the CPU within its specified TCASE limits. The formulas describing heat transfer by conduction, convection and radiation can be shown analogous to Ohm's Law:

$$ I = \frac{V}{R} $$

The thermal current, temperature difference and thermal resistance are analogous to electrical current (I), voltage (V), and electrical resistance (R), respectively.

NOTES:
1. Physics, Second Edition, Paul A. Tipler. Worth Publishers, Inc., 1982, p. 531.
2. Physics, Second Edition, Paul A. Tipler. Worth Publishers, Inc., 1982, p. 531.

2.1.1 CONDUCTION

Conduction is a process by which heat flows from a region of higher temperature to one of lower temperature within a medium (solid, liquid, or gas) or between mediums in direct physical contact(1).
In a one-dimensional system, conductive heat transfer is governed by the following relation, shown alongside Ohm's Law:

Conductive Heat Transfer:
$$ q = \frac{\Delta T}{L / kA} $$

Ohm's Law:
$$ I = \frac{V}{R} $$

where:
q = Heat flow rate (W)
k = Material thermal conductivity (W/m°C)
A = Cross-sectional area (m²)
ΔT/L = Temperature gradient (°C/m)
L = Distance of heat transfer (m)

In the preceding equations, the thermal current, q, can be viewed as analogous to electrical current; ΔT as analogous to voltage; and the thermal resistance, L/kA, as analogous to electrical resistance. To improve thermal conduction in a notebook environment, copper and other highly thermally conductive metals can be used in the package design.

2.1.2 CONVECTION

Convection is a process of energy transport by the combined action of heat conduction, energy storage, and mixing motion(2). Convection is the predominant mechanism for transferring energy between a solid surface and a fluid. In the notebook environment, this is equivalent to the heat transfer between the case surface and the ambient environment (air). The basic relation that describes heat transfer by convection from a surface to a fluid presumes a linear dependence on the difference between the temperature at the surface and the temperature deep in the fluid, and is referred to as Newtonian cooling:

Convective Heat Transfer:
$$ q_c = \frac{T_S - T_A}{1 / (h_c A)} $$

Ohm's Law:
$$ I = \frac{V}{R} $$

where:
qc = Convective heat flow rate from a surface to ambient (W)
A = Surface area (m²)
TS = Surface temperature (°C)
TA = Ambient temperature (°C)
hc = Average convective heat transfer coefficient (W/m²°C)

In the preceding equations, the thermal current, qc, can be viewed as analogous to electrical current; TS - TA as analogous to voltage; and the thermal resistance, 1/(hc A), as analogous to electrical resistance.

In forced convection, fluid flow is caused by an external factor such as a fan, while in free or natural convection, fluid motion is induced by density differences resulting from temperature gradients in the fluid (liquid or gas).
Under the influence of gravity or other body forces, these density differences give rise to buoyancy forces that circulate the affected fluid and convect heat toward or away from surfaces wetted by the fluid. Although fans are often used to increase air convection inside desktop computers, they may not be a practical solution for notebook systems.

2.1.3 RADIATION

Thermal radiation is defined as radiant energy emitted by a medium by virtue of its temperature, without the aid of any intervening medium(3). The amount of heat transferred by radiation between two bodies at temperatures T1 and T2 is governed by the following expression, shown alongside Ohm's Law:

Radiative Heat Transfer:
$$ q = \frac{T_1^4 - T_2^4}{1 / (\epsilon \sigma)} $$

Ohm's Law:
$$ I = \frac{V}{R} $$

where:
q = Amount of heat transferred by radiation (W)
ε = Emissivity, 0 < ε < 1
σ = Stefan-Boltzmann constant, 5.67 × 10⁻⁸ (W/m² K⁴)
T1, T2 = Surface temperatures (K)

NOTE:
3. Physics, Second Edition, Paul A. Tipler. Worth Publishers, Inc., 1982, p. 535.

For radiation to be an effective method of heat transfer compared to natural or forced convection mechanisms, a relatively large temperature difference must exist between T1 and T2. For most low-power electronic applications, these temperature differences are relatively small, and therefore radiative effects are normally neglected. For high-power applications, however, heat transfer by radiation should be considered. Although heat radiation is a secondary thermal effect in a notebook system, some manufacturers are selecting coatings (e.g., black paint) for their notebook designs that are absorptive in the infrared to improve upon these thermal radiation effects.

2.2 Thermal Impedance

Thermal management of an electronic system encompasses all the thermal processes and technologies which must be used to remove and transport heat from individual components to the system thermal sink in a controlled manner. The primary heat transfer processes (conduction, convection and radiation) can be combined into a single linear model (see Section 5.1). The junction-to-case (θJC) and junction-to-ambient (θJA) thermal resistance values are used as measures of IC package thermal performance. These parameters are defined by the following relations:

$$ \theta_{JA} = \frac{T_J - T_A}{P} $$

$$ \theta_{JC} = \frac{T_J - T_C}{P} $$

$$ \theta_{CA} = \theta_{JA} - \theta_{JC} = \frac{T_C - T_A}{P} $$

where:
θJA = Junction-to-ambient thermal resistance (°C/W)
θJC = Junction-to-case thermal resistance (°C/W)
θCA = Case-to-ambient thermal resistance (°C/W)
TJ = Average die temperature (°C)
TC = Case temperature at a predefined location (°C)
P = Device power dissipation (W)
TA = Ambient temperature (°C)

θJC is a measure of the package internal thermal resistance from the silicon die to the package exterior. This value is highly dependent upon packaging material, thermal conductivity, and package geometry. θJA measures the conductive and convective thermal resistance from the package exterior to the ambient, as well as the package internal thermal resistance. θJA values depend on material, thermal conductivity, package geometry, and ambient conditions such as flow rates and coolant physical properties.

To improve CPU thermal characteristics in a notebook system, heat sinks are sometimes mounted on the top of the CPU using highly conductive adhesive materials. Adding a heat sink will not change θJC; however, it will improve heat conduction and convection due to the increase in surface area, resulting in a significant reduction in the case-to-ambient and, therefore, junction-to-ambient thermal resistance. Depending on the product and materials used, θJA can be reduced by 10 to 30 percent.

To guarantee component functionality and reliability, the maximum device operating temperature is defined and constrained by the package exterior temperature at a predefined location.
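The Ohm's Law analogy of Sections 2.1 and 2.2 can be tried numerically. The Python fragment below is only an illustrative sketch: the function names are invented for this example, and the copper slab dimensions, the copper conductivity of roughly 400 W/m°C, and the convection coefficient are assumed round numbers, not measurements from this note. The package values (θJC = 3.5°C/W, θJA = 25.0°C/W, 3.3V, 0.290 A) are the SQFP figures listed in Table 3-1.

```python
def conduction_resistance(length_m, conductivity_w_per_m_c, area_m2):
    """Thermal resistance of a one-dimensional conduction path, L/(kA), in degC/W."""
    return length_m / (conductivity_w_per_m_c * area_m2)


def convection_resistance(h_w_per_m2_c, area_m2):
    """Thermal resistance of a convecting surface, 1/(hc*A), in degC/W."""
    return 1.0 / (h_w_per_m2_c * area_m2)


def case_to_ambient_resistance(theta_ja, theta_jc):
    """theta_CA = theta_JA - theta_JC."""
    return theta_ja - theta_jc


def junction_temperature(t_ambient_c, power_w, theta_ja):
    """T_J = T_A + P * theta_JA (thermal 'voltage' = 'current' x 'resistance')."""
    return t_ambient_c + power_w * theta_ja


# Assumed example: a 1 mm thick copper slab, 20 mm x 20 mm (illustrative only).
r_copper = conduction_resistance(0.001, 400.0, 0.0004)

# Assumed natural-convection coefficient of 10 W/m2 degC over 0.01 m2.
r_air = convection_resistance(10.0, 0.01)

# SQFP package values from Table 3-1, running Indeo Video software at 3.3V.
theta_jc = 3.5
theta_ja = 25.0
theta_ca = case_to_ambient_resistance(theta_ja, theta_jc)
power = 3.3 * 0.290
t_j = junction_temperature(25.0, power, theta_ja)

print(f"copper slab: {r_copper:.5f} degC/W, convecting surface: {r_air:.1f} degC/W")
print(f"theta_CA = {theta_ca:.1f} degC/W, T_J at 25 degC ambient = {t_j:.1f} degC")
```

The comparison between the two resistances illustrates why the convective path from the case to the air, rather than conduction through metal, usually dominates θJA.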
The guidelines for ambient temperature specify that measurements should be taken at an undisturbed location at a certain distance away from the package, traditionally 12 inches horizontally from the center of the CPU. Measuring TA in the traditional manner is not possible in a notebook system. The case temperature, however, is measured at the center of the package surface. Depending on the ambient temperature and board power in the system's environment, thermal enhancements such as heat fins or forced air cooling may be necessary to meet the case temperature requirements.

3.0 POWER MODELING

3.1 Power Consumption Model

In a linear model, power consumption is governed by the following equation:

$$ P = V_{CC} \times I_{CC} $$

where:
P = Power consumed by the component
VCC = Supply voltage
ICC = Current through the component

The preceding equation indicates that power consumption is linearly proportional to both the supply voltage and the current flowing through the component. For example, a 3.3V microprocessor will consume less power than a 5V microprocessor running the same application under equivalent operating conditions.

A system's total power consumption is defined as either the sum of the power consumed by each individual module within the system, or the sum of the products of each module's supply voltage and its current. It is equivalent to:

$$ P_{system} = P_{cpu} + P_{memory} + P_{display} + \ldots $$

3.2 Empirical Data

Several CPU ICC measurements were taken using an SL Enhanced Intel486 CPU evaluation board with various CPU modules running under two different operating environments: Windows 3.1 and Indeo™ Video software (see Table 3-1). The modules used were SL Enhanced Intel486 DX-33 CPUs (1X SQFP, 1X PGA, and 2X SQFP), and SL Enhanced Intel486 DX2-50 CPUs (1X PGA). All of the CPU modules use a chipset with Power Management software. When comparing the ICC measurements between several application environments, the CPUs running Indeo Video software consume the most ICC current, by approximately 10%. Since the ICC value represents the number of gates switching inside the CPU, and hence the intensity of the CPU working condition, it is concluded that running Indeo Video software will be close to the worst case for thermal measurement purposes.

Table 3-1. Case Analysis of Power Consumption for SL Enhanced Intel486 CPUs

                                                     Windows 3.1 ICC (A)      Indeo™ Video Software ICC (A)
CPU   VCC   Freq   Package  CLK   θJC     θJA        Active  Stop    Stop     Active  Stop    Stop
      (V)   (MHz)                 (°C/W)  (°C/W)             Grant   Clock            Grant   Clock
DX    3.3   33     SQFP     1X    3.5     25.0       0.279   0.008   0        0.290   0.008   0
DX    5.0   33     PGA      1X    1.5     17.0       0.491   0.041   0        0.512   0.041   0
DX2   5.0   50     PGA      1X    1.5     17.0       0.656   0.046   0        0.685   0.046   0
DX    3.3   33     SQFP     2X    3.5     25.0       0.283   N/A     0        0.294   N/A     0

NOTES:
1. All thermal values are measured at zero airflow.
2. Measurements taken on an SL Enhanced Intel486 CPU Evaluation Board (no heat spreader, no heat sink).
3. All measurements were taken using one module of each microprocessor.

3.3 Typical System Power Consumption Profiles

Table A-2 in the Appendix examines the power dissipated for four typical "system" profiles. These are systems in the sense that the power used is assumed to be controlled (except in the first case) by either the operating system or the system hardware, aside from the CPU. These cases, however, do not attempt to add the effects of other power dissipating components that would exist in a complete PC notebook system. The first case gives the most conservative power calculation: the maximum power that can be generated by the CPU. The second case gives the typical average power. The last two suggest power consumption possibilities that could occur in a given system, or even be guaranteed by power management. Many similar combinations would also be reasonable. VCC in each case is assumed to be the standard value (5.0V or 3.3V).

In Case 1, the average power is calculated as the standard VCC (3.3V or 5V) times the maximum current, ICC(max), that can be drawn by a particular CPU.
This calculation gives the maximum power that can be dissipated while continuously executing the most power consuming instruction sequence. This value should be used in a system design if no thermal power management is imposed and the designer wants to minimize potential problems even under worst-case circumstances: a conservative design.

In Case 2, the average power is calculated as the standard VCC (3.3V or 5V) times the TYPICAL current, as specified in Intel486 microprocessor data books. This calculates the average heat from the CPU that would be dissipated over time while executing a typical mix of software. The designer could use this power value in a less conservative design. However, if the CPU case temperature approaches its maximum specified value and no thermal power management is applied, the TC(max) specification could be exceeded. (Section 5.5 discusses the time dependency issues in averaging the thermal power generated while executing a mix of software.)

In Case 3, the average power is calculated as the standard VCC (3.3V or 5V) times ICC(max) for 10% of the time, and ICC(typical) for 90% of the time. This is a more conservative assumption than Case 2, allowing for some intervals in which the CPU runs at full power, and provides an overall thermal guardband over Case 2.

Case 4 assumes that the current is distributed at the maximum for 10%, typical for 80%, and Stop Clock for 10% of the time. A mix of this sort is appropriate in a design where power management is applied to assert Stop Clock for at least 10% of the time. This mix could reduce the performance of the CPU.

Suppose the system designer devises a theoretical mix of ICC (Max, Typical and Stop Clock) which is exceeded by the system only 1% of the time when running all standard benchmarks and applications. The designer builds the system to that specification, with thermal power management responding only when the limit is exceeded.
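The four profiles described above are time-weighted averages of P = VCC × ICC, which can be sketched in a few lines of Python. This is a rough illustration, not the method used to build Table A-2: the function name is invented, ICC(max) = 0.685 A at 5.0V is the value quoted in the Section 5.2 example, and the ICC(typical) figure of 0.55 A is a purely hypothetical placeholder, since the real value must be taken from the data book. Stop Clock current is taken as approximately zero.

```python
def average_power(vcc_volts, mix):
    """Time-weighted average power for a list of (fraction_of_time, icc_amps) pairs."""
    total_fraction = sum(fraction for fraction, _ in mix)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("time fractions must sum to 1")
    return vcc_volts * sum(fraction * icc for fraction, icc in mix)


VCC = 5.0             # standard supply voltage (V)
ICC_MAX = 0.685       # A, value quoted in the Section 5.2 example
ICC_TYPICAL = 0.55    # A, HYPOTHETICAL placeholder; use the data book value
ICC_STOP_CLOCK = 0.0  # A, taken as approximately zero

case1 = average_power(VCC, [(1.0, ICC_MAX)])
case2 = average_power(VCC, [(1.0, ICC_TYPICAL)])
case3 = average_power(VCC, [(0.1, ICC_MAX), (0.9, ICC_TYPICAL)])
case4 = average_power(VCC, [(0.1, ICC_MAX), (0.8, ICC_TYPICAL), (0.1, ICC_STOP_CLOCK)])

for name, watts in [("Case 1", case1), ("Case 2", case2), ("Case 3", case3), ("Case 4", case4)]:
    print(f"{name}: {watts:.2f} W")
```

A custom profile is obtained the same way by supplying a different list of time fractions and currents.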
Then one can safely design the system assuming a significantly lower power than the absolute maximum, and experience performance degradation only 1% of the time.

The cases above illustrate that many less than worst-case power profiles are possible, depending on the software being run and the power management options being used. Intel recommends that systems be tested for thermal problems under the worst case power usage that the customer could contrive, and in an ambient temperature equal to the maximum specified for the design.

4.0 θJA BASED ON EXTERNAL TA

This section provides some experimental thermal data which will help the designer better understand some of the system level issues affecting the thermal performance of a notebook. Section 4.1 describes in one experiment how test chamber θJA can be used as an approximate thermal performance criterion in the early stages of a notebook design. Section 4.2 shows how in another experiment θJA is affected by CPU location on the motherboard inside the notebook. Section 4.3 presents a model showing the relationship between power generated by other components on the motherboard and the CPU's θJA requirement.

4.1 Measurements from Commercial Notebooks

Since component location differs among notebook designs, causing internal power densities to vary between notebooks, there is no one place inside a notebook where TA can be defined in order to obtain an accurate system thermal profile. This section presents experimental thermal data collected for four different Intel386 SL CPU notebooks running Indeo Video software and shows, as a rough estimate, that the ambient temperature (TA) outside a notebook can be used with thermal resistance (θJA) to project CPU temperature (TJ, TC).

In this notebook experiment, the CPU case temperatures of four different Intel386 SL CPU notebooks (PQFP packages) were measured using K-type thermocouples, a digital multimeter, and an Intel386 DX CPU-based system with data acquisition software.
To simulate maximum ICC consumption, each notebook continuously executed Indeo Video software for the duration of the experiment.

Table 4-1 shows the junction temperature and θJA calculated by using the thermal impedance equations from Section 2.2. The CPU case temperature, the maximum power consumption of 2.5W running Indeo Video software, and the test chamber θJC of 6°C/W as specified in the Intel386™ SL Microprocessor Data Book for the 196L PQFP were used to calculate the junction temperature of each CPU. With this calculated CPU TJ and the measured ambient room temperature of 25°C, the corresponding notebook's θJA was obtained. θJA can be calculated by combining the first two equations in Section 2.2 as follows:

$$ \theta_{JA} = \frac{T_C + P \times \theta_{JC} - T_A}{P} $$

Although the test chamber θJA for the 196L PQFP of 23°C/W was collected with only the CPU present on a test board, the value obtained approximates θJA of a notebook operating in an environment of 25°C. One possible explanation is that the conductive and radiative effects the other components have on the CPU inside the operating notebook, such as the PC board, connectors, floppy disk, shielding and plastic enclosure, actually cause the temperature inside the notebook to be lower than expected. The end result is a lower θJA for the CPU inside the operating notebook than the θJA of a lone CPU inside a test chamber, where the only conductive and radiative path is to the surrounding air, as shown in Notebooks #1 and #3.

Table 4-1 shows the large variation in notebook θJA caused by different system designs. Thus, the test chamber θJA can be used only as a rough guideline in the early stages of a notebook design. For a more in-depth analysis of the conditions inside an operating notebook, see Section 4.3. For the final design, the thermal performance of the system should always be verified by measuring the CPU case temperature.
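The Table 4-1 calculation can be reproduced with a short script. The Python sketch below is offered as an illustration only; the function and variable names are invented here. The inputs are the values stated in this section (2.5W, θJC = 6°C/W, TA = 25°C) and the measured TCASE column of Table 4-1.

```python
P_CPU_W = 2.5       # maximum power running Indeo Video software (W)
THETA_JC = 6.0      # test chamber theta_JC for the 196L PQFP (degC/W)
T_AMBIENT_C = 25.0  # measured ambient room temperature (degC)

# Measured case temperatures from Table 4-1, keyed by notebook number (degC).
T_CASE_C = {1: 48.9, 2: 69.0, 3: 52.1, 4: 71.5}


def junction_temperature(t_case_c, power_w, theta_jc):
    """T_J = T_C + P * theta_JC."""
    return t_case_c + power_w * theta_jc


def theta_ja(t_case_c, t_ambient_c, power_w, theta_jc):
    """theta_JA = (T_C + P * theta_JC - T_A) / P."""
    return (junction_temperature(t_case_c, power_w, theta_jc) - t_ambient_c) / power_w


results = {
    notebook: (
        junction_temperature(t_case, P_CPU_W, THETA_JC),
        theta_ja(t_case, T_AMBIENT_C, P_CPU_W, THETA_JC),
    )
    for notebook, t_case in T_CASE_C.items()
}

for notebook, (t_j, th_ja) in sorted(results.items()):
    print(f"Notebook #{notebook}: T_J = {t_j:.1f} degC, theta_JA = {th_ja:.2f} degC/W")
```

The computed values agree with the published TJ column and match the published θJA column to within 0.1°C/W.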
Table 4-1. θJA Calculations for Intel386 SL CPU Notebooks Running Indeo™ Video Software

Notebook #   TCASE (°C)   TJ (calculated) (°C)   θJA (calculated) (°C/W)
1            48.9         63.9                   15.5
2            69.0         84.0                   23.6
3            52.1         67.1                   16.8
4            71.5         86.5                   24.6

4.2 θJA and θJC Measurements in an Actual Environment

A test motherboard with the same form factor as that of the original Test Notebook #4 was fabricated in order to take experimental measurements of θJA and θJC (see Figure 4-1). The only differences between the two boards are the following:

1. The test board had two slots instead of the 70/80 pin connectors on the motherboard.
2. The test board only had six thermal test packages mounted on it (three on the component side and three on the solder side).

Measurements from the six thermal test units yielded a θJA range of 20°C/W to 25°C/W in the test chamber and 22°C/W to 28°C/W in the Test Notebook, and a θJC range of 3°C/W to 5°C/W in both the test chamber and the Test Notebook (see Table 4-2).

[Figure 4-1. Test Notebook Boards: the original motherboard from the Test Notebook with two connectors, and the thermal test board used in the Test Notebook with two slots and six thermal samples]

Table 4-2. Thermal Resistance θJA and θJC for the SQFP Package

        Test Motherboard in Test Chamber (°C/W)   Test Motherboard in Test Notebook (°C/W)
θJA     20-25                                     22-28
θJC     3-5                                       3-5

NOTES:
1. Test Notebook #4 was used to collect the experimental data.
2. A 208L SQFP test package with heat spreader, containing a thermal test die, was used in all experiments to vary and measure the power going into the package as well as to measure the temperature of the die.
3. All measurements were made with zero airflow, simulating a typical notebook environment. Ambient temperature is defined as the ambient temperature outside the notebook.
The worst CPU thermal location was on the center of the solder side, with θJA = 28°C/W. The measured thermal resistance at this location was unfavorable because of the reduced CPU board area surrounding the CPU (due to the two slots) and the reduced convection cooling on the bottom side of the board. The best CPU thermal location was on the component side at one end of the test board, with the most PCB area surrounding the package, with θJA = 22°C/W. This shows how CPU location and system layout can impact the overall thermal performance of a notebook. The θJA numbers should only be used as a first order estimate in preliminary notebook designs. Since the location of a CPU inside a notebook impacts thermal performance, TJ should always be verified in the final design by θJC and the CPU case temperature.

4.3 System Impact on CPU θJA

As notebooks evolve into smaller form factors with higher component density and smaller PCB sizes, the increasing power density inside the notebook has a large effect on CPU temperature. The power dissipated by components other than the CPU, the layout of the components, and the thermal-mechanical characteristics of the enclosure must all be taken into account in order to ensure a CPU junction temperature within specifications. In one model, such effects are approximated by the introduction of the factors R and Pb to the thermal impedance equations in Section 2.2:

$$ \theta_{JA} = \frac{T_J - T_A - R \times P_b}{P_{CPU}} $$

R is defined as the thermal coupling factor between the CPU and the other components on the PCB. This factor takes into account the effects of the power dissipation of the other components on the CPU case and junction temperatures.

Pb is defined as PCB power dissipation. This Pb value is the power dissipated by all components (except the CPU) inside the notebook. A detailed discussion of how R and Pb are used to calculate Thermal Headroom (thermal margin) for a Test Notebook is given in Section 5.3.
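The effect of board power on the allowable θJA can be explored numerically. The Python sketch below assumes the model takes the form TJ = TA + θJA × PCPU + R × Pb, which is consistent with the TA(max) expression in Section 5.3; that form, like the function names, is an assumption of this example. The inputs are the example values quoted for Figure 4-2 (TJ = 100°C, TA = 30°C, PCPU = 2W, R = 3.9) and the fixed package requirement of 23°C/W.

```python
def max_theta_ja(t_j_max_c, t_ambient_c, p_cpu_w, r, p_board_w):
    """Largest theta_JA that keeps T_J within limits for a given board power.

    Assumed model: T_J = T_A + theta_JA * P_CPU + R * P_b.
    """
    return (t_j_max_c - t_ambient_c - r * p_board_w) / p_cpu_w


def crossover_board_power(theta_ja_fixed, t_j_max_c, t_ambient_c, p_cpu_w, r):
    """Board power at which the allowable theta_JA falls to a fixed package value."""
    return (t_j_max_c - t_ambient_c - theta_ja_fixed * p_cpu_w) / r


T_J_MAX, T_A, P_CPU, R = 100.0, 30.0, 2.0, 3.9
THETA_JA_PACKAGE = 23.0  # fixed test chamber requirement (degC/W)

for p_board in (0.0, 2.0, 4.0, 6.0, 8.0):
    limit = max_theta_ja(T_J_MAX, T_A, P_CPU, R, p_board)
    verdict = "ok" if THETA_JA_PACKAGE <= limit else "needs thermal enhancement"
    print(f"Pb = {p_board:4.1f} W: theta_JA must not exceed {limit:5.2f} degC/W ({verdict})")

crossover = crossover_board_power(THETA_JA_PACKAGE, T_J_MAX, T_A, P_CPU, R)
print(f"cross-over board power: {crossover:.2f} W")
```

Under these assumptions the cross-over falls at slightly more than 6W of board power, in line with the value read from the graph.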
Figure 4-2 gives a graphical representation of the effects of PCB power on the θJA requirements. The horizontal line represents the current method of a fixed θJA requirement of 23°C/W for this package over the entire range of PCB power dissipated, Pb. The second line represents the model with the factors Pb and R, which takes into account the temperature rise inside the notebook, and is obtained by substituting the example values TJ = 100°C, TA = 30°C, PCPU = 2W, and R = 3.9 into the preceding equation. This line shows that as the board power (Pb) increases, the thermal margin inside the notebook becomes smaller due to the higher temperature environment. In both models, θJA must fall below the line to ensure a junction temperature below specification for the given conditions. For Test Notebook #4, the cross-over point between the two lines is 6W. Beyond this point, the package thermal resistance of 23°C/W is too high and thermal enhancements are necessary to reduce that resistance.

In summary, this example shows that for the current method, a θJA value obtained from the test chamber leaves margin for Pb less than 6W. However, for Pb greater than 6W, the junction temperature specification will be exceeded. Again, it is emphasized that the designer should always perform a thorough system thermal analysis to ensure the specified TCASE(max) is not exceeded.

[Figure 4-2. Determining Maximum Thermal Resistance (θJA) for a Given Amount of Power; horizontal axis: Board Power (Pb) (Watts)]

5.0 CALCULATING THERMAL HEADROOM

Thermal Headroom is the temperature margin between the calculated TA(max) and the TA measured outside a given system, with a particular CPU and power. This section shows how to calculate Thermal Headroom and use it as a simple model for a system's thermal properties.
Then a term accounting for board power is added to the model, and the experimental measurements needed to implement this more sophisticated version are described. Finally, the significance of two secondary effects is analyzed.

5.1 Using Thermal Headroom Graphs

Figures A-1 and A-2 (in Appendix A) plot the calculated TA(max) vs the power being used by the CPU and are intended to facilitate quick determination of thermal headroom. The lines on the graphs indicate the (estimated) maximum allowed ambient temperature (TA in degrees Celsius) as a function of power dissipated (P in Watts). Maintaining TA below or equal to TA(max) indicates that the required TC(max) is probably not exceeded. The graph lines are generated from the formula:

$$ T_{A(max)} = T_{C(max)} - \theta_{CA} \times P_{CPU} $$

θCA is the thermal resistance to heat flow between the CPU case and the ambient environment, as specified in the Intel Packaging Handbook. As discussed in Section 4, experiments show that these parameters are approximately the same for a notebook PC with TA measured in open air outside the notebook case. The tendency of the notebook case to increase θCA by adding extra layers of insulation is approximately offset by its action as a heat spreader, since it is thermally connected to the CPU board.

Different types of CPU packages have different θCA values, and thus generate different lines on the graphs. For example, Figure A-1 shows the SQFP, PGA and PQFP packages for the SL Enhanced Intel486 SX CPU, and Figure A-2 shows the SQFP and PGA packages for the SL Enhanced Intel486 DX2 CPU.

To determine thermal headroom for a given CPU, first determine the correct line for the CPU type. (Some of the CPUs are marked on the graphs. For a CPU type that is not, find its θCA in Table A-1, and match it to a CPU type that is marked on a graph line. Alternatively, use θCA as the slope and TC = 85°C as the temperature axis intercept to match directly to a graph line.)

Second, determine the power at which the CPU will operate.
Various average power use scenarios could be appropriate for a given design; four of them are detailed in Table A-2 for each different CPU (discussed in detail in Section 3.3). One can also calculate a custom power usage profile for one's system using P = ICC × VCC and Table A-1, which gives ICC under three different conditions (Active, Stop Grant and Stop Clock).

Third, draw an ordinate (vertical line) at the power value (determined above) to intersect the appropriate line for the CPU package type chosen. Draw an abscissa (horizontal line) from the intercept to the TA axis, obtaining the maximum recommended ambient temperature for this system.

If this TA(max) is greater than what is measured in the air outside the actual system, the design has a positive thermal headroom of TA(max) - TA(measured), as long as the effect of "board power", Pb, is neglected. If, however, the actual system is exceeding this TA(max), the thermal headroom is negative even before considering Pb, and thus the thermal properties of the design will need improvement ("thermal mitigation"). (See Section 4 for the definition of Pb, and Section 5.3 for more information about Thermal Headroom.)

There are numerous techniques for thermal mitigation, or improving the thermal properties of a design (e.g., a lower voltage version of the CPU or a CPU package with a lower θCA could be used). Depending on package size and material, various θCA values can be obtained. Various passive and active thermal management strategies are discussed in Section 6.

5.2 Example of Graph Use

Consider the 5V PQFP SL Enhanced Intel486 SX-33 CPU, which is likely to have special thermal needs because of its power requirements. The graph line for it is indicated by the label (33 MHz PQFP 5V) on the Intel486 SX CPU graph, or by the fact that its slope value from Table A-1 is 17.0°C/W.
The power shown for Figure A-1 is for Case 1: VCC = 5.0V; ICC = 0.685 A (from Table A-1); Active Max for 100% of the time; P = 3.43W (from Table A-2, or calculation). The ordinate is drawn from the P axis at 3.43W to intersect the intermediate slope line, and the abscissa from that point intersects the Temp. axis at about 27°C. If we assume 40°C is the lowest TA that can be readily achieved, we get a negative thermal headroom of 13°C. Designing a portable PC with this CPU clearly will require thermal mitigation.

5.3 Adjusting Thermal Headroom for Board Power

Experiments have shown that the term R × Pb provides a good way to model the effects on the CPU TC due to other heat sources inside a notebook. (Here R is an experimentally determined thermal coupling coefficient, and Pb is the board power, as defined in Section 4.) The term adds to TC, or reduces the TA(max) that is required to ensure that TC does not exceed TC(max):

$$ T_{A(max)} = T_{C(max)} - \theta_{CA} \times P_{CPU} - R \times P_b $$

To calculate thermal headroom adjusted for the effect of Pb, subtract R × Pb from the thermal headroom calculated as above. This of course makes the headroom smaller (worse), but by how much? This depends on the size of Pb, but also on R, which is highly dependent on the particular design. In theory, the smallest value possible for R is zero: no thermal coupling between the CPU and other heat sources inside the notebook. From testing one system, the range measured experimentally (with no effort made to thermally isolate the CPU) has been 3.9 to 4.9. Values closer to zero are obtained by positioning the other high-power devices away from the CPU, and thermally grounding them to the outer case. (Section 5.4 describes how to measure R for a given system.)

5.4 Experimental Measurements are Essential

The θCA values given by Intel can be used as a rough "rule of thumb" to estimate likely thermal margins.
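As a rule-of-thumb aid, the headroom estimate of Sections 5.1 through 5.3 can be written as a small calculation. The Python sketch below is illustrative only, with invented function names. It uses the Section 5.2 example values (TC(max) = 85°C, θCA = 17.0°C/W, P = 3.43W, TA = 40°C) and R = 4; the board power of 3W is a hypothetical figure chosen purely to show the effect of the R × Pb term.

```python
def t_ambient_max(t_case_max_c, theta_ca, p_cpu_w, r=0.0, p_board_w=0.0):
    """T_A(max) = T_C(max) - theta_CA * P_CPU - R * P_b."""
    return t_case_max_c - theta_ca * p_cpu_w - r * p_board_w


def thermal_headroom(t_a_max_c, t_a_measured_c):
    """Positive headroom means the measured ambient is below T_A(max)."""
    return t_a_max_c - t_a_measured_c


T_CASE_MAX = 85.0  # degC, temperature axis intercept used for the graphs
THETA_CA = 17.0    # degC/W, slope for the 5V PQFP SX-33 example
P_CPU = 3.43       # W, Case 1 power from the Section 5.2 example
T_A_MEASURED = 40.0

# Graph value: board power neglected.
ta_max_cpu_only = t_ambient_max(T_CASE_MAX, THETA_CA, P_CPU)
headroom_cpu_only = thermal_headroom(ta_max_cpu_only, T_A_MEASURED)

# Adjusted for board power: R = 4 and a HYPOTHETICAL 3 W of board power.
ta_max_with_board = t_ambient_max(T_CASE_MAX, THETA_CA, P_CPU, r=4.0, p_board_w=3.0)
headroom_with_board = thermal_headroom(ta_max_with_board, T_A_MEASURED)

print(f"T_A(max), CPU only: {ta_max_cpu_only:.1f} degC, headroom {headroom_cpu_only:.1f} degC")
print(f"T_A(max), with board power: {ta_max_with_board:.1f} degC, headroom {headroom_with_board:.1f} degC")
```

Such a calculation only screens a design; it does not replace measuring TC on the complete system.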
If the thermal headroom calculated from the graphs is positive for a given design even after correcting for the board power (R = 4 would be conservative for a real notebook design), the design is most likely satisfactory. But even then, Intel recommends that the CPU TC be measured when the complete design can be run at full power, with TA at the maximum allowed by the designer's specifications, to be really sure that TC(max) will never be exceeded. If a conservative calculation of thermal headroom (including R × Pb) is negative, it is essential that the system be tested, and improvements in thermal mitigation be made until TC(max) is never exceeded.

There are several levels of thermal experiments that can be used. The simplest is to just measure the CPU TC, and make adjustments in the design until it never exceeds TC(max). Then the equations and graphs can be ignored; if the design meets the TC(max) specification, it does not matter if the (estimated) TA(max) is violated, as far as Intel's CPU is concerned. (Consideration should be given, however, to other components, such as a disk drive, that might have trouble due to high temperatures inside the notebook.)

Suppose, however, the design is expected to be used for several variations over time, e.g., an Intel486 DX CPU now, and an Intel486 DX2 CPU later, with perhaps higher power peripherals which can also increase Pb in later versions. These later versions with more power will likely require more thermal mitigation efforts, but simply measuring TC in the first version of the notebook will give little guidance about how much more thermal mitigation will be needed later. In this case, more detailed experiments on the first version, in order to build an accurate thermal model of the product line, can be very cost effective.
This can be done using the equation from Section 5.3 and solving for TC:

$$T_C = T_A + \theta_{CA} \times P_{CPU} + R \times P_b$$

In the preceding equation, TA is the actual air temperature outside the notebook during the experiment; PCPU is held fixed, and Pb is varied while the resulting TC is measured. (TA + θCA × PCPU) is the intercept of the resulting straight line, and R is the slope. The easiest way to measure R is to disconnect the CPU (so PCPU = 0) and measure TC with Vcc at the upper and lower limits of its range (e.g., 4.5V and 5.5V). Pb is obtained for each Vcc value by Vcc × Icc. A third data point requires no measurement; when Pb = 0, TC = TA.

When this semi-empirical model has been constructed for a given notebook design, it can be used to accurately determine thermal headroom for variations in both PCPU (plugging in a higher frequency CPU) and Pb (adding higher power peripherals). Of course, if the thermal mechanical design is subsequently modified, R should be measured again for the new version.

If the designers expect Pb to roughly track PCPU, and R is small (as it should be in a good design), an approximation to the above model may make measurements easier. Assume Pb = C × PCPU, where C is the coefficient relating board power to CPU power. The preceding equation for TC then becomes:

$$T_C = T_A + \theta_{CA} \times P_{CPU} + R \times P_b = T_A + \theta_{CA} \times P_{CPU} + R \times C \times P_{CPU} = T_A + (\theta_{CA} + R \times C) \times P_{CPU}$$

Then (θCA + R × C) becomes a new coefficient, say θCA', which can be measured by varying PCPU and Pb together by varying Vcc over its functional range.

5.5 Secondary Effects

Two kinds of secondary effects will be evaluated. The main model used in thermal analysis assumes a linear relationship between the temperature gradient and the rate of heat transfer, and also assumes a steady state (time independent) model. How valid are these assumptions?
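Before those assumptions are examined, the Section 5.4 measurement procedure can be illustrated with a short Python sketch: fit a straight line to (Pb, TC) points taken with the CPU disconnected, read R from the slope, then reuse the model to predict TC for a different configuration. The measurement values below are invented for illustration only, and the helper names are assumptions rather than anything defined in this application note.

```python
# Sketch: estimate the coupling coefficient R from CPU-off measurements,
# then predict TC for an upgraded configuration. All numbers are made up.

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_tc(ta, theta_ca, p_cpu, r, p_board):
    """TC = TA + theta_CA * P_CPU + R * Pb."""
    return ta + theta_ca * p_cpu + r * p_board

# Hypothetical CPU-off data (P_CPU = 0): board power in W, case temp in C.
# The first point is the free one: Pb = 0 gives TC = TA.
board_power = [0.0, 2.0, 3.0]
case_temp = [25.0, 33.0, 37.0]

r, ta = fit_line(board_power, case_temp)   # slope is R, intercept is TA
tc = predict_tc(ta=40.0, theta_ca=17.0, p_cpu=2.15, r=r, p_board=2.5)
print(f"R = {r:.2f} C/W, intercept = {ta:.1f} C, predicted TC = {tc:.1f} C")
```

A least-squares fit is used here so that more than two or three measured points can be included; with exactly two points it reduces to the slope between them.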
Heat transfer by conduction is governed by a linear relationship between the temperature difference and the rate of heat transfer, and heat transfer by convection can be approximated by a linear relationship, as described in Section 2.1. However, heat transfer by radiation is proportional to the fourth power of the absolute temperatures involved. To demonstrate the contribution that heat transfer by radiation makes to cooling the CPU, consider the largest allowed TC (85°C) and the smallest TA that most designs would find acceptable (40°C). One of the larger CPU packages is 4.4 cm square. Assume the largest emissivity (let ε = 1). The temperatures must be converted to kelvins by adding 273. Using the formula from Section 2.1, q, the radiative heat transfer in watts, is:

$$q = \epsilon \sigma A (T_C^4 - T_A^4) = 1 \times (5.67 \times 10^{-8}\ \mathrm{W/(m^2 K^4)}) \times (0.044\ \mathrm{m})^2 \times ((358\ \mathrm{K})^4 - (313\ \mathrm{K})^4) = 0.75\ \mathrm{W}$$

To determine the significance of heat transfer by radiation in this case, one compares the 0.75W just calculated to the total heat transfer predicted by the linear approximation using the experimentally determined θCA. By rearranging the equation

$$\theta_{CA} = \frac{T_C - T_A}{P}$$

for P, one obtains

$$P = \frac{T_C - T_A}{\theta_{CA}}$$

TC - TA = 45°C in this case, and θCA ranges from 15.5°C/W to 32.0°C/W for the CPUs covered in this article. These figures give a total heat power dissipation ranging from 1.41W to 2.90W. Thus, the radiative component varies between one fourth and one half of the total. This explains why the addition of black paint on a notebook case improved θJA measurably (see Section 6).

The experimental determination of θCA effectively incorporates the radiative component, along with the conductive and convective components, in a combined linear approximation, which will be accurate for temperatures near the values used for the measurement. The effect of the radiative, nonlinear component will be beneficial in that the actual power radiated away from the CPU, when temperatures are higher than those used in the θCA measurements, will be greater than predicted by the linear model. This means that TC will not increase as much as predicted by the model, for a given increase in power.

The time independent assumption is fine if one is content to build a system to tolerate Case 1 maximum power, and with only passive thermal management (i.e., heat sinks and heat spreaders). However, if one assumes some power averaging over a typical mix of software, as in Cases 2, 3 and 4, one must consider the time dependent effects of bursts of maximum power, alternating with lower power periods. Also, if the design uses active thermal management, especially in a closed loop design with the system responding to a temperature sensor, the lag time between temperature sensing and response must be considered.

The experimental measurements taken on an Intel386 SL CPU notebook show how TCASE varies with time at different CPU power levels, and allow an approximate determination of thermal time constants (see Figure 5-1). The time constant (approximately 3 minutes for this notebook) is defined here as the time elapsed from when power was switched from Full On to Standby, to when the temperature has declined toward its Standby asymptote by 1/e. This means that it is reasonable to average the CPU power over approximately one minute when calculating average power generated by a mix of software. This is a large interval in CPU cycles (billions of CPU clocks).

Figure 5-1. TCASE of Intel386 SL CPU (PQFP) Running Indeo™ Video Software (plot of TCASE in °C versus time in seconds)

This time constant also indicates a significant lag between a temperature sensor reaching some action value (the action could be turning off the CPU clock for an interval), and a temperature response to that action.
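The time constant described above corresponds to a first-order exponential response, which can be sketched as follows. This is a simplified illustration, not a model taken from the measurements: the start and end temperatures are hypothetical, the function names are invented here, and only the 3 minute time constant is the figure quoted in the text.

```python
import math

# Sketch of a first-order thermal response after a step change in power.
# Temperatures below are hypothetical; tau is the ~3 minute value quoted
# in the text, expressed in seconds.

def case_temperature(t, t_start, t_final, tau):
    """Case temperature t seconds after the power step, in degrees C."""
    return t_final + (t_start - t_final) * math.exp(-t / tau)

def remaining_fraction(t, tau):
    """Fraction of the total temperature change still to come at time t."""
    return math.exp(-t / tau)

tau = 180.0                      # seconds
full_on, standby = 70.0, 40.0    # assumed asymptotes, degrees C

for seconds in (60.0, 180.0, 600.0):
    temp = case_temperature(seconds, full_on, standby, tau)
    left = remaining_fraction(seconds, tau)
    print(f"t = {seconds:5.0f} s: TCASE = {temp:.1f} C, "
          f"{left:.0%} of the change remaining")
```

In this simple model the temperature moves only part of the way toward its new asymptote within the first minute, which is one way to picture why a sensor-driven action takes time to show up as a temperature change.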
This means that the setpoint temperature (the temperature that triggers power reduction) should be somewhat below TC(max). Note also that the temperature does not come to equilibrium until about an hour after a major power change. This shows that when testing TC to assure that it does not exceed TC(max), one should run the notebook under worst case conditions continually for at least an hour.

6.0 IMPROVING THERMAL HEADROOM

By using the thermal management theories that have been reviewed here and keeping in mind the notebook platform limitation, the designer can apply proven thermal management techniques in several areas, including increasing thermal conduction and convection, optimum system layout, and power management techniques.

6.1 Improving Thermal Convection, Conduction and Radiation

The most obvious method for improving thermal convection is by adding a fan to circulate the air. Unfortunately, fans are a compromise in mobile designs because of extra power and space requirements, and electromechanical noise. Intel's Thermal/Mechanical Tools and Analysis Group has collected data using more realistic techniques for reducing TCASE. Data was collected from an actual Intel386 SL microprocessor notebook computer modified to measure TCASE. The case temperature was measured by applying 3W to the CPU with all other devices/components off.

6.1.1 BLACK PAINT

The inside of the notebook case was painted flat black using a paint that is highly absorptive in the infrared as well as the visible. Figure 6-1 shows that TCASE was improved by 2.3°C, or 3%. Table 6-1 shows that heat transfer improvement by radiation reduces θJA by 0.5°C/W. The remaining experiments were performed with the inside of the notebook case painted black.

6.1.2 COPPER FOIL

A copper foil (1" x 3" x 1.5 mil) was attached from the CPU to the bottom of the keyboard (constructed of aluminum material) and then to the plastic case using thermal grease. Figure 6-1 shows a 4.8°C (6%) and a 2.0°C (3%) improvement when the foil is connected from the CPU to the underside of the keyboard and from the CPU to the case, respectively. After connecting the CPU to the bottom of the keyboard, θJA improved because of the higher thermal dispersion by the aluminum plate. A thicker copper foil (1" x 3" x 10 mil) was then connected from the CPU to the keyboard bottom, which yielded an improvement of 11.4°C or 15%.

6.1.3 PERFLUOROCARBON FLUID AND SILICONE ELASTOMERS

Liquid heat sinks containing perfluorocarbon fluid can offer a reasonable substitute for standard heat sinks. The heat transfer coefficient for natural convection in a perfluorocarbon fluid is greater than that of natural air convection. Measurements were taken using a perfluorocarbon liquid heat sink connected between the CPU and the plastic case. Figure 6-2 shows a 7.5°C (10%) improvement in temperature.

A type of heat sink that helps blanket uneven surfaces is the silicone elastomer heat sink. These soft materials fill air gaps between hot components and the metal chassis. A piece of silicone elastomer was cut out to the same size as the CPU and placed between the CPU and the case, yielding a 1.5°C (2%) improvement (see Figure 6-2).

Table 6-1. Thermal Enhancements to θJA

Enhancement                          ΔθJA (°C/W)
Black Paint                          0.5
1.5 mil Copper Foil to Case          0.9
1.5 mil Copper Foil to Keyboard      1.8
10 mil Copper Foil to Keyboard       3.9

Figure 6-1. Black Paint and Copper Foil Thermal Enhancements (notebook temperatures with CPU power = 3.0W, all others off; bars include Black Case & 1.5 mil Cu Foil to Case, Black Case & 1.5 mil Cu Foil to Keyboard, Black Case & 10 mil Cu Foil to Keyboard)

Figure 6-2. Silicone Elastomer and Perfluorocarbon Thermal Enhancements (notebook temperatures with CPU power = 3.0W, all others off; bars: Without Enhancements, Case Painted Black, Silicone Elastomer to Black Case, Perfluorocarbon to Black Case)

6.2 Optimizing System Layout

Power must be optimally distributed to ensure the lowest TA. The objective is to reduce power density within the system to avoid hot spots at any particular device. Keeping the CPU away from batteries and power supplies is one challenge to the system designer. Thermally connecting the CPU case to the PC case can increase thermal area, thus greatly improving thermal spreading. There are many opportunities for creativity in transferring heat away from the CPU.

6.3 Effective System Power Management

One technique which has become a standard in notebook designs is using Intel's System Management Mode (SMM) to effectively monitor system activity and shut off devices or slow or control clocks when low activity is detected. Another approach could be to monitor the CPU activity or temperature and slow or stop the CPU clock when a long period of system inactivity or a high temperature is detected.

7.0 CONCLUSION

Notebook system designers need to pay close attention to the special thermal requirements for future notebook designs. These include demands for increasing computing power, and thus power consumption, in decreasing sizes. With proper thermal design, the CPU can be kept below its stated maximum case temperature, even under worst case conditions.

Experiments have shown that θCA as measured in open air in factory tests can be used for a rough estimate for an actual notebook PC design, where TA is the air temperature outside the notebook. Using this θCA to estimate thermal headroom provides the simplest model. Experiments have also shown that adding a term to the thermal equation that includes the board power (Pb), with a coupling coefficient (R), provides a better model. This enhanced model takes into account the effect of other heat sources inside the notebook on the CPU TC, and allows accurate prediction of the thermal effects of upgrades in a notebook design (adding a higher power CPU and/or peripherals on the board).
Also, a number of suggestions have been made about how to improve the thermal characteristics of a design, by various passive and active techniques.

The information within this application note allows the designer to use initially a simple analytical model, as well as to build a semi-empirical model to correlate with the actual design. Techniques have been shown not only to increase the average thermal margin, but to eliminate thermal margin problems under worst case operating conditions. By designing for the worst case, and taking actual measurements to guarantee proper operation within spec limits, the notebook designer can provide the highest quality, most reliable products for their customers.

APPENDIX A

Table A-1. Power Consumption and Thermal Specifications for SL Enhanced Intel486 CPUs
(Vcc and Vcc tolerance in V; frequency in MHz; TCASE in °C; Icc in A; θ values in °C/W)

CLK  CPU  Vcc  Freq  Pkg   TCASE  Vcc tol  Icc Active     Icc Stop Grant  Icc Stop Clock   θJC  θJA   θCA
                                           Typ    Max     Typ    Max      Typ     Max                 (calc)
1X   SX   3.3  25    SQFP  85     ±0.30    0.250  0.315   0.020  0.040    0.0001  0.001    4.0  36.0  32.0
1X   SX   3.3  33    SQFP  85     ±0.30    0.300  0.385   0.025  0.050    0.0001  0.001    4.0  36.0  32.0
1X   SX   5.0  25    PGA   85     ±0.25    0.430  0.560   0.035  0.065    0.0002  0.002    1.5  17.0  15.5
1X   SX   5.0  25    PQFP  85     ±0.25    0.430  0.560   0.035  0.065    0.0002  0.002    3.5  20.5  17.0
1X   SX   5.0  33    PGA   85     ±0.25    0.590  0.685   0.040  0.080    0.0002  0.002    1.5  17.0  15.5
1X   SX   5.0  33    PQFP  85     ±0.25    0.590  0.685   0.040  0.080    0.0002  0.002    3.5  20.5  17.0
1X   DX   3.3  33    SQFP  85     ±0.30    0.330  0.415   0.025  0.050    0.0001  0.001    3.5  25.0  21.5
1X   DX   5.0  33    PGA   85     ±0.25    0.500  0.630   0.040  0.080    0.0002  0.002    1.5  17.0  15.5
1X   DX   5.0  33    PQFP  85     ±0.25    0.500  0.630   0.040  0.080    0.0002  0.002    3.5  20.5  17.0
1X   DX   5.0  50    PGA   85     ±0.25    0.775  0.950   0.050  0.100    0.0002  0.002    1.5  17.0  15.5
1X   DX2  3.3  40    SQFP  85     ±0.30    0.375  0.450   0.020  0.040    0.0001  0.001    3.5  24.0  20.5
1X   DX2  3.3  50    SQFP  85     ±0.30    0.460  0.550   0.035  0.065    0.0001  0.001    3.5  24.0  20.5
1X   DX2  5.0  50    PGA   85     ±0.25    0.775  0.950   0.023  0.050    0.0002  0.002    1.5  17.0  15.5
1X   DX2  5.0  66    PGA   85     ±0.25    0.975  1.200   0.045  0.090    0.0002  0.002    1.5  17.0  15.5
2X   SX   3.3  25    SQFP  85     ±0.30    0.250  0.315   N/A    N/A      0.0001  0.001    4.0  36.0  32.0
2X   SX   3.3  33    SQFP  85     ±0.30    0.330  0.415   N/A    N/A      0.0001  0.001    4.0  36.0  32.0
2X   SX   5.0  25    PQFP  85     ±0.25    0.430  0.560   N/A    N/A      0.0002  0.002    3.5  20.5  17.0
2X   SX   5.0  33    PQFP  85     ±0.25    0.590  0.685   N/A    N/A      0.0002  0.002    3.5  20.5  17.0
2X   DX   3.3  33    SQFP  85     ±0.30    0.330  0.415   N/A    N/A      0.0001  0.001    3.5  25.0  21.5
2X   DX   5.0  33    PQFP  85     ±0.25    0.500  0.630   N/A    N/A      0.0002  0.002    3.5  20.5  17.0

Table A-2. Thermal Headroom based on Typical Power Consumption Profiles of SL Enhanced Intel486 CPUs
(Icc AVG in A; Power AVG in W; Thermal Headroom in °C; values in parentheses are negative)

                                           Case 1               Case 2               Case 3               Case 4
CLK  CPU  Vcc  Freq  Pkg   Tc  Vcc tol     Icc   Power  Hdrm    Icc   Power  Hdrm    Icc   Power  Hdrm    Icc   Power  Hdrm
1X   SX   3.3  25    SQFP  85  ±0.30       0.32  1.04   11.7    0.25  0.83   18.6    0.26  0.85   17.9    0.23  0.76   20.5
1X   SX   3.3  33    SQFP  85  ±0.30       0.39  1.27   4.3     0.30  0.99   13.3    0.31  1.02   12.4    0.28  0.92   15.6
1X   SX   5.0  25    PGA   85  ±0.25       0.56  2.80   1.6     0.43  2.15   11.7    0.44  2.22   10.7    0.40  2.00   14.0
1X   SX   5.0  25    PQFP  85  ±0.25       0.56  2.80   (2.6)   0.43  2.15   8.5     0.44  2.22   7.3     0.40  2.00   11.0
1X   SX   5.0  33    PGA   85  ±0.25       0.69  3.43   (8.1)   0.59  2.95   (0.7)   0.60  3.00   (1.5)   0.54  2.70   3.1
1X   SX   5.0  33    PQFP  85  ±0.25       0.69  3.43   (13.2)  0.59  2.95   (5.2)   0.60  3.00   (6.0)   0.54  2.70   (1.0)
1X   DX   3.3  33    SQFP  85  ±0.30       0.42  1.37   15.6    0.33  1.09   21.6    0.34  1.12   21.0    0.31  1.01   23.3
1X   DX   5.0  33    PGA   85  ±0.25       0.63  3.15   (3.8)   0.50  2.50   6.3     0.51  2.57   5.2     0.46  2.32   9.1
1X   DX   5.0  33    PQFP  85  ±0.25       0.63  3.15   (8.6)   0.50  2.50   2.5     0.51  2.57   1.4     0.46  2.32   5.6
1X   DX   5.0  50    PGA   85  ±0.25       0.95  4.75   (28.6)  0.78  3.88   (15.1)  0.79  3.96   (16.4)  0.72  3.58   (10.4)
1X   DX2  3.3  40    SQFP  85  ±0.30       0.45  1.49   14.6    0.38  1.24   19.6    0.38  1.26   19.1    0.35  1.14   21.7
1X   DX2  3.3  50    SQFP  85  ±0.30       0.55  1.82   7.8     0.46  1.52   13.9    0.47  1.55   13.3    0.42  1.40   16.4
1X   DX2  5.0  50    PGA   85  ±0.25       0.95  4.75   (28.6)  0.78  3.88   (15.1)  0.79  3.96   (16.4)  0.72  3.58   (10.4)
1X   DX2  5.0  66    PGA   85  ±0.25       1.20  6.00   (48.0)  0.98  4.88   (30.6)  1.00  4.99   (32.3)  0.90  4.50   (24.8)
2X   SX   3.3  25    SQFP  85  ±0.30       0.32  1.04   11.7    0.25  0.83   18.6    0.26  0.85   17.9    0.23  0.76   20.5
2X   SX   3.3  33    SQFP  85  ±0.30       0.42  1.37   1.2     0.33  1.09   10.2    0.34  1.12   9.3     0.31  1.01   12.7
2X   SX   5.0  25    PQFP  85  ±0.25       0.56  2.80   (2.6)   0.43  2.15   8.5     0.44  2.22   7.3     0.40  2.00   11.0
2X   SX   5.0  33    PQFP  85  ±0.25       0.69  3.43   (13.2)  0.59  2.95   (5.2)   0.60  3.00   (6.0)   0.54  2.70   (1.0)
2X   DX   3.3  33    SQFP  85  ±0.30       0.42  1.37   15.6    0.33  1.09   21.6    0.34  1.12   21.0    0.31  1.01   23.3
2X   DX   5.0  33    PQFP  85  ±0.25       0.63  3.15   (8.6)   0.50  2.54   2.5     0.51  2.57   1.4     0.46  2.32   5.6

NOTES:
Case temperature specifications assume a heat spreader and no heat sink.
CASE 1: Icc Active (max) = 100%
CASE 2: Icc Active (typ) = 100%
CASE 3: Icc Active (max) = 10%, Icc Active (typ) = 90%
CASE 4: Icc Active (max) = 10%, Icc Active (typ) = 80%, Icc Stop Clock (max) = 10%

Figure A-1. Maximum Thermal Headroom for SL Enhanced Intel486 SX CPUs (plot of ambient temperature in °C versus power in watts)

Figure A-2. Maximum Thermal Headroom for SL Enhanced Intel486 DX2 CPUs (plot of ambient temperature in °C versus power in watts)

APPENDIX B
THERMAL ENHANCEMENT VENDORS

Chomerics, Inc.
77 Dragon Court
Woburn, MA 01888-4014
(617) 935-4850
FAX: (617) 933-4318
Product ID: CHO-THERM A274 (Silicone Elastomer)

3M Corporation
Building 223-6S-04
3M Center
St. Paul, MN 55144-1000
(612) 733-3735 or (800) 833-5045
Product ID: Fluorinert Liquid FC-77 (Perfluorocarbon Fluid)

APPENDIX C
BIBLIOGRAPHY

SL Enhanced Intel486™ Microprocessor Data Sheet Addendum, Intel Corporation, 1993. Order Number 241696.
Intel386™ SL Microprocessor SuperSet Data Book, Intel Corporation, 1992. Order Number 240814.
1993 Packaging Handbook, Intel Corporation, 1993.
Order Number 240800.
Physics, Second Edition, Paul A. Tipler. Worth Publishers, Inc., 1982, pp. 531, 535.

AP-504
APPLICATION NOTE

Clock Throttling the SL Enhanced Intel486™ CPU in a Networked Environment

PHILIP BRACE
RICK BROWN
TODD ERDNER
JOSEPH MIDDLETON

May 1994

CONTENTS

1.0 INTRODUCTION
2.0 TEST HARDWARE
3.0 POWER SAVINGS WITH CLOCK THROTTLING
4.0 OVERVIEW OF NETWORK TEST PLAN
5.0 NETWARE 3.11 TEST SUMMARY
  5.1 Test Environment
  5.2 Test Description
  5.3 Test Results
6.0 NOVELL LITE TEST SUMMARY
  6.1 Test Environment
  6.2 Test Description
  6.3 Test Results
7.0 NOVELL PRINT SERVER TEST SUMMARY
  7.1 Test Environment
  7.2 Test Description
  7.3 Test Results
  7.4 Test Conclusions
8.0 INTEL COMPATIBILITY VALIDATION LAB TEST SUMMARY
9.0 RECOMMENDATIONS FOR CLOCK THROTTLING
APPENDIX A - STPCLK# Test Circuitry
APPENDIX B - System Power Measurements with Clock Throttling
APPENDIX C - Novell Lite Complete Test Results
APPENDIX D - Typical Network Interface Card Power Requirements

FIGURES

Figure 1 Clock Throttling
Figure 2 Test Hardware Setup
Figure 3 Stop Grant Bus Cycle
Figure 4 STPCLK# and INTR
Figure 5 Effect of Deasserting STPCLK# on INTR
Figure 6 Power vs STPCLK Duty Cycle, Interrupts do not Deassert STPCLK#
Figure 7 Power vs STPCLK Period, Duty Cycle = 50%, Interrupts Deassert STPCLK#
Figure 8 Power vs STPCLK Period, Duty Cycle = 90%, Interrupts Deassert STPCLK#
Figure 9 Test Environment
Figure 10 Test Environment
Figure 11 Performance Slowdown vs STPCLK# Period Relative to Peak Performance at 50% Duty Cycle (with and without STPCLK# Deassertion on INTR)
Figure 12 Performance Slowdown vs STPCLK# Period Relative to Peak Performance at 90% Duty Cycle (with and without STPCLK# Deassertion)
Figure A-1 STPCLK# Test Circuitry

TABLES

Table 1 Network Test Loading (Fixed 25% Loading, Interrupts Disabled)
Table 2 Test Results for Novell Print Server

1.0 INTRODUCTION

The SL Enhanced Intel486 microprocessors contain new features to enable simple, economical, and robust power management. The Energy Star program and a general worldwide trend towards responsible energy consumption have generated interest in providing computer equipment which can enter a low power state when not being used. The SL Enhanced Intel486 microprocessors enable a simple design which is capable of power managing both the CPU itself, and the system as a whole.

The SL Enhanced Intel486 microprocessors can be placed in a low power state through software or hardware. Execution of the HALT instruction will cause the CPU to automatically enter a state of approximately 20 mA to 55 mA called the Auto HALT Power Down state. The CPU will issue a normal HALT bus cycle when entering this state. The CPU will transition to the Normal state on the occurrence of INTR, NMI, SMI#, RESET, or SRESET.
The STPCLK# interrupt allows system hardware to control the power consumption of the CPU by stopping the internal clock (output of the PLL) to the CPU core in a controlled manner. When STPCLK# is asserted the SL Enhanced Intel486 CPU enters a low power state, the Stop Grant state, of approximately 20 mA to 55 mA at the conclusion of the current bus cycle. Deasserting STPCLK# will enable the processor to resume functioning on the next CLK cycle.

Periodically asserting and deasserting STPCLK# can result in significant power savings while still keeping a base level of CPU activity to ensure that interrupts are not missed, time of day lost, network connections dropped, etc. The process of rapidly asserting and deasserting STPCLK# to provide power management while maintaining a reduced level of system activity is referred to as clock throttling (see Figure 1). Clock throttling can be implemented with a variety of periods and duty cycles, or with an event driven period such as deassertion on INTR.

It has been suggested that clock throttling may cause problems and performance degradation with networked systems. This paper will discuss the power savings possible with clock throttling, and results of network testing. Included also is an overview of the power requirements of typical LAN cards.

Figure 1. Clock Throttling (waveforms: Icc, STPCLK#, internal CPU CLK, stop clock timer)

2.0 TEST HARDWARE

All testing was done with the use of a daughter card which was inserted into an Intel486 CPU PGA socket on a standard Intel486 CPU motherboard. This daughter card consisted of a socket for an SL Enhanced Intel486™ CPU, an input for a signal from a pulse generator from which STPCLK# was derived, and two PLDs to perform the following functions. Figure 2 shows the hardware setup.

• The STPCLK# specification requires that if STPCLK# is asserted, it must be held at least until the Stop Grant Bus Cycle is returned (see Figure 3). The PLD circuitry ensured that this specification was met regardless of the period and duty cycle of the signal from the pulse generator.
• At the conclusion of the STPCLK# period, once STPCLK# is deasserted it must remain deasserted for a minimum of 5 clocks before being asserted again (see Figure 3). Again the PLD circuitry ensured that this specification was met.
• Deassertion of STPCLK# on INTR. The PLD circuitry was designed so that STPCLK# could be deasserted on INTR if desired. A jumper controlled whether or not this occurred (see Figure 4). The logic was designed such that INTR could only deassert STPCLK# after the Stop Grant Bus Cycle had been returned.

Appendix A contains a schematic of the daughter card and the PLD equations.

Figure 2. Test Hardware Setup (diagram: pulse generator, daughter card with PLD circuit, INTR jumper and SL Enhanced i486™ CPU, i486™ CPU motherboard, network interface card, Ethernet cable, coax or twisted pair)

Figure 3. Stop Grant Bus Cycle (timing diagram of CLK, STPCLK#, ADDR and RDY#; STPCLK# is held at least until the Stop Grant Bus Cycle, and must be deasserted a minimum of 5 clocks after RDY#)

Figure 4. STPCLK# and INTR (waveforms: STPCLK# with no deassertion on INTR, STPCLK# with deassertion on INTR, INTR)

3.0 POWER SAVINGS WITH CLOCK THROTTLING

Asserting STPCLK# saves power on the SL Enhanced Intel486 CPU motherboard by reducing the current to the CPU, and by reducing the current to DRAM and external cache memory. As the CPU is effectively halted no memory cycles will occur.

In the case of clock throttling, as STPCLK# is asserted and deasserted, the motherboard will rapidly toggle between a low power and normal power state. The average of these two states will be the effective "sleeping state". It is obvious, therefore, that the duty cycle of STPCLK# will have a profound effect on the sleeping energy consumption.

In the implementation where STPCLK# is deasserted for the remainder of the current period with INTR (see Figure 4), the period of STPCLK# will also have an effect on the sleeping energy consumption. An unused sleeping system (AT-type PC) will experience a real time clock interrupt every 55 ms. As the period increases, the effect of deasserting STPCLK# on INTR reduces the effective duty cycle (see Figure 5).

Power consumption data was collected for a variety of duty cycles and periods, and with/without deassertion of STPCLK# on INTR. Please see Figures 6 through 8. Data for the charts is in Appendix B.

Figure 5. Effect of Deasserting STPCLK# on INTR (waveforms: STPCLK# with a 10 millisecond period, with and without deassertion on INTR; STPCLK# with a 30 millisecond period, with and without deassertion on INTR; INTR from the real time clock)

Figure 6. Power vs STPCLK Duty Cycle, Interrupts do not Deassert STPCLK# (plot of system power versus STPCLK# duty cycle in %)

Figure 7. Power vs STPCLK Period, Duty Cycle = 50%, Interrupts Deassert STPCLK# (plot of system power versus STPCLK# period)

Figure 8. Power vs STPCLK Period, Duty Cycle = 90%, Interrupts Deassert STPCLK# (plot of system power versus STPCLK# period in ms)

Looking at the graphs it is clear that the duty cycle of the STPCLK# pulse has the most effect on the overall system power. The period of the pulse also affects the power, though not as significantly as the duty cycle variation. The period of the pulse has the greatest effect when the duty cycle is set at 90%.
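The statement that the sleeping state is the average of the two power states can be made concrete with a small duty-cycle-weighted estimate. The sketch below is a simplification under stated assumptions: it considers CPU current only (not the whole motherboard that the measurements cover), it ignores transition overhead and any deassertion on INTR, the function name is invented here, and the current values are illustrative figures rather than results from these tests.

```python
# Sketch: duty-cycle-weighted average CPU current under clock throttling.
# Assumes an ideal square wave on STPCLK# and CPU current only; the
# current values are illustrative assumptions, not measured results.

def average_current(duty_asserted, icc_active, icc_stop_grant):
    """Average Icc when STPCLK# is asserted for duty_asserted of the time."""
    if not 0.0 <= duty_asserted <= 1.0:
        raise ValueError("duty_asserted must be between 0 and 1")
    return duty_asserted * icc_stop_grant + (1.0 - duty_asserted) * icc_active

VCC = 5.0              # volts
ICC_ACTIVE = 0.975     # amperes, assumed active current
ICC_STOP_GRANT = 0.045 # amperes, within the 20 mA to 55 mA range quoted

for duty in (0.0, 0.5, 0.9):
    icc = average_current(duty, ICC_ACTIVE, ICC_STOP_GRANT)
    print(f"duty {duty:.0%}: Icc = {icc:.3f} A, P = {icc * VCC:.2f} W")
```

Because the estimate is linear in the duty cycle, it cannot show the weaker dependence on period that arises when interrupts deassert STPCLK#; that effect would need the interrupt timing to be modeled as well.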
4.0 OVERVIEW OF NETWORK TEST PLAN Given the almost infinite matrix of network operating systems, network topologies, network interface cards, host system bus architectures, and potential STPCLK # periods and duty cycles, an exhaustive test of all permutations is not feasible. Therefore, testing centered on the Novell network operating system, the Ethemet™ network topology, and the ISA bus archi· tecture due to their overwhelming market share. An understanding of the test philosophy and test condi· tions can allow some extrapolation of results to other environments. Maintaining the network connection while in a low power state (sleeping) is mandatory for any energy effi· cient computer. A variety of network cards and STPCLK # periods and duty cycles were tested in a PC connected to a Novell NetWaret 3.11 network. Section 5 discusses the procedures and results of the testing of the ability of clock throttled systems to maintain a net· work connection. 2·518 Peer·to·peer networks have an added complication for sleeping PCs in that any user's workstation may be con· figured as a server as well as a client. A user's sleeping PC may be accessed by another user. This access may or may not be considered a wake·up event, depending on the particular power management scheme. Section 6 covers the performance implications of transferring files from a sleeping PC in a peer·to·peer network (No· vell Lite). A similar problem to the previous one involving peer to peer networks, can occur in classical client·server net· works where user's PCs are configured as network print servers. Section 7 discusses test results of printing to a remote sleeping print server in a Netware 3.11 environ· ment. Intel's Compatibility Validation Lab continually tests new Intel CPUs in a variety of machines to guarantee all new microprocessors are completely Intel compati· ble. These workstation tests are regularly performed over a network to facilitate the process. 
While an energy efficient computer would not normally be sleeping during these tests (as they emulate user activity), for research purposes these tests were completed with an aggressively clock-throttled machine. The details of these tests are in Section 8.

5.0 NETWARE 3.11 TEST SUMMARY

5.1 Test Environment

The test environment can be summarized with the following list:
• Novell 3.11 File Server running on an Intel486 DX2 66-MHz CPU
• Traffic Generating Station using Intel NetSight Professional
• One Traffic Monitoring Station using Intel LanSight
• Two dummy clients
• One test station with an SL Enhanced Intel486 DX2 66-MHz CPU
• Twisted Pair Ethernet Cable
• Hub

Figure 9. Test Environment (diagram labels: Test Station, Daughter Card)

5.2 Test Description

This section of the document describes the tests performed and provides explanations for some of the independent variables used in the test. As mentioned in Section 4, the primary motive of the test suites was to demonstrate that the network connection is not lost when the processor enters the power down or stop clock state. The variables initially considered to be a factor included: the network utilization, the period of the STPCLK# pulse, the duty cycle of the STPCLK# pulse, whether interrupts deasserted the STPCLK# signal or not, and several others. An exhaustive test of all possible values of all possible parameters was not feasible for the scope of this paper (and not particularly useful, as shown later), so several parameters were fixed at chosen values.

Network Utilization

The network utilization was fixed at 22.8%. [The traffic generator was set at 25% but measured values demonstrate that the actual utilization was 22.8%.] This was fixed because preliminary tests demonstrated that network load did not have a perceptible effect on the outcome of the test. Only broadcast packets or individually addressed packets are actually loaded into the network interface card's receive buffer. Given the assumption that the client is in a low power state because of minimal system activity, it follows that only a minute percentage of the network traffic will correspond to the sleeping station. 22.8% was chosen as the utilization of a fairly busy network.

Interrupts will not Deassert STPCLK#

In the real world environment, this parameter will be implemented by the power management logic of the particular design. If interrupts deassert STPCLK#, then all interrupts (like the real time clock, and many others) will bring the processor out of low power mode. This scenario will provide improved performance over the case where interrupts do not deassert STPCLK#. For our tests, the worst case implementation was chosen. That is, interrupts will not deassert the STPCLK# signal. If the tests succeed in this environment, they will succeed if interrupts deassert STPCLK#.

Test Station Idle

The test station is not performing any operations for the duration of the test. This parameter is based on our initial assumption that the client is asleep because there is no activity.

Time

To determine the length of time to test some of the cards, it was first necessary to determine how long the server takes to drop the connection when no activity is occurring. If the cable is actually disconnected from the network interface card, the server will drop the connection in less than fifteen minutes. A reasonable length of time to test would be 4x this value. So, each of the tests was run for at least one hour. As a sanity check, the STPCLK# pin was grounded (made active) and the length of time for the server to drop the connection was measured.

Table 1. Network Test Loading (Fixed 25% Loading, Interrupts Disabled)

Network Card               STPCLK# Period  Duty Cycle  Time    Pass/Fail  Server Disconnect Time
3Com Etherlink II          1 ms            90%         1.1 hr  pass       11 min
                           8 ms            90%         1.1 hr  pass
                           55 ms           90%         1.4 hr  pass
Ansel 2100                 1 ms            90%         1 hr    pass       13 min
                           8 ms            90%         1.4 hr  pass
                           55 ms           90%         1 hr    pass
Ansel 2200                 1 ms            90%         1 hr    pass       12 min
                           8 ms            90%         1 hr    pass
                           55 ms           90%         2.4 hr  pass
Intel EtherExpress™ Card   1 ms            90%         1 hr    pass       13 min
                           8 ms            90%         1 hr    pass
                           55 ms           90%         1 hr    pass
Kingston KNE 2121          1 ms            90%         1.4 hr  pass       11 min
                           8 ms            90%         1 hr    pass
                           55 ms           90%         1 hr    pass
SMC16                      1 ms            90%         1 hr    pass       11 min
                           8 ms            90%         1 hr    pass
                           55 ms           90%         2.4 hr  pass

NOTES:
90% is the percentage of time that STPCLK# is asserted (low).
Time is the time that each card was tested under 25% network load.
Pass/Fail: A card is considered to have passed if it maintains connection with the network for 1 hour or more.
Server Disconnect Time is the time that it takes for the network to drop a station that does not respond. This time was measured by grounding STPCLK# (completely halting the CPU).

5.3 Test Results

All of the LAN cards tested passed all of the tests with the initial conditions as described in Section 5.1. As all the tests passed, it is interesting to consider some of the factors at work. One factor is the packet size. The default Novell packet size is approximately 1.5 Kbytes. On a card with a receive buffer of 32K, twenty-one full size packets can be put into the buffer without dropping any. When little or no network traffic is being generated by a particular station, the number of packets addressed to the station is small. Generally speaking, the server will send "hello?" packets every few minutes to ensure that the clients are still maintaining the connection on the network.
It has been shown that the file server actually terminates the connection before the buffer would be filled with unopened messages. In the fifteen minutes that a server takes to confirm a lost connection, fewer than twenty packets are sent to the client. The client need respond to only one of these "hello" packets for the connection to be maintained.

Another related factor is the protocol itself. Normal Ethernet protocol specifies that if a packet is sent without a response, the packet is resent. So, in the event the client receive buffer does fill up and a packet is lost because the receive buffer is full, the initiating station will resend the message. The chance of dropping a packet is slim, and the chance of not recovering from a dropped packet is even less.

One scenario that has not been mentioned is the one in which the client goes to sleep in the midst of a large network operation. Even if this unlikely situation should occur, it will not affect the network connection. The protocol calls for a handshaking mechanism for all packets. A request is sent and a reply is received. In the event that the client enters a low power state, the CPU will request the data less frequently, and thus receive data less frequently than if operating in a fully awake state. So as the effective frequency of the processor decreases, so will the network bandwidth required by the operation in progress.

Another key consideration in the success of the tests was a simple matter of processor performance. Even when operating with a 90% duty cycle on STPCLK#, the SL Enhanced Intel486 processor compares very favorably with older processors still connected to many of today's networks.

In summary, it has been shown that the STPCLK# features of Intel's SL Enhanced processors will not corrupt the Novell 3.11 Client Server LAN or lose network connections under normal circumstances.
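The buffer argument in Section 5.3 can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original test suite. The 32K buffer, the roughly 1.5-Kbyte packet size, and the fifteen-minute disconnect window come from the text; the three-minute "hello" interval is only an assumed stand-in for "every few minutes", and the function names are illustrative.

```python
def packets_that_fit(buffer_bytes: int, packet_bytes: int) -> int:
    """Whole packets that fit in a receive buffer (partial packets do not count)."""
    return buffer_bytes // packet_bytes


def hellos_before_disconnect(window_min: int, hello_interval_min: int) -> int:
    """Keep-alive packets the server sends before it drops the connection."""
    return window_min // hello_interval_min


BUFFER_BYTES = 32 * 1024     # 32K receive buffer (from the text)
PACKET_BYTES = 1536          # "approximately 1.5 Kbytes" (from the text)
DISCONNECT_WINDOW_MIN = 15   # server gives up within about fifteen minutes
HELLO_INTERVAL_MIN = 3       # ASSUMPTION: stand-in for "every few minutes"

capacity = packets_that_fit(BUFFER_BYTES, PACKET_BYTES)
pending = hellos_before_disconnect(DISCONNECT_WINDOW_MIN, HELLO_INTERVAL_MIN)

print(f"full-size packets that fit in the buffer: {capacity}")
print(f"hello packets sent before disconnect:     {pending}")
print(f"buffer would overflow first:              {pending > capacity}")
```

Under these assumptions the server gives up long before an idle station's receive buffer could fill, which is consistent with the observation above.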
6.0 NOVELL LITE TEST SUMMARY

6.1 Test Environment

The test environment can be summarized with the following list:
• Novell Lite 1.1 server and client software running on SL Enhanced Intel486 DX 33-MHz CPU
• Traffic generating station using Intel NetSight Professional
• One traffic monitoring station using Intel NetSight Professional
• One test station with SL Enhanced Intel486 DX2 66-MHz CPU configured as Novell Lite 1.1 server and client
• Coaxial cable connecting the four stations

Figure 10. Test Environment (diagram labels: Monitor, Test, Server/Client, Traffic Generator)

6.2 Test Description

This section describes the tests performed and provides explanations for some of the independent variables in the test. As mentioned in Section 4, the primary motive of the Novell Lite testing was to determine the performance implications of clock throttling PCs in a peer-to-peer networking environment.

The performance measurements were done by running two different tests while varying the period of STPCLK# assertion from 1 to 55 ms, the duty cycle of STPCLK# assertion from 50% to over 90%, and allowing and not allowing interrupts to deassert STPCLK#. The tests consisted of running two batch files which transferred files to and from the host machine. The first batch file transferred 2 files 10 times each. These files had sizes of 41 Kbytes and 65 Kbytes. The second batch file transferred two large files, having sizes of 130 Kbytes and 333 Kbytes, twice to and from the host machine. Each of these tests was run on the seven Ethernet cards in the test suite. The results can be seen in Appendix C.

The range of the period from 1 to 55 ms was chosen because these are the minimum and maximum periods of typical implementations. The duty cycle was varied from 50% to 90% for maximum power savings.
Data was also taken in the cases of allowing and not allowing interrupts to deassert STPCLK#, as obtaining performance results in both situations is necessary for a complete analysis.

Network Utilization

The network utilization was fixed at 22.8%. [The traffic generator was set at 25% but measured values demonstrate that the actual utilization was 22.8%.] The utilization was fixed because preliminary tests indicated that network utilization did not have a perceivable effect on network connection failures or on network performance. This percentage is considered a heavy traffic load but makes the test more representative of an actual network. The packets generated by the traffic generator were sent to random addresses, and had no effect on the performance of the individual cards other than using up network bandwidth. The bogus packets broadcast on the network were 64 bytes long, and approximately 1,500 packets were sent per second.

Test Station Idle

Our test station was idle except when responding to the network requests generated by the test programs. This parameter is based on the initial assumption that the client is asleep because there is no activity.

6.3 Test Results

Performance

Performance slowdown is defined as the additional time the test took with clock throttling, relative to the time the test took without clock throttling, expressed as a percentage. A slowdown of 100% would mean that the tests took twice as long.

The highest performance levels were obtained by allowing interrupts to deassert STPCLK#. Using this method, the performance slowdown was never more than 60% relative to normal operating conditions. The best performance using STPCLK# at a 50% duty cycle was seen at a very short period (1 ms). The best performance seen at the 90% duty cycle occurs at the highest period tested, 50 ms. See Figures 11 and 12.

When interrupts were not allowed to deassert STPCLK#, the highest performance at a 90% duty cycle was on the shortest period tested.
While the performance degradation (120%) is significant, this may be acceptable in environments where "sleeping" PCs are rarely accessed, or, of course, in non peer-to-peer networks where this situation will not occur.

Figure 11. Performance Slowdown vs STPCLK# Period Relative to Peak Performance at 50% Duty Cycle (with and without STPCLK# Deassertion on INTR). [Plot: slowdown (%) versus period (ms); series: Deassert, No Deassert.]

Figure 12. Performance Slowdown vs STPCLK# Period Relative to Peak Performance at 90% Duty Cycle (with and without STPCLK# Deassertion). [Plot: slowdown (%) versus period (ms); series: Deassert, No Deassert.]

The knee in the graph at approximately 30 ms could be a function of the operating system's interaction with the I/O subsystem. As the period increases with the same duty cycle, the performance diminishes because the waiting time for the disk drive or LAN increases. For example, many small periods would be more efficient than fewer bigger periods. At some point, however, the system reaches the point where it can complete an entire job (i.e., one file copy request) in the period (around 30 ms). At this point, the efficiency suddenly improves. Then once again, the efficiency diminishes as the period increases: the system can do 1.1 jobs, 1.2 jobs, etc. Theoretically the system would see another increase in performance at the two job interval (around 60 ms).

A connection test was also performed. The systems were all run for at least one hour with STPCLK# asserted over 92% of the time with a period of 55 ms. During this time, interrupts did not deassert STPCLK#. No systems failed and no connections were lost. A period of 55 ms was chosen because this is the maximum period that a system can be asleep without losing time from the real time clock. All the cards were tested in this environment and all the cards passed. An explanation of the Novell Lite protocol shows this to be a reasonable result.

If a Novell Lite machine attempts to access another machine and its first message is not acknowledged, it will retry 15 times at approximately 220 ms intervals. A retry is only counted towards the 15 if the request is successfully transmitted onto the network media (no collision occurred). If there is heavy traffic on the network, it may transmit at longer intervals than 220 ms due to collisions.

Novell Lite also uses a stop and wait protocol. The stop and wait protocol mandates that after one packet is transmitted, the next one will not be transmitted until an acknowledgment is received. If a packet is lost, the protocol will retransmit the packet (or request) after a time-out period. Standard Ethernet was never assumed to be an error-free transmission medium, and procedures are already in place to handle lost packets. Therefore, even if the packet arrives successfully and for some reason is lost in the destination machine (which did not appear to happen), the sending machine will retransmit the packet since an acknowledgment was not received.
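Two quantities from Section 6.3 lend themselves to a quick calculation: the slowdown definition, and the time a sleeping station has to answer before a Novell Lite sender gives up. The sketch below is illustrative arithmetic only, not code from the test suite. The 15 retries, the 220 ms retry interval, and the 55 ms period come from the text; the 4 ms awake time per period is taken from the 55 ms, 92.73% rows of Appendix C; the function names and the sample times passed to `slowdown_percent` are hypothetical. Collisions are ignored, which is conservative because they only lengthen the retry window.

```python
def slowdown_percent(throttled_time: float, baseline_time: float) -> float:
    """Extra time taken under throttling, as a percentage of the baseline."""
    return (throttled_time - baseline_time) / baseline_time * 100.0


def retry_window_ms(retries: int, interval_ms: int) -> int:
    """Minimum time a sender keeps retrying before giving up."""
    return retries * interval_ms


def wake_windows(window_ms: int, stpclk_period_ms: int) -> int:
    """Complete STPCLK# periods (one awake interval each) inside the window."""
    return window_ms // stpclk_period_ms


RETRIES = 15              # from the text
RETRY_INTERVAL_MS = 220   # from the text (approximate)
STPCLK_PERIOD_MS = 55     # longest period tested
AWAKE_MS_PER_PERIOD = 4   # STPCLK# asserted 51 ms of every 55 ms (Appendix C)

window = retry_window_ms(RETRIES, RETRY_INTERVAL_MS)
periods = wake_windows(window, STPCLK_PERIOD_MS)
awake_total = periods * AWAKE_MS_PER_PERIOD

print(f"retry window:          {window} ms")
print(f"awake intervals in it: {periods}")
print(f"total awake time:      {awake_total} ms")
# Hypothetical times: a test that takes twice as long is a 100% slowdown.
print(f"slowdown for 10 s -> 20 s: {slowdown_percent(20.0, 10.0):.0f}%")
```

Even at the most aggressive setting tested, the sleeping station wakes many times inside a single retry window, which fits the observation that no connections were lost.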
7.0 NOVELL PRINT SERVER TEST SUMMARY

7.1 Test Environment

The test environment can be summarized with the following list:
• Novell 3.11 File Server running on an Intel486 DX2 66-MHz CPU
• Traffic Generating Station using Intel NetSight Professional
• One Traffic Monitoring Station using Intel LanSight
• Two dummy clients
• One test station with an SL Enhanced Intel486 DX2 66-MHz CPU
  - Configured as a Novell 3.11 network print server
  - Intel EtherExpress LAN card
• Twisted Pair Ethernet Cable
• Hub
• HP DeskJet 550C Printer

7.2 Test Description

The primary motive of this test is to show that there will be no significant loss in performance from a network workstation configured as a print server if it goes into power down mode. Testing was performed to compare normal network printing (STPCLK# disabled) with a worst case scenario in which STPCLK# is asserted and interrupts are disabled.

The test consisted of printing a bitmap file and a text file from a client station to the test station configured as a print server. Testing was performed with the test station fully "awake" and "asleep". The "asleep" or worst case test consisted of a 25% network load (as did the awake case), a STPCLK# duty cycle of 90%, and a STPCLK# period of 1 ms. The tests were performed using an Intel EtherExpress LAN card. The two file types used for testing were a 13 page 36K text file and a 308K bitmap file.

7.3 Test Results

Table 2. Test Results for Novell Print Server

File Type       STPCLK#    Duty Cycle  Period  File Print Time  Total Pages
36K text file   disabled   NA          NA      7.2 min          13
36K text file   enabled    90%         1 ms    8.0 min          13
308K bit map    disabled   NA          NA      3.5 min          1
308K bit map    enabled    90%         1 ms    3.8 min          1

NOTES:
STPCLK# disabled: CPU fully "awake".
STPCLK# enabled: CPU "asleep", 90% duty cycle, 1 ms period, interrupts disabled.
Duty Cycle: 90% is the percentage of time that STPCLK# is asserted (low).
Period: STPCLK# period.
Network Load: actual network load as recorded by network monitor was 22.8%.
File Print Time: time to print entire file beginning at client.

7.4 Test Conclusions

As expected, there was no significant difference between file print times for a fully "awake" CPU and a "sleeping" CPU. The slight variations in time can be due to the Novell protocol itself. First, the file to be printed is transferred over the network to the network server; it is then sent to the print server, where it is stored in a print queue. From there it is spooled into the printer itself. Although a "sleeping" CPU transfers data much more slowly from the LAN card to memory, and from memory to the printer, it is fast enough to keep up with the printer. The limiting factor here was not the CPU speed, but the printer speed. Also, note that this is a "worst" case, in which interrupts do NOT deassert STPCLK#. In summary, STPCLK# features do not have a significant impact on print server performance.

8.0 INTEL COMPATIBILITY VALIDATION LAB TEST SUMMARY

Intel's Compatibility Validation Lab is chartered with ensuring all Intel microprocessors are Intel compatible. To this end, they extensively test all new Intel CPUs in a variety of environments. Passing their tests is an indication that the device under test is Intel compatible. A PC equipped with the daughter card to enable clock throttling was submitted to the CV Lab for testing. In effect, the testing was to determine whether or not a clock throttled CPU was Intel compatible.

For logistic reasons, the CV Lab performs its workstation tests over a network. CV Lab's workstation tests consist of running various industry application packages. The test suites emulate typical user activity in applications such as Microsoft Excel, Word, etc.
An Energy Star system would normally be in a full on state during such activity; however, for study purposes this testing was done with a clock throttled system to investigate any issues involved with interaction between a clock throttled CPU and heavy network traffic.

Nine standard workstation tests were performed on a PC equipped with the daughter card to enable clock throttling. The following "typical" STPCLK# implementation was chosen.
1. Period equal to 8 ms
2. Duty Cycle equal to 90%
3. STPCLK# was deasserted for remainder of period on INTR

All tests were run successfully, and no network failures were observed. In conclusion, the CV Lab testing did not indicate any incompatibility between the STPCLK# clock throttled PC and the network.

9.0 RECOMMENDATIONS FOR CLOCK THROTTLING

Clock throttling provides a clean solution for hardware power management of the CPU and memory subsystem. Clock throttling offers an easy way to design a system which can enter a low power state (less than 30W) during periods of inactivity, while still providing adequate functionality to maintain a network connection and even respond to printing or data requests.

Hardware power management through clock throttling means that the power managed system will not have to depend on software drivers or a particular operating system to meet Energy Star compliance. This offers the OEM greater flexibility in the system configuration, and frees the end user from concerns over software upgrades.

It is important in an AT type personal computer that the real time interrupt (once every 55 ms) not be missed. This can be guaranteed by either choosing a clock throttling period of less than 55 ms, or by momentarily deasserting STPCLK# on the real time interrupt.

Clock throttling enables the system designer to choose the level of power savings necessary for a particular implementation.
It has been demonstrated that increasing the STPCLK# duty cycle dramatically affects the power requirements of the system, with little impact on performance of "sleeping systems". If the performance of "sleeping systems" is a major concern, then momentarily waking for system interrupts can obviate this degradation to a great extent. If the system will be awakened for the remainder of the STPCLK# period, as was done in the test daughter card, then the period of STPCLK# clock throttling will affect energy savings.

An alternative method of STPCLK# clock throttling is with a variable period driven by external events. Basically, STPCLK# is asserted until an external event (such as an interrupt) occurs. At this time STPCLK# is deasserted for a programmable period (2 ms-4 ms), and subsequently reasserted. This dynamic clock throttling scenario comes very close to always operating the CPU in a low power state, except when there is actually work to do (such as servicing the interrupt).

Finally, it should be noted that STPCLK# testing revealed no inconsistencies with clock throttling an Intel486 SL Enhanced CPU in a network environment, and therefore its proven network compatibility should be considered when judging the merits of using clock throttling as part of a power management program.

APPENDIX A
STPCLK# TEST CIRCUITRY

[Schematic: STPCLK# Daughter Card. Blocks shown: INTR PLD, STPCLK PLD, pulse generator output (STP_IN#), jumper, SL Enhanced Intel486 DX CPU, connection to motherboard.]

Figure A-1.
STPCLK# Test Circuitry

INTR PLD (PLDasm format)

Title     INTR and RDY
Pattern   1
Revision  C
Company   Intel Corporation
CHIP      INTR 16R4

;***********************************************************/
;* Works with STPCLK.PLD to deassert STPCLK on INTR after  */
;* making sure that the Stop Grant cycle has occurred      */
;***********************************************************/

; inputs
pin 1    CLOCK          ; CPU Clock
pin 2    /SYSRDY        ; RDY# from board
pin 3    INTR           ; INTR signal to CPU
pin 4    /STPCLK_OUT    ; STP_OUT from STPCLK.PLD
pin 5    /STP_GNT       ; Stop Grant Detect from STPCLK PLD
pin 10   GND            ; Ground for PLD
pin 11   /OE            ; Output Enable
pin 20   VCC            ; Power

; outputs
pin 12   /CPU_RDY       ; RDY# output to CPU
pin 13   /GNTRDY        ; Internal signal
pin 14   /INTLATCH
pin 15   /RDYDONE
pin 16   /RDYKILL
pin 17   /GNTLATCH
pin 18   /CPU_STPCLK    ; STPCLK# output to CPU
pin 19   /STOPGATE      ; Internal signal

EQUATIONS
VCC STOPGATE STPCLK_OUT*/INTLATCH + STOPGATE*STPCLK_OUT + STOPGATE*/RDYDONE GNTLATCH RDYKILL STP_GNT GNTLATCH GNTRDY GNTLATCH*/RDYKILL GENERATE RDY AFTER STOP GRANT FOR ONE BUS CYCLE SYSRDY RDYDONE RDYKILL + STOPGATE*RDYDONE INTLATCH INTR + STOPGATE*INTLATCH STOPGATE*/INTLATCH + HOLD RDYDONE TILL STPCLK GOES AWAY HOLD INTLATCH TILL STPCLK GOES AWAY STOPGATE*/RDYDONE
;end of equations

STPCLK PLD (CUPL Format)

Name      STPCLK.PLD;
Partno    85C060;
Revision  05;
Company   Intel Corporation, Inc.;
Assembly  CUPL;
Location  U00;
Device    EP600;

/* PLDXXXXX Intel 85c060 */
/***************************************************************/
/* Hardware implementation assert minimum STPCLK# duration     */
/* until CPU acknowledges with STOP GRANT CYCLE.               */
/***************************************************************/

< Inputs >
pin 1    CPUCLK
pin 2    A4
pin 3    M_IO
pin 4    D_C
pin 5    W_R
pin 6    /STP_IN       < asynchronous input from system >
pin 7    RESET
pin 11   /BOFF         < omit if not used >
pin 14   HLDA
pin 16   !ADS;

< Outputs >
pin 8    /STP_OUT      < STPCLK# output to CPU >
pin 9    STP_LATCH
pin 10   /STP_GNT
pin 15   GNT_DETECT;
pin 19   FLOATED;

< Logic Equations >
STP_LATCH.D STP_LATCH.AR STP_IN; RESET; STP_LATCH • !STP_GRANT_STATE • GNT_DETECT • STP_OUT ) • /RESET; + STP_GRANT_STATE • STP_OUT • /HLDA * /FLOATED * ADS STP_GNT * STP_OUT; STP_GRANT_STATE * STP_OUT * STP_IN * /HLDA * /FLOATED + GNT_DETECT * STP_IN;
< wait for /STP_LATCH to deassert AFTER STP_GNT active >
FLOATED.D FLOATED.ar BOFF; RESET;

APPENDIX B
SYSTEM POWER MEASUREMENTS WITH CLOCK THROTTLING

Period = 40 ms, STPCLK# Not Deasserted on Interrupts

STPCLK# Asserted (ms)  Duty Cycle  Current (Amps)  Power (Watts)
0                      0%          2.60            13.00
5                      12.5%       2.44            12.20
10                     25%         2.31            11.57
15                     37.5%       2.14            10.68
20                     50%         1.98            9.90
25                     62.5%       1.84            9.20
30                     75%         1.69            8.45
35                     87.5%       1.55            7.75

50% Duty Cycle, STPCLK# Deasserted on Interrupts

Period (ms)  DC Current (Amps)  Power (Watts)
1            2.02               10.09
5            2.05               10.23
10           2.05               10.25
15           2.06               10.31
20           2.07               10.37
25           2.09               10.45
30           2.13               10.64
35           2.13               10.65
40           2.14               10.68

90% STPCLK# Asserted, STPCLK# Deasserted on Interrupts

Period (ms)  DC Current (Amps)  Power (Watts)
1            1.52               7.61
5            1.56               7.78
10           1.60               8.01
15           1.65               8.25
20           1.70               8.49
25           1.74               8.70
30           1.79               8.95
35           1.83               9.15
40           1.88               9.41

APPENDIX C
NOVELL LITE COMPLETE TEST RESULTS

Network Card: Ansel 2000
Type: AUI/BNC/TPI
Extended Test: Passed

Stop Clock   Duty     STPCLK         STPCLK           Interrupts
Period (ms)  Cycle    Asserted (ms)  Deasserted (ms)  Enabled     Test 1  Test 2
             0.00%    N/A            N/A              N/A         12.74   7.41
1            50.00%   0.50           0.50             no          15.54   9.77
1            50.00%   0.50           0.50             yes         13.18   9.72
8            50.00%   4.00           4.00             no          16.31   10.21
8            50.00%   4.00           4.00             yes         13.12   7.68
20           50.00%   10.00          10.00            no          18.01   11.53
20           50.00%   10.00          10.00            yes         13.73   7.85
20           80.00%   16.00          4.00             no          28.72   19.88
20           80.00%   16.00          4.00             yes         13.45   7.96
30           50.00%   15.00          15.00            no          17.74   11.09
30           50.00%   15.00          15.00            yes         13.07   7.85
30           86.67%   26.00          4.00             no          46.57   32.46
30           86.67%   26.00          4.00             yes         13.45   8.45
30           53.33%   16.00          14.00            no          23.34   14.33
30           53.33%   16.00          14.00            yes         13.56   7.85
40           50.00%   20.00          20.00            no          18.34   11.58
40           50.00%   20.00          20.00            yes         13.4    7.96
40           90.00%   36.00          4.00             no          58.6    40.31
40           90.00%   36.00          4.00             yes         14.06   7.96
55           50.00%   27.50          27.50            no          18.56   11.47
55           50.00%   27.50          27.50            yes         13.4    7.63
55           92.73%   51.00          4.00             no          116.33  93.09
55           92.73%   51.00          4.00             yes         13.4    7.79

Network Card: Ansel 2100
Type: AUI/BNC/TPI
Extended Test: Passed

Stop Clock   Duty     STPCLK         STPCLK           Interrupts
Period (ms)  Cycle    Asserted (ms)  Deasserted (ms)  Enabled     Test 1  Test 2
             0.00%    N/A            N/A              N/A         12.24   7.19
1            50.00%   0.50           0.50             no          13.12   7.85
1            50.00%   0.50           0.50             yes         13.01   7.52
8            50.00%   4.00           4.00             no          15.59   9.66
8            50.00%   4.00           4.00             yes         13.12   7.46
20           50.00%   10.00          10.00            no          16.42   12.027
20           50.00%   10.00          10.00            yes         13.01   7.3
20           80.00%   16.00          4.00             no          24.88   16.64
20           80.00%   16.00          4.00             yes         13.07   7.57
30           50.00%   15.00          15.00            no          17.13   10.32
30           50.00%   15.00          15.00            yes         13.18   7.36
30           86.67%   26.00          4.00             no          34.21   24.27
30           86.67%   26.00          4.00             yes         12.96   7.41
30           53.33%   16.00          14.00            no          18.29   10.76
30           53.33%   16.00          14.00            yes         12.96   7.36
40           50.00%   20.00          20.00            no          18.34   10.98
40           50.00%   20.00          20.00            yes         12.57   7.25
40           90.00%   36.00          4.00             no          45.64   31.25
40           90.00%   36.00          4.00             yes         12.79   7.57
55           50.00%   27.50          27.50            no          20.65   12.85
55           50.00%   27.50          27.50            yes         12.52   7.36
55           92.73%   51.00          4.00             no          77.06   52.06
55           92.73%   51.00          4.00             yes         12.85   7.46

Network Card: Eagle NE2000 Plus 3
Type: AUI/BNC/TPI
Extended Test: Passed

Stop Clock   Duty     STPCLK         STPCLK           Interrupts
Period (ms)  Cycle    Asserted (ms)  Deasserted (ms)  Enabled     Test 1  Test 2
             0.00%    N/A            N/A              N/A         12.68   7.63
1            50.00%   0.50           0.50             no          13.23   8.07
1            50.00%   0.50           0.50             yes         12.63   8.40
8            50.00%   4.00           4.00             no          17.30   10.98
8            50.00%   4.00           4.00             yes         14.33   8.51
20           50.00%   10.00          10.00            no          15.65   9.80
20           50.00%   10.00          10.00            yes         13.56   8.18
20           80.00%   16.00          4.00             no          21.53   14.22
20           80.00%   16.00          4.00             yes         15.04   9.17
30           50.00%   15.00          15.00            no          16.86   10.71
30           50.00%   15.00          15.00            yes         13.45   8.01
30           86.67%   26.00          4.00             no          27.51   19.60
30           86.67%   26.00          4.00             yes         14.72   10.21
30           53.33%   16.00          14.00            no          16.75   10.76
30           53.33%   16.00          14.00            yes         13.51   8.01
40           50.00%   20.00          20.00            no          16.25   10.10
40           50.00%   20.00          20.00            yes         13.67   8.01
40           90.00%   36.00          4.00             no          27.90   19.11
40           90.00%   36.00          4.00             yes         14.55   9.28
55           50.00%   27.50          27.50            no          17.19   11.14
55           50.00%   27.50          27.50            yes         13.12   8.01
55           92.73%   51.00          4.00             no          34.27   24.38
55           92.73%   51.00          4.00             yes         15.37   9.22

Network Card: Intel EtherExpress 16C
Type: AUI/BNC/TPI
Extended Test: Passed

Stop Clock   Duty     STPCLK         STPCLK           Interrupts
Period (ms)  Cycle    Asserted (ms)  Deasserted (ms)  Enabled     Test 1  Test 2
             0.00%    N/A            N/A              N/A         12.52   7.41
1            50.00%   0.50           0.50             no          13.4    8.12
1            50.00%   0.50           0.50             yes         12.9    7.85
8            50.00%   4.00           4.00             no          15.65   9.94
8            50.00%   4.00           4.00             yes         13.4    7.74
20           50.00%   10.00          10.00            no          17.13   10.87
20           50.00%   10.00          10.00            yes         13.4    7.74
20           80.00%   16.00          4.00             no          24.49   16.69
20           80.00%   16.00          4.00             yes         13.45   8.01
30           50.00%   15.00          15.00            no          17.13   10.49
30           50.00%   15.00          15.00            yes         13.07   7.74
30           86.67%   26.00          4.00             no          35.26   24.82
30           86.67%   26.00          4.00             yes         13.29   7.96
30           53.33%   16.00          14.00            no          17.63   10.82
30           53.33%   16.00          14.00            yes         13.23   7.85
40           50.00%   20.00          20.00            no          18.23   11.36
40           50.00%   20.00          20.00            yes         13.56   7.68
40           90.00%   36.00          4.00             no          45.86   32.07
40           90.00%   36.00          4.00             yes         13.45   7.96
55           50.00%   27.50          27.50            no          20.26   12.68
55           50.00%   27.50          27.50            yes         12.9    8.01
55           92.73%   51.00          4.00             no          89.58   68.54
55           92.73%   51.00          4.00             yes         13.12   8.07

Network Card: Kingston KNE2121
Type: BNC/TPI
Extended Test: Passed

Stop Clock   Duty     STPCLK         STPCLK           Interrupts
Period (ms)  Cycle    Asserted (ms)  Deasserted (ms)  Enabled     Test 1  Test 2
             0.00%    N/A            N/A              N/A         12.24   7.14
1            50.00%   0.50           0.50             no          13.18   7.9
1            50.00%   0.50           0.50             yes         12.85   7.57
8            50.00%   4.00           4.00             no          16.25   10.16
8            50.00%   4.00           4.00             yes         13.01   7.74
20           50.00%   10.00          10.00            no          17.08   10.87
20           50.00%   10.00          10.00            yes         13.07   7.63
20           80.00%   16.00          4.00             no          26.03   18.45
20           80.00%   16.00          4.00             yes         13.18   7.74
30           50.00%   15.00          15.00            no          17.08   10.54
30           50.00%   15.00          15.00            yes         13.12   7.63
30           86.67%   26.00          4.00             no          36.47   24.93
30           86.67%   26.00          4.00             yes         12.96   7.52
30           53.33%   16.00          14.00            no          17.68   11.14
30           53.33%   16.00          14.00            yes         12.85   7.68
40           50.00%   20.00          20.00            no          17.63   11.04
40           50.00%   20.00          20.00            yes         12.79   7.57
40           90.00%   36.00          4.00             no          45.31   32.24
40           90.00%   36.00          4.00             yes         13.29   8.07
55           50.00%   27.50          27.50            no          19.44   12.19
55           50.00%   27.50          27.50            yes         12.85   7.46
55           92.73%   51.00          4.00             no          71.67   51.3
55           92.73%   51.00          4.00             yes         13.73   7.68

Network Card: SMC Elite 16C Ultra
Type: AUI/BNC/TPI
Extended Test: Passed

Stop Clock   Duty     STPCLK         STPCLK           Interrupts
Period (ms)  Cycle    Asserted (ms)  Deasserted (ms)  Enabled     Test 1  Test 2
1            50.00%   0.50           0.50             no          14.61   8.07
1            50.00%   0.50           0.50             yes         13.23   7.85
8            50.00%   4.00           4.00             no          16.14   10.43
8            50.00%   4.00           4.00             yes         13.51   7.96
20           50.00%   10.00          10.00            no          17.08   11.04
20           50.00%   10.00          10.00            yes         13.73   8.23
20           80.00%   16.00          4.00             no          23.06   19.6
20           80.00%   16.00          4.00             yes         14.44   8.51
30           50.00%   15.00          15.00            no          16.69   10.32
30           50.00%   15.00          15.00            yes         13.51   7.74
30           86.67%   26.00          4.00             no          37.45   26.47
30           86.67%   26.00          4.00             yes         14.33   8.78
30           53.33%   16.00          14.00            no          12.9    10.1
30           53.33%   16.00          14.00            yes         13.45   7.85
40           50.00%   20.00          20.00            no          18.01   11.36
40           50.00%   20.00          20.00            yes         13.34   7.9
40           90.00%   36.00          4.00             no          48.05   28.78
40           90.00%   36.00          4.00             yes         13.67   8.07
55           50.00%   27.50          27.50            no          19.88   12.24
55           50.00%   27.50          27.50            yes         13.34   8.29
55           92.73%   51.00          4.00             no          59.75   46.13
55           92.73%   51.00          4.00             yes         13.62   8.45

Network Card: 3Com Etherlink III
Type: AUI/BNC/TPI
Extended Test: Passed

Stop Clock   Duty     STPCLK         STPCLK           Interrupts
Period (ms)  Cycle    Asserted (ms)  Deasserted (ms)  Enabled     Test 1  Test 2
             0.00%    N/A            N/A              N/A         13.07   7.41
1            50.00%   0.50           0.50             no          14.17   8.23
1            50.00%   0.50           0.50             yes         13.62   7.96
8            50.00%   4.00           4.00             no          17.9    10.98
8            50.00%   4.00           4.00             yes         13.73   8.07
20           50.00%   10.00          10.00            no          17.24   10.93
20           50.00%   10.00          10.00            yes         13.45   8.07
20           80.00%   16.00          4.00             no          26.36   17.63
20           80.00%   16.00          4.00             yes         13.73   8.18
30           50.00%   15.00          15.00            no          17.41   10.71
30           50.00%   15.00          15.00            yes         13.18   7.79
30           86.67%   26.00          4.00             no          39.49   28.56
30           86.67%   26.00          4.00             yes         13.62   8.12
30           53.33%   16.00          14.00            no          17.57   10.82
30           53.33%   16.00          14.00            yes         13.45   7.9
40           50.00%   20.00          20.00            no          17.63   11.09
40           50.00%   20.00          20.00            yes         13.18   7.52
40           90.00%   36.00          4.00             no          56.29   37.84
40           90.00%   36.00          4.00             yes         13.18   7.57
55           50.00%   27.50          27.50            no          17.85   10.82
55           50.00%   27.50          27.50            yes         13.12   7.57
55           92.73%   51.00          4.00             no          56.79   37.84
55           92.73%   51.00          4.00             yes         13.34   7.74

Averages

Stop Clock   Duty     STPCLK         STPCLK           Interrupts
Period (ms)  Cycle    Asserted (ms)  Deasserted (ms)  Enabled     Test 1  Test 2
             0.00%    N/A            N/A              N/A         12.61   7.36
1            50.00%   0.50           0.50             no          13.89   8.29
1            50.00%   0.50           0.50             yes         13.06   8.12
8            50.00%   4.00           4.00             no          16.45   10.34
8            50.00%   4.00           4.00             yes         13.46   7.88
20           50.00%   10.00          10.00            no          16.94   11.01
20           50.00%   10.00          10.00            yes         13.42   7.86
20           80.00%   16.00          4.00             no          25.01   17.59
20           80.00%   16.00          4.00             yes         13.77   8.16
30           50.00%   15.00          15.00            no          17.15   10.60
30           50.00%   15.00          15.00            yes         13.23   7.73
30           86.67%   26.00          4.00             no          36.71   25.87
30           86.67%   26.00          4.00             yes         13.62   8.35
30           53.33%   16.00          14.00            no          17.74   11.25
30           53.33%   16.00          14.00            yes         13.29   7.79
40           50.00%   20.00          20.00            no          17.78   11.07
40           50.00%   20.00          20.00            yes         13.22   7.70
40           90.00%   36.00          4.00             no          46.81   31.66
40           90.00%   36.00          4.00             yes         13.57   8.07
55           50.00%   27.50          27.50            no          19.12   11.91
55           50.00%   27.50          27.50            yes         13.04   7.76
55           92.73%   51.00          4.00             no          72.21   53.33
55           92.73%   51.00          4.00             yes         13.63   8.06

APPENDIX D
TYPICAL NETWORK INTERFACE CARD POWER REQUIREMENTS

Card (Drivers Loaded)  Current (Amps)  Power (Watts)
Ansel 2000             0.33            1.65
Ansel 2100             0.54            2.7
Eagle NE2000           0.13            0.65
Intel EtherExpress     0.48            2.4
Kingston KNE2121       0.41            2.05
SMC Elite16C           0.46            2.3
3Com Etherlink III     0.09            0.45
Intel 82595            0.099           0.495

NOTE:
It is important to properly understand the results in the above table. The network cards with a power consumption of approximately 2.5W represent a class of cards with older generation technology. The NICs with power consumption below 1W represent the newer integrated single chip LAN controllers. For Energy Star compliant systems, 2.5W could represent as much as 17% of the power budget (depending on the efficiency of the power supply). So highly optimized systems with sufficient power budget could easily accommodate any network card. A system with a smaller marginal power budget may wish to consider a NIC with the newer technology.

AP-505
APPLICATION NOTE

Picking Up the Pace: Designing the IntelDX4™ Processor into Intel486™ Processor Based Desktop Systems

DAVID HARRIMAN
INTEL TECHNICAL MARKETING

July 1994
Order Number: 242034-001

CONTENTS

1.0 INTRODUCTION
2.0 HARDWARE RECOMMENDATIONS
2.1 Power Supply
2.1.1 Providing 3.3V in a 5V System
2.1.2 Choosing a Power Source
2.1.3 Power Supply Selection for Flexible Motherboards
2.1.4 VCC5 Pin Requirement
2.1.5 Explanation of 5V Tolerant Inputs and TTL Compatible Outputs
2.2 Processor Power Supply Decoupling
2.2.1 High Frequency Power Supply Decoupling
2.2.2 Bulk Power Supply Decoupling
2.2.3 Why is Decoupling Necessary?
2.3 Clock Multiplier Selection
2.4 Thermal Considerations
2.4.1 Voltage Regulator Considerations
2.4.2 Processor Considerations
2.5 Placement and Layout Suggestions
2.6 Intel Verification Program
2.7 Cache and Memory Considerations
2.7.1 Second Level Cache
2.7.2 Write-Back Support for the Future Pentium OverDrive Processor
3.0 SOFTWARE VISIBLE DIFFERENCES
3.1 Processor Identification
3.2 Cache Test Register Difference
4.0 SUMMARY
APPENDIX A: DESIGN CHECKLIST
APPENDIX B: DESIGN FLOWCHART
APPENDIX C: SUPPORT COMPONENT VENDOR LIST
APPENDIX D: TEST PROGRAM FOR BULK POWER SUPPLY DECOUPLING

1.0 INTRODUCTION

The IntelDX4™ processor is the newest and highest performing member of the Intel486™ processor family. At internal speeds of up to 100 MHz, the IntelDX4 processor is the fastest 486, designed for users who want the best value in 486 desktop computing today. With its larger 16K internal cache size and improved core speed, the 100 MHz IntelDX4 processor outperforms the 66 MHz IntelDX2™ processor by as much as 50%. Intel's unique 5V tolerant input buffers make this performance improvement achievable with minimal modifications to existing IntelDX2 processor based desktop designs.
This document provides a straightforward process for updating your IntelDX2 processor based desktop system design to match the potential of the IntelDX4 processor, while maintaining system design compatibility with previous generations of the Intel486 processor family.

The IntelDX4 processor is based on proven Intel486 technology and is compatible with the huge installed base of over 50,000 applications written for the Intel Architecture. To ensure end-user investment protection, IntelDX4 processor based systems should be verified for upgradability to a future Pentium™ OverDrive™ processor. This application note provides clear guidance on how to prepare your system for Pentium OverDrive processor verification.

NOTE:
Important recommendations that should be carefully addressed for a reliable design are highlighted in bold. These recommendations must be followed precisely to help ensure that your design will be ready for manufacture with minimal redesign.

The checklist and flowchart in Appendices A and B will help you quickly confirm that you have incorporated all of the critical design recommendations. By using these tools you can be confident that your system is ready to meet the new standard for 486 computing set by the 100 MHz IntelDX4 processor.

2.0 HARDWARE RECOMMENDATIONS

The following design recommendations cover upgrading an existing IntelDX2 processor based desktop system design to support the IntelDX4 processor. It is assumed that the existing IntelDX2 processor based design supports both the processor and its corresponding Pentium OverDrive processor using a single socket. Two categories of recommendations are included: those covering features that are new to the IntelDX4 processor, and those that apply to the IntelDX2 processor as well as the IntelDX4 processor but have renewed importance for the IntelDX4 processor.

Intel appreciates your interest in the IntelDX4 processor.
This document has been developed to allow you to minimize your investment in development time and bring a reliable design to market quickly.

Table 1. Hardware Differences between the IntelDX4™ Processor and the IntelDX2™ Processor and Their Respective Pentium™ OverDrive™ Processors

Columns:
  (A) IntelDX4™ Processor
  (B) IntelDX2™ Processor
  (C) Future Pentium™ OverDrive™ Processor for the IntelDX4™ Processor
  (D) Pentium™ OverDrive™ Processor for the IntelDX2™ Processor

                                (A)          (B)          (C)                      (D)
Max. Internal Speed             100 MHz      66 MHz       *                        *
Supply Voltage                  3.3V         5.0V         3.3V                     5.0V
Cache Size                      16 Kbytes    8 Kbytes     *                        *
168-Pin PGA Pin R17 (S18**)     CLKMUL       INC          CLKMUL                   INC
168-Pin PGA Pin S4 (T5**)       VOLDET       NC           VOLDET                   INC
168-Pin PGA Pin J1 (K2**)       VCC5         VCC          VCC5                     VCC
Pentium OverDrive Processor
Socket Pins J1, K1 and L1       N.A.         N.A.         5.0V for Fan/Heatsink    VCC

NOTES:
*  Contact your local Intel representative for details.
** Pentium OverDrive Processor socket pin number.

Table 1 summarizes the differences between the IntelDX4 processor and the IntelDX2 processor. Note that SMM and STPCLK operate identically to those of the SL Enhanced IntelDX2 processor. Also note that the future Pentium OverDrive processor for the IntelDX4 processor is different from the Pentium OverDrive processor for the IntelDX2 processor.

2.1 Power Supply

The new features of the IntelDX4 processor bring with them some new requirements on the processor power supply. Since the IntelDX2 processor is a 5V part, existing IntelDX2 processor based designs typically run all system logic at 5V. The IntelDX4 processor is a 3.3V part, so it is necessary to modify the processor power supply to provide this voltage. Care was taken in the design of the IntelDX4 processor to ensure that a single system board could be designed which would function with any Intel486 processor, while getting maximum performance from the IntelDX4 processor.

2.1.1 PROVIDING 3.3V IN A 5V SYSTEM

In most system board designs, the 5V system power supply is routed to the components on the board through a dedicated board layer. The new 3.3V supply for the IntelDX4 processor does not require a completely new power supply layer in the circuit board, because a 3.3V "island" can be created around the processor in the existing power supply plane. Figure 1 shows a recommended "island" layout. Note the connection from the 5V plane to the VCC5P pins, which will power the integrated fan/heatsink in the future Pentium OverDrive processor.

The IntelDX4 processor's 5V tolerant input buffers and TTL compatible outputs allow the processor to interface with existing TTL compatible external logic without requiring extra components. Thus, the processor can run at 3.3V while the system logic runs at 5V.

The "island" needs to be large enough to include the processor, the required power supply decoupling capacitance (see section 2.2), and the necessary connection to the 3.3V source. To minimize signal degradation, the gap between the 3.3V "island" and the 5V plane should be kept small. A typical gap size is about 0.02 inches.

Minimize the number of traces routed across the power plane gap, since each crossing introduces signal degradation due to the impedance discontinuity that occurs at the gap. For traces that must cross the gap, route them on the side of the board next to the ground plane to reduce or eliminate the signal degradation caused by crossing the gap. If this is not possible, route the trace to cross the gap at a right angle (90 degrees).

[Figure 1. Creating a Power "Island" (board layout drawing; legible labels: connection point for 3.3V source, 5V plane)]

[Figure 2. Recommended Power Supply Connection Layout (two panels: "3.3V Supply Using Linear Regulator" and "3.3V Supply Using System Power Supply"; legible labels: 3.3 Volt regulator (upright) and heatsink, use a wide trace to power supply connector)]

2.1.2 CHOOSING A POWER SOURCE

The three principal concerns when selecting a power source are the maximum load current, the minimum load current, and the response time. The processor power supply must be able to maintain correct voltage regulation at current levels below 0.5 mA for the IntelDX4 processor in the Stop Clock State, and up to the maximum current of 3.0A for the future Pentium OverDrive processor.

The power saving technology in Intel SL Enhanced processors, including the IntelDX4 processor, will cause the processor to switch to very low power levels during normal operation, even if external power management is not used. For example, executing a HALT instruction will cause the IntelDX4 processor to enter the Auto HALT Power Down State, which will cause a significant reduction in the current consumption of the processor in as little as 100 ns. The transition from HALT to the Normal State will cause current consumption to return to the normal levels in a similarly short period of time. The processor power supply must be able to maintain correct voltage regulation during these transitions.
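The load range stated above can be turned into a first-pass screening check for a candidate supply. The following Python sketch is illustrative only: the function name and its two parameters are hypothetical, and any regulator figures passed to it must come from the regulator manufacturer. Only the 0.5 mA and 3.0A limits are taken from this application note. The check does not cover response time, which must still be verified by measurement.

```python
# Load limits quoted in section 2.1.2 of this application note.
STOP_CLOCK_LOAD_A = 0.5e-3  # IntelDX4 processor may draw less than this in the Stop Clock State
MAX_LOAD_A = 3.0            # maximum current for the future Pentium OverDrive processor


def covers_processor_load_range(min_regulated_load_a, max_output_current_a):
    """First-pass screen of a supply against the processor load range.

    min_regulated_load_a: smallest load (amps) at which the supply still
        regulates correctly (hypothetical parameter, from the vendor data).
    max_output_current_a: largest current (amps) the supply can deliver
        while staying in regulation (hypothetical parameter).

    This is a necessary condition only: it does not model transient
    response or thermal limits.
    """
    regulates_at_light_load = min_regulated_load_a < STOP_CLOCK_LOAD_A
    delivers_peak_current = max_output_current_a >= MAX_LOAD_A
    return regulates_at_light_load and delivers_peak_current


if __name__ == "__main__":
    # Hypothetical supplies, for illustration only.
    print(covers_processor_load_range(0.0, 3.0))    # no minimum load, 3.0A capable
    print(covers_processor_load_range(0.010, 3.0))  # needs a 10 mA minimum load
    print(covers_processor_load_range(0.0, 1.5))    # cannot deliver 3.0A
```

A supply that fails the light-load half of this check is not necessarily unusable, but it would need additional design work that is outside the scope of this note.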
There are basically two options for supplying 3.3V to the processor: adding a 3.3V tap to the primary system power supply, or using on-board secondary regulation to derive 3.3V from the 5V system power supply. For on-board secondary regulation, a linear voltage regulator will perform adequately for most desktop and server designs. If low heat or power dissipation is a design goal, the higher complexity and cost of a switching regulator may be warranted. Switching regulators offer better efficiency, thereby lowering regulator power consumption and heat. See section 2.4 for related thermal considerations.

Figure 2 shows recommended layouts for power supply or linear regulator connection to the 3.3V "island."

Appendix C includes a list of possible vendors for power supplies and voltage regulators.

2.1.3 POWER SUPPLY SELECTION FOR FLEXIBLE MOTHERBOARDS

Using the voltage detect sense feature of the IntelDX4 processor, you may design a flexible system motherboard which will automatically use the proper processor voltage for an IntelDX4 processor or a different Intel486 processor. It is also possible to make the selection of processor voltage an option during system board assembly.

2.1.3.1 VOLDET Automatic Voltage Select Circuit Option

By sampling the VOLDET pin at power up, system boards can automatically select the processor power supply voltage, enabling a design that may use the IntelDX4 processor or a 5V Intel486 processor without jumpers or assembly time changes. The VOLDET pin is only present in the PGA package version of the IntelDX4 processor. This pin, which is an NC (No Connect) on previous Intel486 family processors, is connected internally to VSS on the IntelDX4 processor. This pin should be left unconnected in designs that do not use the voltage detect feature.

Figure 3 shows an example of the use of the VOLDET pin with a linear regulator circuit to automatically select the correct power supply voltage. If the VOLDET pin is not connected inside the processor, indicating a 5V part, the gate of MOSFET Q1 is pulled high, which causes it to bypass the 3.3V regulator, supplying 5V directly to the processor. Shorting the input of the regulator to the output in this way is harmless for most linear regulators, due to regulator feedback circuitry which shuts the regulator off (contact regulator manufacturers for specifics). Note that in this case, most regulators will require Q1 to handle all of the processor's current requirements, so it should be a high-current, low on-state-resistance MOSFET. If the VOLDET pin is connected to VSS, indicating a 3.3V part, the Q1 transistor is turned off, allowing the regulator to function normally. Figure 4 shows a suggested placement and layout for MOSFET Q1.

2.1.3.2 Other Voltage Selection Options

It is also possible to design a flexible system board where the processor supply voltage is selected by an assembly time option. There are several methods to achieve this; the key requirement is that the design must handle the maximum current of 3.0A for the future Pentium OverDrive processor. Note that normal jumpers may not be adequate, and it may be necessary to use several in parallel.

[Figure 3. Example Voltage Auto-Select Circuit Topology (Courtesy of Linear Technology Corporation) (schematic; legible labels: +5V, +12V, open drain/collector, R3 10K, VOLDET, Intel486™ processor VCC, linear regulator Vin/Vout/Vadj, MOSFET G/S/D, diodes D1 and D2)]

NOTES:
*  D1 internal on all MOSFETs
** D2 internal on some MOSFETs

[Figure 4. Suggested Placement and Layout for MOSFET Used in Optional Voltage Auto-Select Circuit (board layout drawing; legible label: Outline of Socket 3)]

2.1.4 VCC5 PIN REQUIREMENT

For mixed voltage systems where the processor interfaces with 5V components, the VCC5 pin must be connected to 5V for proper 5V tolerant buffer operation. The VCC5 input should not exceed VCC by more than 2.25V during power-up, power-down or during operation. If this requirement is not met, current flow through the pin may exceed 55 mA, and damage to the component may begin to occur.

To meet this requirement, one of two things must be done: either the power supply must be designed to turn on and off such that the difference between the VCC5 and VCC voltages never exceeds 2.25V, or a 100Ω resistor must be put in series with the VCC5 pin to limit the current through this path (Figure 5 shows a possible layout for this connection). The 100Ω series resistor is required for power supplies which do not meet the voltage difference specification, and also provides protection in the case of a power supply failure (where the 5V supply remains on, but the 3.3V supply goes to zero). Note that the VCC5P pins for the future Pentium OverDrive processor fan/heatsink unit should be connected directly to the 5V supply, and not through a series resistor.

[Figure 5. Possible Layout for VCC5P Pin Connection (board layout drawing; legible label: 3.3 Volt "Island")]

NOTE:
The future Pentium™ OverDrive™ processor requires a 5V supply through the VCC5P pins. The VCC5P pins must be connected directly to the 5V supply, and not through the VCC5 pin protection resistor.

[Figure 6. Voltage Relationships in an IntelDX4 Processor Based System with 5V External Logic (level diagram; inputs: maximum input voltage VCC5 + 0.3V, accepts input voltages in the 0 to 5V range, TTL Vih 2.0V, TTL Vil 0.8V; outputs are TTL compatible: Voh (min) 2.4V, TTL Vih 2.0V, TTL Vil 0.8V, Vol (max) 0.45V)]

2.1.5 EXPLANATION OF 5V TOLERANT INPUTS AND TTL COMPATIBLE OUTPUTS

The IntelDX4 processor and future Pentium OverDrive processor include 5V tolerant input buffers and TTL compatible outputs. This feature enables the processor to interface with existing 5V logic even though the processor is running at 3.3V internally.

In a system with the 3.3V IntelDX4 processor interfaced to 5V components, the VCC5P pin on the IntelDX4 processor must be attached to the 5V system power supply. The VCC5P pin provides a voltage reference for the processor's input buffers. With VCC5P connected to 5V, the processor can accept input signals up to 5.3V, making its inputs compatible with 5V TTL or CMOS level outputs.

Output voltages from the processor are guaranteed to be at or above 2.4V for a logic "1" and at or below 0.45V for a logic "0." This allows the IntelDX4 processor to drive TTL compatible input levels (2.0V and 0.8V), but not 5V CMOS levels ("rail to rail"). Figure 6 shows the input and output voltage relationships for the IntelDX4 processor in a 5V system.

In a 3.3V only system, the VCC5P pin should be connected to the 3.3V supply, as 5V tolerant operation is not required.

2.2 Processor Power Supply Decoupling

Processor power supply decoupling is critical for reliable operation. With the IntelDX4 processor, there are two areas of concern: high frequency decoupling, necessitated by the high speed operation of the processor, and low frequency decoupling, necessitated by the power saving features of the processor.

2.2.1 HIGH FREQUENCY POWER SUPPLY DECOUPLING

High frequency decoupling is critical on the IntelDX4 processor because of its high speed external bus, and because of its very fast 100 MHz internal operation. A reliable design will include a minimum of nine 0.1 µF capacitors and nine 0.01 µF surface mount capacitors between power and ground, evenly distributed, close to the processor. The capacitors must be placed as close to the processor as possible, attached directly to the power and ground planes, or circuit board inductance will significantly reduce their effectiveness.

A typical failure mode caused by inadequate high frequency decoupling is unreliable or inconsistent program behavior. These failures are often intermittent, and are very hard to debug.

Figure 7 shows a recommended layout for the high frequency capacitors, with values as shown.

Appendix C includes a list of possible vendors for power supply decoupling capacitors.

[Figure 7. Recommended High Frequency Capacitor Values and Layout (board layout drawing; legible labels: Outline of Socket 3, capacitor values 0.1 and 0.01)]

NOTE:
All values in microFarads

2.2.2 BULK POWER SUPPLY DECOUPLING

Bulk, or low frequency, decoupling is needed on all SL Enhanced Intel486 processors, including the IntelDX4 processor, since the processor may switch between normal and low power states very quickly, causing large instantaneous current changes. To properly handle these instantaneous current changes, all designs must have adequate bulk decoupling. In 5V only systems, the processor can use the bulk decoupling capacitance all over the system board, but with the processor on a separate power plane "island," it is necessary to place adequate bulk capacitance on the processor "island."
For bulk decoupling, multiple capacitors, each in the range of 10 µF to 100 µF, are typically used in parallel to achieve the required capacitance while maintaining a low effective series resistance (ESR). You can determine the amount of bulk decoupling required with the following formula:

$C \ge (\Delta I \cdot \Delta T) / \Delta V$

where $\Delta I$ is the maximum change in current, $\Delta T$ is the time it takes the power supply to adjust to the current change, and $\Delta V$ is the allowable voltage change to remain within specification.

The effective series resistance (ESR) must also be taken into account. You can find the maximum allowable ESR with this formula:

$ESR \le \Delta V / \Delta I$

where $\Delta V$ and $\Delta I$ are the same as in the first equation.

For example, for the future Pentium OverDrive processor, the maximum change in current is about 2.8A. The response time of a linear regulator may be around 15 µs (contact the regulator manufacturer for a precise value). With no guard band, the maximum allowable supply voltage deviation from 3.3V is 0.3V, yielding the following:

$C \ge (2.8\,A \cdot 15\,\mu s) / 0.3\,V = 140\,\mu F$

with a maximum allowable ESR:

$ESR \le 0.3\,V / 2.8\,A \approx 0.11\,\Omega$

Placing four 47 µF tantalum surface mount capacitors in parallel, directly between the power and ground planes, will reduce the ESR below this limit and provide adequate capacitance. Figure 8 shows a recommended layout for this example.

For an example program which exercises the power saving features, Appendix D includes a program that alternates between HALT and normal operating modes with a keypress. This represents a typical load change for bulk capacitance testing with the IntelDX4 processor, but does not cover the future Pentium OverDrive processor. The Intel Verification Program provides a Voltage Regulator Transient Tester in the IVP Pretest Kit. This tool simulates worst case load changes with the future Pentium OverDrive processor. Another tool available from Intel for power transient testing is the Power Validator, which includes on-board limit testers and failure indicators. To order the Power Validator or the IVP Pretest Kit, contact your local Intel representative.

Appendix C includes a list of possible vendors for power supply decoupling capacitors.

[Figure 8. Recommended Bulk Decoupling Capacitor Values and Locations (board layout drawing; legible labels: Outline of Socket 3, four capacitors marked 47)]

NOTE:
All values in microFarads

2.2.3 WHY IS DECOUPLING NECESSARY?

CMOS logic only consumes significant current when switching. This allows great power savings by shutting down elements of a processor not being used. With the IntelDX4 processor, virtually the entire chip may be shut down and quickly restarted. Switching this amount of current on and off in very short periods of time may cause serious power supply voltage surges and droops in systems with inadequate bulk decoupling. Adequate power supply bulk decoupling capacitance, located near the processor, is necessary to filter these surges and droops (see Figure 8).

The delayed response of voltage regulators to load increases is the principal cause of power supply droops, with the amount of droop depending on the response time of the regulator as well as power supply inductance. Adequate bulk capacitance is necessary to provide a current reservoir until the power supply or regulator can respond to the load increase.

Surges are caused by inductance in the power supply, their severity being determined by the value of the inductance and the speed with which the current load drops. Most regulators only boost the regulated supply if it falls below the specified voltage, with the regulator turning off if the output voltage rises above the specified voltage. This means that a fast regulator will not lessen the effect of voltage surges.

Another factor concerning the filtering ability of the bulk capacitance is the effective series resistance of the capacitor(s), which is an element of the non-ideal behavior of real components. This resistance must be low enough that it does not offset the desired filtering effect of the capacitance.

Figure 9 shows an oscilloscope measurement of a surge in the power supply at the processor in an IntelDX4 processor based system with poor low frequency decoupling capacitance as the processor enters the Auto HALT Power Down State.

In addition, the increased internal speed of the IntelDX4 processor over the IntelDX2 processor means higher frequency noise components in the processor power supply. Traffic on the external bus causes high frequency power supply current spikes due to the large number of external outputs switching. High frequency (low inductance) capacitors, connected between the power and ground planes, near the processor, are required to filter these high frequency components of the noise.

The inductive effects of circuit board traces and component leads become critical at these frequencies. For this reason, it is critical that the high frequency capacitors are placed as near as possible to the processor, using short traces to minimize inductance. Surface mount capacitors placed directly next to the processor and inside the socket cavity are recommended (see Figure 7).

Figure 10 shows an oscilloscope measurement of the power supply at the processor in an IntelDX4 processor based system with poor high frequency decoupling capacitance.
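The bulk decoupling formulas in section 2.2.2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a design tool: the function names are made up for this example, and the per-capacitor ESR of 0.4Ω used in the usage lines is a hypothetical figure (this note does not quote one), so substitute the value from the capacitor vendor. The parallel-bank helper assumes identical capacitors, for which capacitance adds and ESR divides by the number of parts.

```python
def min_bulk_capacitance(delta_i_a, response_time_s, delta_v_v):
    """C >= (delta_I * delta_T) / delta_V, in farads."""
    return delta_i_a * response_time_s / delta_v_v


def max_allowable_esr(delta_v_v, delta_i_a):
    """ESR <= delta_V / delta_I, in ohms."""
    return delta_v_v / delta_i_a


def parallel_bank(cap_f, esr_ohm, count):
    """Total capacitance and ESR of `count` identical capacitors in parallel."""
    return cap_f * count, esr_ohm / count


if __name__ == "__main__":
    # Values from the worked example: 2.8A load step, 15 us regulator
    # response time, 0.3V allowable deviation from 3.3V.
    c_min = min_bulk_capacitance(2.8, 15e-6, 0.3)
    esr_max = max_allowable_esr(0.3, 2.8)

    # Four 47 uF capacitors as recommended; 0.4 ohm each is HYPOTHETICAL.
    bank_c, bank_esr = parallel_bank(47e-6, 0.4, 4)

    print(f"required C   : {c_min * 1e6:.0f} uF")
    print(f"allowed ESR  : {esr_max:.3f} ohm")
    print(f"bank C / ESR : {bank_c * 1e6:.0f} uF / {bank_esr:.3f} ohm")
    print("bank adequate:", bank_c >= c_min and bank_esr <= esr_max)
```

Rerunning the calculation with the response time of the regulator actually chosen is worthwhile, since the required capacitance scales linearly with it.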
[Figure 9. Oscilloscope Measurement Showing Power Supply Surge with Processor Entering HALT Mode in System with Poor Bulk Decoupling (scope capture dated 25 Mar 1994; Ch1 Min 3.290 V, Ch1 Max 3.602 V, Ch1 Pk-Pk 312 mV)]

[Figure 10. Oscilloscope Measurement Showing Power Supply Noise in System with Poor High Frequency Decoupling (scope capture dated 25 Mar 1994; Ch1 Min 3.074 V, Ch1 Max 3.458 V, Ch1 Pk-Pk 384 mV)]

2.3 Clock Multiplier Selection

The clock multiplier on the IntelDX4 processor may be selected in the system using the CLKMUL pin. This pin is an INC (Internal No Connect) on the previous Intel486 processors. The CLKMUL pin is sampled during cold (power on) processor resets. The clock multiplier cannot be changed during warm resets, and SRESET cannot be used to select the clock multiplier ratio. Typically, CLKMUL will be connected via a jumper to the proper setting.

In systems with 33 MHz bus speeds, the IntelDX4 processor will operate with an internal clock frequency of 100 MHz if the CLKMUL pin is left unconnected, or is connected to VCC. Existing IntelDX2 processor based systems running at 66 MHz internally with a 33 MHz bus speed require no modification to use the IntelDX4 processor at 100 MHz internally and 33 MHz externally if the CLKMUL pin is left unconnected. In 50 MHz bus systems, it is necessary to connect the CLKMUL pin to VSS to achieve 100 MHz internal frequency operation. These relationships are shown in Table 2, and an example circuit is shown in Figure 11.

Table 2. Clock Multiplier Selection

CLKMUL at RESET       Clock Multiplier    External Clock Frequency    Internal Clock Frequency
VCC or Not Driven     3                   25 MHz                      75 MHz
                                          33 MHz                      100 MHz
VSS                   2                   50 MHz                      100 MHz

[Figure 11. Jumpers for Clock Multiplier Selections (schematic; legible labels: IntelDX4™ Processor, VCC, 3X, CLKMUL)]

2.4 Thermal Considerations

In desktop systems, proper thermal management is critical to prevent system reliability problems caused by excess heat. The principal concern is with the 3.3V power supply when implemented with secondary on-board regulation.

2.4.1 VOLTAGE REGULATOR CONSIDERATIONS

Special thermal consideration is required for systems implementing the 3.3V processor supply with a linear voltage regulator. The 10W dissipation of the future Pentium OverDrive processor requires a 3.0A power supply current. A linear voltage regulator will dissipate approximately 5W of power to provide 3.0A. To keep the voltage regulator within thermal specification, a heatsink will be necessary to dissipate the heat generated by the regulator. If the temperature of the regulator is not maintained within the manufacturer's specification, improper regulator operation may occur, jeopardizing regulator and processor reliability.

The size and performance of the voltage regulator heatsink are dependent upon the regulator specifications and system air flow. If the regulator is located adjacent to the upgrade processor socket, the ambient temperature should not exceed 55°C, the limit for the future Pentium OverDrive processor. The following formula may be used to calculate the performance of a heatsink in a particular application:

$\theta_{CA} = (T_J - T_A) / P_D - \theta_{JC}$

where $\theta_{CA}$ is the maximum allowable thermal resistance of the heatsink and insulator, $T_J$ is the maximum regulator junction temperature, $T_A$ is the maximum allowable ambient temperature, $P_D$ is the maximum power dissipated in the regulator, and $\theta_{JC}$ is the thermal resistance from the regulator junction to its case.

For example, for the Linear Technology LT1085CT regulator, the values would be 125°C for $T_J$, 55°C for $T_A$, and 3°C/W for $\theta_{JC}$, so:

$\theta_{CA} = (125\,°C - 55\,°C) / 5\,W - 3\,°C/W = 11\,°C/W$ (for the LT1085CT)

In this example, the thermal resistance of the heatsink and insulator ($\theta_{CA}$) cannot exceed 11°C/W to meet specifications in still air.

Appendix C includes a list of possible vendors for heatsinks.

2.4.2 PROCESSOR CONSIDERATIONS

The power consumption of the IntelDX4 processor is comparable to that of the IntelDX2 processor, so existing thermal solutions that adhere to the published specifications should be adequate for the IntelDX4 processor.

The future Pentium OverDrive processor upgrade for the IntelDX4 processor will have an on-package fan/heatsink and is thermally equivalent to the Pentium OverDrive processor for IntelDX2 processor based systems. The on-package fan/heatsink unit is powered by the VCC5P pins, which should be connected directly to the 5V system power supply.

2.5 Placement and Layout Suggestions

Figure 12 shows a complete suggested layout for the processor including the 3.3V power supply "island," adequate power supply decoupling, a linear voltage regulator, and a possible placement for the FET used with the VOLDET automatic voltage select circuit option. Note the placement of the capacitors close to the processor, and the wide connection from the voltage regulator to the "island."

Figure 13 shows the clearance required for the IntelDX4 processor and the future Pentium OverDrive processor. The future Pentium OverDrive processor is physically larger than the IntelDX4 processor, so it is not sufficient to merely provide clearance for the IntelDX4 processor if the system is to be upgradeable with the future Pentium OverDrive processor.

2.6 Intel Verification Program

The Intel Verification Program establishes minimum system design criteria for reliable and straightforward CPU upgradability with the Pentium OverDrive processors. The criteria encompass physical, functional, electrical, thermal, and installation attributes of the Pentium OverDrive processors. Upon successful system design verification by Intel, licensed OEMs will be able to advertise and promote their branded systems as "Intel Verified." An Intel published list of verified systems will promote consumer awareness, providing greater confidence in both system and upgrade buying decisions.

Converting An "Intel Verified: For the Pentium OverDrive Processor" System Design

The Intel Verification Program for IntelDX4 processor based system designs is similar to the recently introduced program for IntelDX2 processor based systems. The criteria are basically the same, with additional electrical and thermal tests. For IntelDX2 processor based designs that meet the criteria for the Pentium OverDrive processor, the additional criteria for the future Pentium OverDrive processor for the IntelDX4 processor pertain to the 3.3V supply specifications and voltage supply thermal requirements. This application note addresses both issues. Contact your local Intel representative for further information on the Intel Verification Program.

[Figure 12: suggested processor layout (board layout drawing; legible labels: 3.3 Volt, 100 Ohm resistor (1/2 W)). The caption and the remainder of the application note are missing from this extract.]
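The regulator heatsink calculation from section 2.4.1 can be sketched as follows. This Python fragment is an illustration under stated assumptions: the function names are invented for this example, and the dissipation helper assumes that a linear regulator drops the full difference between its 5V input and 3.3V output at the load current, which is consistent with the "approximately 5W" figure quoted in that section.

```python
def linear_regulator_dissipation_w(v_in, v_out, i_load_a):
    """Power dissipated in a linear regulator, assuming it drops
    (v_in - v_out) at the full load current."""
    return (v_in - v_out) * i_load_a


def max_heatsink_thermal_resistance(t_j_max_c, t_a_max_c, p_d_w, theta_jc):
    """theta_CA = (T_J - T_A) / P_D - theta_JC, in degrees C per watt."""
    return (t_j_max_c - t_a_max_c) / p_d_w - theta_jc


if __name__ == "__main__":
    # 5V in, 3.3V out, 3.0A for the future Pentium OverDrive processor.
    p_d = linear_regulator_dissipation_w(5.0, 3.3, 3.0)
    print(f"regulator dissipation: {p_d:.1f} W")

    # LT1085CT values quoted in section 2.4.1, using the rounded 5W figure.
    theta_ca = max_heatsink_thermal_resistance(125.0, 55.0, 5.0, 3.0)
    print(f"max heatsink + insulator resistance: {theta_ca:.1f} C/W")
```

Using the unrounded dissipation instead of 5W gives a slightly lower allowable thermal resistance, so a heatsink chosen right at the 11°C/W limit leaves little margin.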
32 PCO, PWT Paging Unit II 1 Address Drivers 32 20 Physicol Address Translation Lookaside Buffer Displacement 8us Instruction Decode Decoded Instruction Path Bk Byte Cache 12SR .., Prefelcher 32 I Code Stream '/2' I. I Bus Interface Cache Unit 2 Registers PLA l> 32 Descriptor Limit and ALU (/I Clock doubler 32-bit Dolo Bus 32 Byte Code , Queue 2 x 16 Bytes 32 ;:r CLK A2-A3 8EO#-1 E3# +---+ Write Buffers 4 x 80 ----- ... -----Dato Bus Transceivers Bus Control Request Sequencer 00-03 . ~, /R# o/c# ~/IO# PCO,PWT RDY# l OCK# PLOCK# BOF'F# A20t.4# BREO HOLD HLOA RESET INTR NI I FERR# GNNE# ------------ ~ 8RDY# 8LAST# Burst 8us Control ------------ ~ ------------- ~t ------------ ~' Bus Size Control 8516# B58# Coche Control LU5H# EADS# PCHK# Parity Generation and Control ~ 290436-1 Z -I m r- o < ...CCD ... ~' ~ "CI :D o n m c.l N 0'1 ~ :D tn INTEL OverDrive™ PROCESSORS 6.0 DIFFERENCES IN FUNCTIONALITY BETWEEN THE OverDrive™ PROCESSOR FAMILY AND THE Intel486™ SX AND Intel486™ DX PROCESSORS The Intel OverDrive processors are an enhanced family of Intel486 microprocessors. There are, however, four functional differences. First, the Intel OverDrive processors have an internal clock doubling (InteISX2, Inte1DX2) or clock tripling (InteIDX4) circuit which decreases the time required to execute instructions. Second, the Intel OverDrive processor family does not support the JTAG boundary scan test feature. Third, the Intel OverDrive processors have different processor revision identifications than the Intel486 SX or Intel486 DX processors. Finally, the IntelDX4 OverDrive processor contains a 16 KByte cache, as opposed to the 8 KByte cache on the IntelSX2 and IntelDX2 OverDrive processors. These four differences are described in the following sections, according to how they affect the processor functionality. 6.1 Hardware Interface The bus of the Intel OverDrive processors has been designed to be identical to the Intel486 Microprocessor bus. 
Although the external clock is internally doubled or tripled, and data and instructions are manipulated in the processor core at two or three times the external frequency, the external bus is functionally identical to that of the Intel486 processor.

The four boundary scan test signals (TCK, Test Clock; TMS, Test Mode Select; TDI, Test Data Input; TDO, Test Data Output), defined for some Intel486 processors, are not specified for the Intel486 DX2 OverDrive processor.

The UP# (Upgrade Present) signal, which is defined as an input for some Intel486 processors, is an output signal on the Intel OverDrive processor. The UP# pin on the Intel OverDrive processor provides a logical low output signal which can be used to enable logic to recognize and configure the system for the Intel OverDrive processor. This signal is identical to the MP# output defined for the Intel487 SX Math CoProcessor. Refer to Section 7 for examples of use of the UP# signal.

The DX register always contains the component identifier at the conclusion of RESET. The Intel OverDrive processor has a different revision identifier in the DL register than the Intel486 SX or Intel486 DX microprocessors (refer to Section 8.1). When the OverDrive processor is installed in a system, the component identifier is supplied by the OverDrive processor rather than by the original processor. The stepping identification portion of the component identification will change with different revisions of the OverDrive processor. The designer should assume only that the component identification for the OverDrive processor will be 045x for the IntelSX2 OverDrive processor, 043x for the IntelDX2 OverDrive processor and 148x for the IntelDX4 OverDrive processor, where "x" is the stepping identifier.

6.2 Testability

As detailed in Section 6.1, the Intel OverDrive processor does not support the JTAG boundary scan testability feature.
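The component identification rule in Section 6.1 (match on everything except the stepping digit) can be illustrated with a short Python sketch. The function and table names are hypothetical; only the three identifier patterns 045x, 043x and 148x come from the text, and a real BIOS would perform this check on the DX value saved at RESET rather than in Python.

```python
# (DH value, high nibble of DL) -> processor, per the 045x / 043x / 148x patterns.
# The low nibble of DL is the stepping "x" and is deliberately ignored in the match.
OVERDRIVE_IDS = {
    (0x04, 0x5): "IntelSX2 OverDrive",
    (0x04, 0x3): "IntelDX2 OverDrive",
    (0x14, 0x8): "IntelDX4 OverDrive",
}


def decode_reset_dx(dx):
    """Split a 16-bit DX reset value into (processor name or None, stepping)."""
    dh = (dx >> 8) & 0xFF      # component identifier
    dl = dx & 0xFF             # revision identifier
    family = dl >> 4           # fixed digit of the revision identifier
    stepping = dl & 0x0F       # varies between revisions of the part
    return OVERDRIVE_IDS.get((dh, family)), stepping


print(decode_reset_dx(0x0453))  # IntelSX2 OverDrive, stepping 3
print(decode_reset_dx(0x1480))  # IntelDX4 OverDrive, stepping 0
```

Values that match none of the three patterns return `None` for the name, which leaves room for the caller to treat the part as a non-OverDrive processor.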
6.3 Instruction Set Summary

The Intel OverDrive processor supports all Intel486 extensions to the 8086/80186/80286 instruction set. In general, instructions will run faster on the Intel OverDrive processors than on the Intel486 microprocessor. Specifically, an instruction that only uses memory from the on-chip cache executes at the full core clock rate, while all bus accesses execute at the bus clock rate. To calculate the elapsed time of an instruction, the number of clock counts for that instruction must be multiplied by the clock period for the system. The instruction set clock count summary tables from the Intel486 SX and Intel486 DX Microprocessor Data Sheets can be used for the OverDrive processor with the following modifications:

- Clock counts for a cache hit: This value represents the number of internal processor core clocks for an instruction that requires no external bus accesses, or the base core clocks for an instruction requiring external bus accesses.

- Penalty clock counts for a cache miss: This value represents the worst-case approximation of the additional number of external clock counts that are required for an instruction which must access the external bus for data (a cache miss). This number must be multiplied by 2 (for the IntelSX2 and IntelDX2 OverDrive processors) or 3 (for the IntelDX4 OverDrive processor) to convert it to an equivalent number of internal processor core clock counts, and added to the base core clocks to compute the number of core clocks for this instruction. The actual number of core clocks for an instruction with a cache miss may be less than the base clock counts (from the cache hit column) plus the penalty clock counts (2 times the cache miss column number for the IntelSX2 and IntelDX2, 3 times the cache miss column number for the IntelDX4).
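As a rough sketch of the conversion just described, the worst-case core clock count and the resulting execution time can be computed as follows. The function names are assumptions, and the clock counts in the example (1 base clock, 2 penalty clocks) are made-up inputs rather than values from any data sheet table.

```python
# Core-to-bus clock multiplier for each OverDrive processor, as given in the text.
CLOCK_MULTIPLIER = {"IntelSX2": 2, "IntelDX2": 2, "IntelDX4": 3}


def worst_case_core_clocks(hit_clocks, miss_penalty_bus_clocks, processor):
    """Base core clocks plus the cache miss penalty scaled into core clocks."""
    return hit_clocks + CLOCK_MULTIPLIER[processor] * miss_penalty_bus_clocks


def execution_time_ns(core_clocks, bus_mhz, processor):
    """Core clocks multiplied by the core clock period, in nanoseconds."""
    core_mhz = bus_mhz * CLOCK_MULTIPLIER[processor]
    return core_clocks * 1000.0 / core_mhz


# Hypothetical instruction: 1 clock on a cache hit, 2 penalty bus clocks on a miss,
# running on an IntelDX2 OverDrive processor in a 25 MHz system (50 MHz core).
clocks = worst_case_core_clocks(1, 2, "IntelDX2")
print(clocks, execution_time_ns(clocks, 25, "IntelDX2"))
```

The result is an upper bound only, since any part of the penalty that reflects internal data manipulation is over-counted by the scaling.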
The clock counts in the cache miss penalty column can be a cumulative value of external bus clocks (for data reads) and internal clocks for manipulating the data which has been loaded from the external bus. The clocks related to external bus accesses are correctly represented in terms of internal core clocks by multiplying by two (or by three for the IntelDX4 OverDrive processor). However, the clock counts related to internal data manipulation should not be multiplied. Therefore the total number of processor core clock counts for an instruction with a cache miss represents a worst-case approximation.

To calculate the execution time for an OverDrive processor instruction, multiply the total processor core clock counts by the core clock period. For example, in a 25 MHz system upgraded with a 50 MHz IntelDX2 OverDrive processor, the core clock period is 20 ns (1/50 MHz).

Additionally, the assumptions specified below should be understood in order to estimate instruction execution time.

A cache miss will force the OverDrive processor to run an external bus cycle. The Intel486 microprocessor 32-bit burst bus is defined as r-b-w, where:

r = The number of bus clocks in the first cycle of a burst read, or the number of clocks per data cycle in a non-burst read.
b = The number of bus clocks for the second and subsequent cycles in a burst read.
w = The number of bus clocks for a write.

The fastest bus the OverDrive processor can support is 2-1-2, assuming 0 wait states. The clock counts in the cache miss penalty column assume a 2-1-2 bus. For slower buses, add r - 2 clocks to the cache miss penalty for the first dword accessed. Other factors also affect instruction clock counts.

Instruction Clock Count Assumptions

1. The external bus is available for reads or writes at all times. Otherwise, add bus clocks to reads until the bus is available.
2. Accesses are aligned. Add three core clocks to each misaligned access.
3. Cache fills complete before subsequent accesses to the same line. If a read misses the cache during a cache fill due to a previous read or prefetch, the read must wait for the cache fill to complete. If a read or write accesses a cache line still being filled, it must wait for the fill to complete.
4. If an effective address is calculated, the base register is not the destination register of the preceding instruction. If the base register is the destination register of the preceding instruction, add 1 to the core clock counts shown. Back-to-back PUSH and POP instructions are not affected by this rule.
5. An effective address calculation uses one base register and does not use an index register. If the effective address calculation uses an index register, 1 core clock may be added to the clock count shown.
6. The target of a jump is in the cache. If not, add r clocks for accessing the destination instruction of a jump. If the destination instruction is not completely contained in the first dword read, add a maximum of 3b bus clocks. If the destination instruction is not completely contained in the first 16-byte burst, add a maximum of another r + 3b bus clocks.
7. No write buffer delay. w bus clocks are added only in the case in which all write buffers are full.
8. Displacement and immediate are not used together. If displacement and immediate are used together, 1 core clock may be added to the core clock count shown.
9. No invalidate cycles. Add a delay of 1 bus clock for each invalidate cycle if the invalidate cycle contends for the internal cache/external bus when the OverDrive processor needs to use it.
10. Page translation hits in the TLB. A TLB miss will add 13, 21 or 28 bus clocks + 1 possible core clock to the instruction, depending on whether the Accessed and/or Dirty bit in neither, one or both of the page entries needs to be set in memory. This assumes that neither page entry is in the data cache and a page fault does not occur on the address translation.
11. No exceptions are detected during instruction execution. Refer to the Interrupt Core Clock Counts Table for extra clocks if an interrupt is detected.
12. Instructions that read multiple consecutive data items (e.g., task switch, POPA) and miss the cache are assumed to start the first access on a 16-byte boundary. If not, an extra cache line fill may be necessary, which may add up to (r + 3b) bus clocks to the cache miss penalty.

7.0 INTEL OverDrive™ PROCESSOR CIRCUIT DESIGN

7.1 Upgrade Circuit for Intel486 Processor-Based Systems with UP#

Figure 7-1 shows the Intel OverDrive processor socket circuit for Intel486 processor-based systems using UP#. The Upgrade Present input, UP# pin, allows the Intel486 processor to directly recognize when the Intel OverDrive processor socket is populated. When the UP# pin is driven active to the Intel486 processor, the Intel486 processor tri-states all of its output pins and enters power-down mode.

[Figure 7-1. Intel OverDrive™ Socket Circuit Diagram for Systems Based on Intel486™ Processors That Have the UP# Input Pin (290436-6). Signals shown between the Intel486 processor and the Intel OverDrive processor socket: CLK, ADDR, DATA, CTRL, HLDA, FLUSH#, BOFF#, HOLD, IGNNE#, FERR#, UP#, VCC.]

8.0 BIOS AND SOFTWARE

The following should be considered when designing a system for upgrade with an Intel OverDrive processor.

8.1 Intel OverDrive™ Processor Detection

The component identifier and the stepping/revision identifier for the Intel OverDrive processors are readable in the DH and DL registers, respectively, immediately after RESET. The value loaded into each register is defined in Table 8-1. The "x" value defines the device stepping.

Table 8-1. CPU ID Values
Processor | DH Reg. | DL Reg.
Intel486 DX | 04h | 0xh, 1xh
Intel486 SX | 04h | 2xh
IntelSX2 OverDrive | 04h | 5xh
IntelDX2 OverDrive | 04h | 3xh
IntelDX4 OverDrive | 14h | 8xh

As it is difficult to differentiate between the Intel486 DX processor and some of the Intel OverDrive processors in software, it is recommended that the BIOS save the contents of the DX register immediately after RESET. This will allow the information to be used later, if required, to identify an Intel OverDrive processor in the system. Alternatively, for those OverDrive processors supporting it, the CPUID instruction can be used to identify the processor. Refer to the Intel486 Microprocessor Data Book for additional information on the CPUID instruction and its use.

Initialization routines for IntelSX2 OverDrive processor and Intel486 SX processor-based systems should check for the presence of a floating point unit and set the CR0 register accordingly (refer to the Intel486 SX Microprocessor Data Book for specific details). In addition, the BIOS should check for the presence of the 16 KByte cache in the IntelDX4 OverDrive processor.

8.2 Timing Dependent Loops

The Intel OverDrive processors execute instructions at two times (for the IntelSX2 and IntelDX2 OverDrive processors) or three times (for the IntelDX4 OverDrive processor) the frequency of the input clock. Thus, software (or instruction-based) timing loops will execute faster on the Intel OverDrive processor than on the Intel486 DX or Intel486 SX processor (at the same input clock frequency). Instructions such as NOP, LOOP, and JMP $+2 have been used by BIOS to implement timing loops that are required, for example, to enforce recovery time between consecutive accesses for I/O devices. These instruction-based timing loop implementations may require modification for systems intended to be upgradable with the Intel OverDrive processors.

In order to avoid any incompatibilities, it is recommended that timing requirements be implemented in hardware rather than in software.
This provides transparency and also does not require any change in BIOS or I/O device drivers in the future when moving to higher processor clock speeds.

As an example, a timing routine may be implemented as follows: The software performs a dummy I/O instruction to an unused I/O port. The hardware for the bus controller logic recognizes this I/O instruction and delays the termination of the I/O cycle to the processor by keeping RDY# or BRDY# deasserted for the appropriate amount of time.

9.0 ELECTRICAL DATA

The following sections describe recommended electrical connections for the Intel OverDrive processor, and its electrical specifications.

9.1 Power and Grounding

9.1.1 POWER CONNECTIONS

Power and ground connections must be made to all external VCC and GND pins of the Intel OverDrive processor. On the circuit board, all VCC pins must be connected on a VCC plane. All VSS pins must likewise be connected on a GND plane.

9.1.2 POWER DECOUPLING RECOMMENDATIONS

Liberal decoupling capacitance should be placed near the Intel OverDrive processor. The Intel OverDrive processor driving its 32-bit parallel address and data buses at high frequencies can cause transient power surges, particularly when driving large capacitive loads.

Low inductance capacitors and interconnects are recommended for best high frequency electrical performance. Inductance can be reduced by shortening circuit board traces between the Intel OverDrive processor and decoupling capacitors as much as possible. Capacitors specifically for PGA packages are also commercially available.

9.1.3 OTHER CONNECTION RECOMMENDATIONS

N.C. pins should always remain unconnected.

For reliable operation, always connect unused inputs to an appropriate signal level. Active LOW inputs should be connected to VCC through a pullup resistor. Pullups in the range of 20 KΩ are recommended. Active HIGH inputs should be connected to GND.
9.2 Maximum Ratings

Table 9-1 lists the absolute maximum ratings for each of the OverDrive processors. This table is a stress rating only, and functional operation at the maximums is not guaranteed. Functional operating conditions are given in Section 9.3, D.C. Specifications, and Section 9.4, A.C. Specifications.

Table 9-1. Absolute Maximum Ratings
Rating | IntelSX2™ OverDrive™ | IntelDX2™ OverDrive™ | IntelDX4™ OverDrive™
Case Temperature under Bias | -65°C to +110°C | -65°C to +110°C | -30°C to +110°C
Storage Temperature | -65°C to +150°C | -65°C to +150°C | -30°C to +125°C
Voltage on any Pin with Respect to Ground | -0.5V to VCC + 0.5V | -0.5V to VCC + 0.5V | -0.5V to VCC + 0.5V
Supply Voltage with Respect to VSS | -0.5V to +6.5V | -0.5V to +6.5V | -0.5V to +6.5V

9.3 D.C. Specifications

The D.C. specifications for each of the OverDrive processors are contained in the tables in Sections 9.3.1, 9.3.2 and 9.3.3. For additional information, refer to the appropriate Intel microprocessor handbook.

9.3.1 IntelSX2™ OverDrive™ PROCESSOR D.C. SPECIFICATIONS

Table 9-2 details the D.C. Specifications of the IntelSX2 OverDrive processor.

Table 9-2. D.C. Specifications for the IntelSX2™ OverDrive™ Processor
Functional operating range: VCC = 5V ± 5%; TSINK = 0°C to +85°C
Symbol | Parameter | Min | Typ | Max | Unit | Test Condition
VIL | Input Low Voltage | -0.3 | | +0.8 | V |
VIH | Input High Voltage | 2.0 | | VCC + 0.3 | V |
VOL | Output Low Voltage | | | 0.45 | V | (Note 1)
VOH | Output High Voltage | 2.4 | | | V | (Note 2)
ILI | Input Leakage Current | -15 | | 15 | µA | (Note 3)
IIH | Input Leakage Current (all pins except SRESET) | | | 200 | µA | (Note 4)
IIH | Input Leakage Current for SRESET | | | 300 | µA | (Note 4)
ICC | Power Supply Current, CLK = 25 MHz | | 700 | 950 | mA |
IIL | Input Leakage Current | | | -400 | µA | (Note 5)
ILO | Output Leakage Current | -15 | | 15 | µA |
CIN | Input Capacitance, PGA | | | 20 | pF | (Note 7)
COUT | Output or I/O Capacitance, PGA | | | 20 | pF | (Note 7)
CCLK | CLK Capacitance, PGA | | | 20 | pF | (Note 7)

NOTES:
1. This parameter is measured at: Address, Data, BEn: 4.0 mA; Definition, Control: 5.0 mA.
2. This parameter is measured at: Address, Data, BEn: -1.0 mA; Definition, Control: -0.9 mA.
3. This parameter is for inputs without pullups or pulldowns and 0V ≤ VIN ≤ VCC.
4. This parameter is for inputs with pulldowns and VIH = 2.4V.
5. This parameter is for inputs with pullups and VIL = 0.45V.
6. When the processor is in Stop Grant state, the ICCU of the host processor is less than 2 mA.
7. FC = 1 MHz; Not 100% tested.

9.3.2 IntelDX2™ OverDrive™ PROCESSOR D.C. SPECIFICATIONS

Table 9-3 details the D.C. Specifications of the IntelDX2 OverDrive processor.

Table 9-3. D.C. Specifications for the IntelDX2™ OverDrive™ Processor
Symbol | Parameter | Min | Max | Unit | Notes
VIL | Input Low Voltage | -0.3 | +0.8 | V |
VIH | Input High Voltage | 2.0 | | V |
VOL | Output Low Voltage | | | V | (Note 2)
VOH | Output High Voltage | | | V | (Note 3)
ICC | Power Supply Current, CLK = 33 MHz | | | mA | (Note 4)
ICC | Power Supply Current, CLK = 25 MHz | | | mA | (Note 4)
ILI | Input Leakage Current | | | µA | (Note 5)
IIH | Input Leakage Current | | 200 | µA | (Note 6)
IIL | Input Leakage Current | | -400 | µA | (Note 7)
ILO | Output Leakage Current | | ±15 | µA |
CIN | Input Capacitance | | 13 | pF | FC = 1 MHz (Note 8)
CO | I/O or Output Capacitance | | 17 | pF | FC = 1 MHz (Note 8)
CCLK | CLK Capacitance | | 15 | pF | FC = 1 MHz (Note 8)

NOTES:
1. The functional operating temperature range is: OverDrive processor-25 MHz, TSINK = 0°C to +95°C; OverDrive processor-33 MHz, TSINK = 0°C to +95°C.
2. This parameter is measured at: Address, Data, BEn: 4.0 mA; Definition, Control: 5.0 mA.
3. This parameter is measured at: Address, Data, BEn: -1.0 mA; Definition, Control: -0.9 mA.
4. Typical supply current: 775 mA @ CLK = 25 MHz; 975 mA @ CLK = 33 MHz.
5. This parameter is for inputs without internal pullups or pulldowns and 0 ≤ VIN ≤ VCC.
6. This parameter is for inputs with internal pulldowns and VIH = 2.4V.
7. This parameter is for inputs with internal pullups and VIL = 0.45V.
8. Not 100% tested.

9.3.3 IntelDX4™ OverDrive™ PROCESSOR D.C. SPECIFICATIONS

Table 9-4 details the D.C.
Specifications of the IntelDX4 OverDrive processor.

Table 9-4. D.C. Specifications for the IntelDX4™ OverDrive™ Processor
Functional operating range: VCC = 5V ± 5%; TSINK = 0°C to +95°C
Symbol | Parameter | Min | Max | Unit | Notes
VIL | Input Low Voltage | -0.3 | +0.8 | V |
VIH | Input High Voltage | 2.0 | VCC + 0.3 | V |
VOL | Output Low Voltage | | 0.45 | V | (Note 1)
VOH | Output High Voltage | 2.4 | | V | IOH = -2
ICC | Power Supply Current, CLK = 25/75 MHz | | 1200 | mA | (Note 2)
ICC | Power Supply Current, CLK = 33/100 MHz | | 1550 | mA | (Note 2)
ICC Stop Grant | Power Supply Current in Stop Grant State, CLK = 25/75 MHz | | 85 | mA | (Note 3)
ICC Stop Grant | Power Supply Current in Stop Grant State, CLK = 33/100 MHz | | 110 | mA | (Note 3)
ICC Stop Clock | Power Supply Current in Stop Clock State | | 20 | mA | (Note 4)
ILI | Input Leakage Current | | ±15 | µA | (Note 5)
IIH | Input Leakage Current | | 200 | µA | (Note 6)
IIL | Input Leakage Current | | -400 | µA | (Note 7)
ILO | Output Leakage Current | | ±15 | µA |
CIN | Input Capacitance | | 13 | pF | FC = 1 MHz (Note 8)
CO | I/O or Output Capacitance | | 17 | pF | FC = 1 MHz (Note 8)
CCLK | CLK Capacitance | | 15 | pF | FC = 1 MHz (Note 8)

NOTES:
1. This parameter is measured at: 4.0 mA: Address, Data, BEn; 5.0 mA: Definition, Control.
2. The maximum and typical values shown here are design estimates. Typical supply current: ICC = 835 mA @ CLK = 25 MHz; ICC = 1085 mA @ CLK = 33 MHz.
3. The ICC Stop Grant specification refers to the ICC value once the IntelDX4 OverDrive processor enters the Stop Grant or Halt Auto Powerdown State.
4. The ICC Stop Clock specification refers to the ICC value once the IntelDX4 OverDrive processor enters the Stop Clock State. VIH and VIL levels must be VCC and 0V, respectively, in order to meet the ICC Stop Clock specification.
5. This parameter is for inputs without pullups or pulldowns and 0 ≤ VIN ≤ VCC.
6. This parameter is for inputs with pulldowns and VIH = 2.4V.
7. This parameter is for inputs with pullups and VIL = 0.45V.
8. Not 100% tested.

9.4 A.C. Specifications

The A.C.
specifications for each of the OverDrive processors are contained in the tables in Sections 9.4.1, 9.4.2 and 9.4.3. These specifications consist of output delays, input setup requirements and input hold requirements. All A.C. specifications are relative to the rising edge of the CLK signal.

A.C. specification measurements are defined by Figures 9-1 through 9-6. All timings are referenced to 1.5V, unless otherwise specified. Inputs must be driven to the voltage levels indicated by Figure 9-3 when A.C. specifications are measured. Intel OverDrive processor output delays are specified with minimum and maximum limits, measured as shown. The minimum Intel OverDrive processor delay times are hold times provided to external circuitry. Intel OverDrive processor input setup and hold times are specified as minimums, defining the smallest acceptable sampling window. Within the sampling window, a synchronous signal must be stable for correct Intel OverDrive processor operation.

Table 9-5 defines the A.C. timing specifications for a 33 MHz system. Table 9-6 defines the A.C. timing specifications for a 25 MHz system. Table 9-7 defines the A.C. timing specifications for a 20 MHz system. Table 9-8 defines the A.C. timing specifications for a 16 MHz system.

Each Intel OverDrive processor meets the A.C. specifications for the processor it is upgrading. For example, a 100 MHz IntelDX4 OverDrive processor meets the system A.C. timing specifications for the 33 MHz processor it is upgrading. Refer to Sections 9.4.1 through 9.4.3 for any timing differences from those specified in the following tables. For additional information, refer to the appropriate Intel microprocessor handbook.

9.4.1 IntelSX2™ OverDrive™ PROCESSOR A.C. SPECIFICATIONS

The IntelSX2 OverDrive processor can be placed into an existing 16 MHz, 20 MHz or 25 MHz Intel486 system, doubling the internal processor speed to 32 MHz, 40 MHz or 50 MHz, respectively. Tables 9-6 through 9-8 contain the A.C.
timing specifications for the processors in those systems.

9.4.2 IntelDX2™ OverDrive™ PROCESSOR A.C. SPECIFICATIONS

The IntelDX2 OverDrive processor can be placed into an existing 16 MHz, 20 MHz, 25 MHz or 33 MHz Intel486 system, doubling the internal processor speed to 32 MHz, 40 MHz, 50 MHz or 66 MHz, respectively. Tables 9-5 through 9-8 contain the A.C. timing specifications for the processors in those systems.

9.4.3 IntelDX4™ OverDrive™ PROCESSOR A.C. SPECIFICATIONS

The IntelDX4 OverDrive processor can be placed into an existing 16 MHz, 20 MHz, 25 MHz or 33 MHz Intel486 system, tripling the internal processor speed to 48 MHz, 60 MHz, 75 MHz or 100 MHz, respectively. Tables 9-5 through 9-8 contain the A.C. timing specifications for the processors in those systems.

Table 9-5. 33 MHz Intel Processor Characteristics(1)
VCC = 5V ± 5%; TSINK = See Note 6; CL = 50 pF unless otherwise specified(3)
Symbol | Parameter | Min | Max | Unit | Figure | Notes
 | Frequency | 8 | 33 | MHz | | 1X Clock Driven to OverDrive Processor
t1 | CLK Period | 30 | 125 | ns | 5.1 |
t1a | CLK Period Stability | | 0.1% | Δ | | Adjacent Clocks
t2 | CLK High Time | 11 | | ns | 5.1 | at 2V
t3 | CLK Low Time | 11 | | ns | 5.1 | at 0.8V
t4 | CLK Fall Time | | 3 | ns | 5.1 | 2V to 0.8V
t5 | CLK Rise Time | | 3 | ns | 5.1 | 0.8V to 2V
t6 | A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#, ADS#, LOCK#, SMIACT#, FERR#, BREQ, HLDA Valid Delay | | | | |
t7 | A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#, ADS#, LOCK# Float Delay | | | | | (Note 2)
t8 | PCHK# Valid Delay | | | | 5.4 | (Note 4)
t8a | BLAST#, PLOCK# Valid Delay | | | | 5.5 | (Note 4)
t9 | BLAST#, PLOCK# Float Delay | | | | 5.6 | (Note 2)
t10 | D0-D31, DP0-3 Write Data Valid Delay | | | | 5.5 | (Note 4)
t11 | D0-D31, DP0-3 Write Data Float Delay | | | | 5.6 | (Note 2)
t12 | EADS# Setup Time | | | ns | 5.2 |
t13 | EADS# Hold Time | | | ns | 5.2 |
t14 | KEN#, BS16#, BS8# Setup Time | | | ns | 5.2 |
t15 | KEN#, BS16#, BS8# Hold Time | | | ns | 5.2 |
t16 | RDY#, BRDY# Setup Time | 5 | | ns | 5.3 |
t17 | RDY#, BRDY# Hold Time | 3 | | ns | 5.3 |
t18 | HOLD, AHOLD Setup Time | 6 | | ns | 5.2 |
t18a | BOFF# Setup Time | 7 | | ns | 5.2 |
t19 | HOLD, AHOLD, BOFF# Hold Time | 3 | | ns | 5.2 |
t20 | RESET, FLUSH#, A20M#, NMI, INTR, SMI#, STPCLK#, SRESET, IGNNE# Setup Time | 5 | | ns | 5.2 |
t21 | RESET, FLUSH#, A20M#, NMI, INTR, SMI#, STPCLK#, SRESET, IGNNE# Hold Time | 3 | | ns | 5.2 |
t22 | D0-D31, DP0-3, A4-A31 Read Setup Time | 5 | | ns | 5.2, 5.3 |
t23 | D0-D31, DP0-3, A4-A31 Read Hold Time | 3 | | ns | 5.2, 5.3 |

NOTES:
1. To be used for 66 MHz IntelDX2 and 100 MHz IntelDX4 OverDrive processors.
2. Not 100% tested. Guaranteed by design characterization.
3. All timing specifications assume CL = 50 pF.
4. The minimum Intel OverDrive processor output valid delays are hold times provided to external circuitry.
5. A reset pulse width of 15 CLK cycles is required for warm resets. Power-up resets require RESET to be asserted for at least 1 ms after VCC and CLK are stable.
6. TSINK temperatures are: IntelSX2 OverDrive processor: 0°C to +85°C; IntelDX2 OverDrive processor: 0°C to +95°C; IntelDX4 OverDrive processor: 0°C to +95°C.

Table 9-6. 25 MHz Intel Processor Characteristics(1)
VCC = 5V ± 5%; TSINK = See Note 6; CL = 50 pF unless otherwise specified(3)
Symbol | Parameter | Min | Max | Unit | Figure | Notes
 | Frequency | 8 | 25 | MHz | | 1X Clock Driven to OverDrive Processor
t1 | CLK Period | 40 | 125 | ns | 5.1 |
t1a | CLK Period Stability | | 0.1% | Δ | |
t2 | CLK High Time | 14 | | ns | 5.1 |
t3 | CLK Low Time | 14 | | ns | 5.1 |
t4 | CLK Fall Time | | 4 | ns | |
t5 | CLK Rise Time | | 4 | ns | |
t6 | A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#, ADS#, LOCK#, FERR#, BREQ, HLDA, SMIACT# Valid Delay | 3 | | | |
t7 | A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#, ADS#, LOCK# Float Delay | | | | | (Note 2)
t8 | PCHK# Valid Delay | | | | | (Note 4)
t8a | BLAST#, PLOCK# Valid Delay | | | | 5.5 | (Note 4)
t9 | BLAST#, PLOCK# Float Delay | | | | 5.6 | (Note 2)
t10 | D0-D31, DP0-3 Write Data Valid Delay | | | | 5.5 | (Note 4)
t11 | D0-D31, DP0-3 Write Data Float Delay | | | ns | 5.6 | (Note 2)
t12 | EADS# Setup Time | | | ns | 5.2 |
t13 | EADS# Hold Time | | | ns | 5.2 |
t14 | KEN#, BS16#, BS8# Setup Time | | | ns | 5.2 |
t15 | KEN#, BS16#, BS8# Hold Time | | | ns | 5.2 |
t16 | RDY#, BRDY# Setup Time | 8 | | ns | 5.3 |
t17 | RDY#, BRDY# Hold Time | 3 | | ns | 5.3 |
t18 | HOLD, AHOLD, BOFF# Setup Time | 8 | | ns | 5.2 |
t19 | HOLD, AHOLD, BOFF# Hold Time | 3 | | ns | 5.2 |
t20 | FLUSH#, A20M#, NMI, SMI#, RESET, STPCLK#, SRESET, INTR, IGNNE# Setup Time | 8 | | ns | 5.2 |
t21 | RESET, FLUSH#, A20M#, NMI, SMI#, STPCLK#, SRESET, INTR, IGNNE# Hold Time | 3 | | ns | 5.2 |
t22 | D0-D31, DP0-3, A4-A31 Read Setup Time | 5 | | ns | 5.2, 5.3 |
t23 | D0-D31, DP0-3, A4-A31 Read Hold Time | 3 | | ns | 5.2, 5.3 |

NOTES:
1. To be used for 50 MHz IntelSX2, 50 MHz or 60 MHz IntelDX2 and 75 MHz or 100 MHz IntelDX4 OverDrive processors.
2. Not 100% tested. Guaranteed by design characterization.
3. All timing specifications assume CL = 50 pF.
4. The minimum Intel OverDrive processor output valid delays are hold times provided to external circuitry.
5. A reset pulse width of 15 CLK cycles is required for warm resets. Power-up resets require RESET to be asserted for at least 1 ms after VCC and CLK are stable.
6. TSINK temperatures are: IntelSX2 OverDrive processor: 0°C to +85°C; IntelDX2 OverDrive processor: 0°C to +95°C; IntelDX4 OverDrive processor: 0°C to +95°C.

Table 9-7. 20 MHz Intel Processor Characteristics(1)
VCC = 5V ± 5%; TSINK = See Note 6; CL = 50 pF unless otherwise specified(3)
Symbol | Parameter | Min | Max | Unit | Figure | Notes
 | Frequency | 8 | 20 | MHz | | 1X Clock Driven to OverDrive Processor
t1 | CLK Period | 50 | 125 | ns | 5.1 |
t1a | CLK Period Stability | | 0.1% | Δ | | Adjacent Clocks
t2 | CLK High Time | 16 | | ns | 5.1 | at 2V
t3 | CLK Low Time | 16 | | ns | 5.1 | at 0.8V
t4 | CLK Fall Time | | 6 | ns | 5.1 | 2V to 0.8V
t5 | CLK Rise Time | | 6 | ns | 5.1 | 0.8V to 2V
t6 | A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#, ADS#, LOCK#, FERR#, BREQ, HLDA, SMIACT# Valid Delay | 3 | | | | (Note 4)
t7 | A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#, ADS#, LOCK# Float Delay | | | | | (Note 2)
t8 | PCHK# Valid Delay | | | | | (Note 4)
t8a | BLAST#, PLOCK# Valid Delay | | | | | (Note 4)
t9 | BLAST#, PLOCK# Float Delay | | | | 5.6 | (Note 2)
t10 | D0-D31, DP0-3 Write Data Valid Delay | | | ns | 5.5 | (Note 4)
t11 | D0-D31, DP0-3 Write Data Float Delay | | | | 5.6 | (Note 2)
t12 | EADS# Setup Time | | | ns | 5.2 |
t13 | EADS# Hold Time | | | ns | 5.2 |
t14 | KEN#, BS16#, BS8# Setup Time | | | ns | 5.2 |
t15 | KEN#, BS16#, BS8# Hold Time | | | ns | 5.2 |
t16 | RDY#, BRDY# Setup Time | 10 | | ns | 5.3 |
t17 | RDY#, BRDY# Hold Time | 3 | | ns | 5.3 |
t18 | HOLD, AHOLD, BOFF# Setup Time | 12 | | ns | 5.2 |
t19 | HOLD, AHOLD, BOFF# Hold Time | 3 | | ns | 5.2 |
t20 | FLUSH#, A20M#, NMI, SMI#, RESET, STPCLK#, SRESET, INTR, IGNNE# Setup Time | 12 | | ns | 5.2 | (Note 5)
t21 | RESET, FLUSH#, A20M#, NMI, SMI#, STPCLK#, SRESET, INTR, IGNNE# Hold Time | 3 | | ns | 5.2 | (Note 5)
t22 | D0-D31, DP0-3, A4-A31 Read Setup Time | 6 | | ns | 5.2, 5.3 |
t23 | D0-D31, DP0-3, A4-A31 Read Hold Time | 3 | | ns | 5.2, 5.3 |

NOTES:
1. To be used for 50 MHz IntelSX2, 50 MHz or 60 MHz IntelDX2 and 75 MHz or 100 MHz IntelDX4 OverDrive processors.
2. Not 100% tested. Guaranteed by design characterization.
3. All timing specifications assume CL = 50 pF.
4. The minimum Intel OverDrive processor output valid delays are hold times provided to external circuitry.
5. A reset pulse width of 15 CLK cycles is required for warm resets. Power-up resets require RESET to be asserted for at least 1 ms after VCC and CLK are stable.
6. TSINK temperatures are: IntelSX2 OverDrive processor: 0°C to +85°C; IntelDX2 OverDrive processor: 0°C to +95°C; IntelDX4 OverDrive processor: 0°C to +95°C.

Table 9-8. 16 MHz Intel Processor Characteristics(1)
VCC = 5V ± 5%; TSINK = See Note 6; CL = 50 pF unless otherwise specified
Symbol | Parameter | Min | Max | Unit | Figure | Notes
 | Frequency | 8 | 16 | MHz | | 1X Clock Driven to OverDrive Processor
t1 | CLK Period | 62.5 | 125 | ns | 5.1 |
t1a | CLK Period Stability | | 0.1% | Δ | | Adjacent Clocks
t2 | CLK High Time | 20 | | ns | 5.1 | at 2V
t3 | CLK Low Time | 20 | | ns | 5.1 | at 0.8V
t4 | CLK Fall Time | | 8 | ns | | 2V to 0.8V
t5 | CLK Rise Time | | 8 | ns | |
t6 | A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#, ADS#, LOCK#, FERR#, BREQ, HLDA, SMIACT# Valid Delay | 3 | | | | (Note 4)
t7 | A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#, ADS#, LOCK# Float Delay | | | | | (Note 2)
t8 | PCHK# Valid Delay | | | | | (Note 4)
t8a | BLAST#, PLOCK# Valid Delay | | | | | (Note 4)
t9 | BLAST#, PLOCK# Float Delay | | | | 5.6 | (Note 2)
t10 | D0-D31, DP0-3 Write Data Valid Delay | | | | 5.5 | (Note 4)
t11 | D0-D31, DP0-3 Write Data Float Delay | | | ns | 5.6 | (Note 2)
t12 | EADS# Setup Time | | | ns | 5.2 |
t13 | EADS# Hold Time | | | ns | 5.2 |
t14 | KEN#, BS16#, BS8# Setup Time | | | ns | 5.2 |
t15 | KEN#, BS16#, BS8# Hold Time | | | ns | 5.2 |
t16 | RDY#, BRDY# Setup Time | 12 | | ns | 5.3 |
t17 | RDY#, BRDY# Hold Time | 4 | | ns | 5.3 |
t18 | HOLD, AHOLD, BOFF# Setup Time | 12 | | ns | 5.2 |
t19 | HOLD, AHOLD, BOFF# Hold Time | 4 | | ns | 5.2 |
t20 | RESET, FLUSH#, A20M#, NMI, SMI#, STPCLK#, SRESET, INTR, IGNNE# Setup Time | 14 | | ns | 5.2 | (Note 5)
t21 | RESET, FLUSH#, A20M#, NMI, SMI#, STPCLK#, SRESET, INTR, IGNNE# Hold Time | 4 | | ns | 5.2 | (Note 5)
t22 | D0-D31, DP0-3, A4-A31 Read Setup Time | 10 | | ns | 5.2, 5.3 |
t23 | D0-D31, DP0-3, A4-A31 Read Hold Time | 4 | | ns | 5.2, 5.3 |

NOTES:
1. To be used for 50 MHz IntelSX2, 50 MHz or 60 MHz IntelDX2 and 75 MHz or 100 MHz IntelDX4 OverDrive processors.
2. Not 100% tested. Guaranteed by design characterization.
3. All timing specifications assume CL = 50 pF.
4. The minimum Intel OverDrive processor output valid delays are hold times provided to external circuitry.
5. A reset pulse width of 15 CLK cycles is required for warm resets. Power-up resets require RESET to be asserted for at least 1 ms after VCC and CLK are stable.
6. TSINK temperatures are: IntelSX2 OverDrive processor: 0°C to +85°C; IntelDX2 OverDrive processor: 0°C to +95°C; IntelDX4 OverDrive processor: 0°C to +95°C.

[Figure 9-1. CLK Waveforms (290436-7). Timings referenced to 1.5V.]

[Waveform drawing (290436-8). Signals shown: CLK; EADS#; BS8#, BS16#, KEN#; BOFF#, AHOLD, HOLD; RESET, FLUSH#, A20M#, IGNNE#, INTR, NMI; A4-A31 (READ).]
Figure 9-2.
Input Setup and Hold Timing Tx Tx Tx ClK [ RDY#, BRDY# [ ~~~~_ _~_ _..J~~- 00-031 [ DPO-DP3 ~~r==!:::=E~-~ 290436-9 Figure 9·3. Input Setup and Hold Timing 3-39 INTEL OverDrlve™ PROCESSORS CLK [ BROY#, ROY# [ 00-031 OPO-OP3 [ PCHK# [ 290436-10 Figure 9-4. PCHK # Valid Delay Timing Tx CLK [ A2-A31. PWT, PCO, BEO-3#, M/IO#, o/c#, W/R#, AOS#, LOCK#, FERR#, BREQ, HLOA [ 00-031, OPO-3, (WRITE) [ BLAST#, :LOCK# [ Tx Tx Tx 290436-11 Figure 9~5. Output Valid Delay Timing Tx Tx Tx Tx CLK [ A2-A31, PWT, PCO, BEO-3#, M/IO#, o/c#, W/R#, AOS#, [ LOCKI#. FERR#, BREQ, .HLOA 00-D31, DPO-3, . (WRITE) [ BLAST#, PLOCK# [ 290436-12 Figure 9-6. Maximum Float Delay Timing 3-40 INTEL OverDrive™ PROCESSORS 10.0 MECHANICAL DATA 10.1 The following sections describe the physical dimensions of the OverDrive processor packages and heat sinks. Figure 10-1 describes the physical dimensions of the PGA packages (16B-lead PGA and 169-lead PGA) used with the Intel8X2, IntelDX2 and IntelDX4 OverDrive processors. ¢1.65 @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ R~. @@@@@@@@@@@@@@@@@ *=@@@@@@@@@@@@@@@@@ 1 @@@ @@@ @@@ @@@ @@@ @@@ @@@ @@@ @@@ @@@ @@@ @@@ @@@ @@@ C3""",\ @ @ @ "./ @@@ @@@ @@@ o@@@@ @ @@(1) @@@ @@O@@@@@@@@@@@.Hj@@ @@@@@@@@@@@@@@@o@ L-@@@@@@@@@@@@@@@o@ " ili .t Ui ----- ~ - ( PIN Package Dimensions 1. Pin not present on 168 lead PGA package. REf. SWAGGED .5 CHAMfER (INDEX CORNER) - ,- Fc:=:ftt] SWAGGED PIN DETAIL I.... A.-I- ~ BASE PLANE-- PIN (4 PL) 0 ""1 I ¢B (ALL PINS) ----- D \ SEATING PLANE ~ L 290436-17 Family: Ceramic Pin Grid Array Package Millimeters Symbol Min Max A 3.56 4.57 A, 0.64 1.14 A2 2.8 3.5 A3 1.14 Inches Notes Min Max 0.140 0.180 SOLID LID 0.025 0.045 SOLID LID SOLID LID 0.110 0.140 SOLID LID 1.40 0.045 0.055 B 0.43 0.51 0.017 0.020 0 0, 44.07 44.83 1.735 1.765 40.51 40.77 1.595 1.605 2.29 2.79 0.090 0.110 2.54 3.30 0.100 0.130 e, L N S, ISSUE 168,169 1.52 IWS 168,169 2.54 REVX Notes 0.060 0.100 7/15/88 Figure 10-1. 
OverDrive™ Processor Package Dimensions

10.2 Heat Sink Dimensions

There are two different heat sinks used on the Intel OverDrive processors. The IntelSX2 and IntelDX2 OverDrive processors both use the 0.25" heat sink. The IntelDX4 OverDrive processor uses the 0.6" heat sink. Both heat sinks are described in the following sections.

10.2.1 0.25" HEAT SINK

Figure 10-2 describes the physical dimensions of the 0.25" heat sink used with the IntelSX2 and IntelDX2 OverDrive processors. Table 10-1 lists the physical dimensions.

Figure 10-2. Dimensions, IntelSX2™/IntelDX2™ OverDrive™ Processor with 0.25" Heat Sink

Table 10-1. 0.25" Heat Sink Dimensions

Dimension (inches)                                                Minimum   Maximum
A. Heat Sink Width                                                1.520     1.550
B. PGA Package Width                                              1.735     1.765
C. Heat Sink Edge Gap                                             0.065     0.155
D. Heat Sink Height                                               0.212     0.260
E. Adhesive Thickness                                             0.008     0.012
F. Package Height from Stand-Offs                                 0.140     0.180
G. Total Height from Package Stand-Offs to Top of Heat Sink       0.360     0.452

10.2.2 0.6" HEAT SINK

Figure 10-3 describes the physical dimensions of the 0.6" heat sink used with the IntelDX4 OverDrive processors. The maximum and minimum dimensions for the PGA package with heat sink are shown in Table 10-2. As the table shows, the maximum height of the IntelDX4 OverDrive processor from the pin stand-offs to the top of the heat sink, including the adhesive thickness, is 0.780 inches. A minimum clearance of 0.25" should be allowed above the top of the heat sink.

Figure 10-3. Dimensions, IntelDX4™ OverDrive™ Processor with 0.6" Heat Sink

Table 10-2. 0.6" Heat Sink Dimensions

Dimension (inches)                                                Minimum   Maximum
A. Heat Sink Width                                                1.520     1.550
B. PGA Package Width                                              1.735     1.765
C. Heat Sink Edge Gap                                             0.065     0.155
D. Heat Sink Height                                               0.580     0.600
E. Adhesive Thickness                                             0.006     0.012
F. Package Height from Stand-Offs                                 0.140     0.180
G. Total Height from Package Stand-Offs to Top of Heat Sink       0.720     0.780

The standard product markings and logo for the Intel OverDrive processor with the attached heat sink will be included on a 1 in² plate located on the top, center of the heat sink.

11.0 THERMAL MANAGEMENT

The heat generated by the Intel OverDrive processor requires that heat dissipation be managed carefully. All OverDrive processors are supplied with a heat sink attached with adhesive to the package. System designs must, therefore, provide sufficient clearance (a minimum of 0.25" above the heat sink) for the processor and the attached heat sink. The heat sink is omni-directional, allowing air to flow from any direction in order to achieve adequate cooling.

The thermal resistance values for the OverDrive processors with an attached heat sink are shown in Table 11-1 through Table 11-3. Section 10 contains the physical dimensions for each of the heat sinks and packages used.

Table 11-1. Thermal Resistance, IntelSX2™ OverDrive™ Processor with Attached Heat Sink

θJS = 1.5°C/W
Airflow (LFM)   0      200    400    600    800    1000
θJA (°C/W)      13.0   8.0    6.0    5.0    4.5    4.25

Table 11-2. Thermal Resistance, IntelDX2™ OverDrive™ Processor with Attached Heat Sink

θJS = 2.5°C/W
Airflow (LFM)   0      200    400    600    800
θJA (°C/W)      14.0   10.0   7.5    6.2    5.7

Table 11-3. Thermal Resistance, IntelDX4™ OverDrive™ Processor with Attached Heat Sink

θJS = 2.0°C/W
Airflow (LFM)   0      50     100    200
θJA (°C/W)      11.5   10.7   9.5    7.0

APPENDIX A
DESIGN CONSIDERATIONS

Intel has designed the family of OverDrive processors so that they can be installed by the end user. PC manufacturers can support this by implementing the design considerations listed in Table A-1.

Table A-1.
Design Considerations

Design Consideration: Visible OverDrive Processor Socket
Implementation: The Intel OverDrive processor socket should be easily visible when the PC's cover is removed. Label the Intel OverDrive processor socket and the location of pin 1 by silk screening this information on the PC board.

Design Consideration: Accessible OverDrive Processor Socket
Implementation: Make the Intel OverDrive processor socket easily accessible to the end user (i.e., do not place the Intel OverDrive processor socket under a disk drive). If a Low Insertion Force (LIF) or screw machine socket is used, position the Intel OverDrive processor socket on the PC board such that there is ample clearance around the socket.

Design Consideration: Foolproof Chip Orientation
Implementation: Intel packages all Intel OverDrive processors in a 169-pin PGA package. The 169th pin is called the "key" pin and ensures that the Intel OverDrive processor fits into a 169-pin socket in only the correct orientation. Supplying a 169-pin socket as the Intel OverDrive processor socket eliminates the possibility of end users or resellers damaging the PC board or Intel OverDrive processor by powering up the system with the Intel OverDrive processor incorrectly oriented.

Design Consideration: Zero Insertion Force Upgrade Socket
Implementation: The high pin count of the Intel OverDrive processor makes the insertion force required for installation in a screw machine PGA socket excessive for end users or resellers. Even Low Insertion Force (LIF) sockets often require more than 60 lbs. of insertion force. A Zero Insertion Force (ZIF) socket ensures that the chip insertion force does not damage the PC board. If the ZIF socket has a handle, be sure to allow enough clearance for the socket handle. If a LIF or screw machine socket is used, additional PC board support is recommended.

Design Consideration: "Plug and Play"
Implementation: Jumper or switch changes should not be needed to electrically configure the system for the Intel OverDrive processor.
Design Consideration: Thorough Documentation
Implementation: Describe the Intel OverDrive processor's installation procedure in the PC's User's Manual.

APPENDIX B
ZIF AND LIF SOCKET VENDORS

The following list provides examples of sockets which can be used for Intel486 SX and Intel486 DX CPU-based systems.

NOTE: This is not a comprehensive list. Intel has not tested all of the socket vendors' sockets listed below and cannot guarantee that these sockets will meet every PC manufacturer's specific requirements.

Zero Insertion Force (ZIF) and Low Insertion Force (LIF) Socket Vendors:

1. AMP Inc.
   219 American Avenue
   Greensboro, NC 27409-1803
   Tel: (800) 522-6752
2. Augat Inc.
   425 John Dietsch Blvd.
   Attleboro Falls, MA 02763
   Tel: (800) 999-7646
3. Aries Electronics
   P.O. Box 130
   Frenchtown, NJ 08825
   Tel: (908) 996-3891
4. JAE
   142 Technology Drive
   Irvine, CA 92718-2401
   Tel: (714) 753-2600
5. Thomas and Betts
   200 Executive Center Drive
   P.O. Box 24901
   Greenville, SC 29616-2401
   Tel: (803) 676-2900
6. Yamaichi Electronics
   1420 Koll Circle, Suite B
   San Jose, CA 95112
   Tel: (800) 769-0797
7. Loranger International Corp.
   817 Fourth Avenue
   Warren, PA 16365
   Tel: (814) 723-2250
8. Foxconn International Inc.
   930 West Maude Avenue
   Sunnyvale, CA 94086
   Tel: (800) 727-3699

4 Peripheral Components

ADVANCE INFORMATION

82091AA ADVANCED INTEGRATED PERIPHERAL (AIP)

• Single-Chip PC Compatible I/O Solution for Notebook and Desktop Platforms:
- 82078 Floppy Disk Controller Core
- Two 16550 Compatible UARTs
- One Multi-Function Parallel Port
- IDE Interface
- Integrated Back Power Protection
- Integrated Game Port Chip Select
- 5V or 3.3V Supply Operation with 5V Tolerant Drive Interface
- Full Power Management Support
- Supports Type F DMA Transfers for Faster I/O Performance
- No Wait-State Host I/O Interface
- Programmable Interrupt Interfaces
- Single Crystal/Oscillator Clock (24 MHz)
- Software Detectable Device ID
- Comprehensive Powerup Configuration
• The 82091AA is 100 Percent Compatible with EISA, ISA and AT
• Host Interface Features
- 8-Bit Zero Wait-State ISA Bus Interface
- DMA with Type F Transfers
- Five Programmable ISA Interrupt Lines
- Internal Address Decoder
• Parallel Port Features
- All IEEE Standard 1284 Protocols Supported (Compatibility, Nibble, Byte, EPP, and ECP)
- Peak Bi-Directional Transfer Rate of 2 MB/sec
- Provides Interface for Low-Cost Engineless Laser Printer
- 16-Byte FIFO for ECP
- Interface Backpower Protection
• Floppy Disk Controller Features
- 100 Percent Software Compatible with Industry Standard 82077SL and 82078
- Integrated Analog Data Separator 250K, 300K, 500K, and 1 MBits/sec
- Programmable Powerdown Command
- Auto Powerdown and Wakeup Modes
- Integrated Tape Drive Support
- Perpendicular Recording Support for 4 MB Drives
- Programmable Write PreCompensation Delays
- 256 Track Direct Address, Unlimited Track Support
- 16-Byte FIFO
- Supports 2 or 4 Drives
• 16550 Compatible UART Features
- Two Independent Serial Ports
- Software Compatible with 8250 and 16450 UARTs
- 16-Byte FIFO per Serial Port
- Two UART Clock Sources, Supports MIDI Baud Rate
• IDE Interface Features
- Generates Chip Selects for IDE Drives
- Integrated Buffer Control Logic
- Dual IDE Interface Support
• Power Management Features
- Transparent to Operating Systems and Applications Programs
- Independent Power Control for Each Integrated Device
• 100-Pin QFP Package (See Packaging Spec. 240800)

The 82091AA Advanced Integrated Peripheral (AIP) is an integrated I/O solution containing a floppy disk controller, 2 serial ports, a multi-function parallel port, an IDE interface, and a game port on a single chip. The integration of these I/O devices results in a minimization of form factor, cost and power consumption. The 82091AA floppy disk controller is the 82078 core. The serial ports are 16550 compatible. The parallel port supports all of the IEEE Standard 1284 protocols (ECP, EPP, Byte, Compatibility, and Nibble). The IDE interface supports 8- or 16-bit programmed I/O and 16-bit DMA. The Host Interface is an 8-bit ISA interface optimized for type "F" DMA and no wait-state I/O accesses. For improved throughput and performance, the 82091AA contains six 16-byte FIFOs: two for each serial port, one for the parallel port, and one for the floppy disk controller. The 82091AA also includes power management and 3.3V capability for power sensitive applications such as notebooks. The 82091AA supports both motherboard and add-in card configurations.

November 1994
Order Number: 290486-003

Figure 1. 82091AA Advanced Integrated Peripheral Block Diagram

CONTENTS

1.0 OVERVIEW
  1.1 3.3/5V Operating Modes
2.0 SIGNAL DESCRIPTION
  2.1 Host Interface Signals
  2.2 Floppy Disk Controller Interface
  2.3 Serial Port Interface
  2.4 IDE Interface
  2.5 Parallel Port External Buffer Control/Game Port
  2.6 Parallel Port Interface
    2.6.1 COMPATIBILITY PROTOCOL SIGNAL DESCRIPTION
    2.6.2 NIBBLE PROTOCOL SIGNAL DESCRIPTION
    2.6.3 BYTE MODE SIGNAL DESCRIPTION
    2.6.4 ENHANCED PARALLEL PORT (EPP) PROTOCOL SIGNAL DESCRIPTION
    2.6.5 EXTENDED CAPABILITIES PORT (ECP) PROTOCOL SIGNAL DESCRIPTION
  2.7 Hard Reset Signal Conditions
  2.8 Power And Ground
3.0 I/O ADDRESS ASSIGNMENTS
4.0 AIP CONFIGURATION
  4.1 Configuration Registers
    4.1.1 CFGINDX, CFGTRGT-CONFIGURATION INDEX REGISTER AND TARGET PORT
    4.1.2 AIPID-AIP IDENTIFICATION REGISTER
    4.1.3 AIPREV-AIP REVISION IDENTIFICATION
    4.1.4 AIPCFG1-AIP CONFIGURATION 1 REGISTER
    4.1.5 AIPCFG2-AIP CONFIGURATION 2 REGISTER
    4.1.6 FCFG1-FDC CONFIGURATION REGISTER
    4.1.7 FCFG2-FDC POWER MANAGEMENT AND STATUS REGISTER
    4.1.8 PCFG1-PARALLEL PORT CONFIGURATION REGISTER
    4.1.9 PCFG2-PARALLEL PORT POWER MANAGEMENT AND STATUS REGISTER
    4.1.10 SACFG1-SERIAL PORT A CONFIGURATION REGISTER
    4.1.11 SACFG2-SERIAL PORT A POWER MANAGEMENT AND STATUS REGISTER
    4.1.12 SBCFG1-SERIAL PORT B CONFIGURATION REGISTER
    4.1.13 SBCFG2-SERIAL PORT B POWER MANAGEMENT AND STATUS REGISTER
    4.1.14 IDECFG-IDE CONFIGURATION REGISTER
  4.2 Hardware Configuration
    4.2.1 SELECTING THE HARDWARE CONFIGURATION MODE
    4.2.2 SELECTING HARDWARE CONFIGURATION MODE OPTIONS
    4.2.3 HARDWARE CONFIGURATION TIMING RELATIONSHIPS
    4.2.4 HARDWARE BASIC CONFIGURATION
    4.2.5 HARDWARE EXTENDED CONFIGURATION MODE
    4.2.6 SOFTWARE ADD-IN CONFIGURATION
    4.2.7 SOFTWARE MOTHERBOARD CONFIGURATION
5.0 HOST INTERFACE
6.0 PARALLEL PORT
  6.1 Parallel Port Registers
    6.1.1 ISA-COMPATIBLE AND PS/2-COMPATIBLE MODES
      6.1.1.1 PDATA-Parallel Port Data Register (ISA-Compatible and PS/2-Compatible Modes)
      6.1.1.2 PSTAT-Status Register (ISA-Compatible and PS/2-Compatible Modes)
      6.1.1.3 PCON-Control Register (ISA-Compatible and PS/2-Compatible Mode)
    6.1.2 EPP MODE
      6.1.2.1 PDATA-Parallel Port Data Register (EPP Mode)
      6.1.2.2 PSTAT-Status Register (EPP Mode)
      6.1.2.3 PCON-Control Register (EPP Mode)
      6.1.2.4 ADDSTR-EPP Auto Address Strobe Register (EPP Mode)
      6.1.2.5 DATASTR-Auto Data Strobe Register (EPP Mode)
    6.1.3 ECP MODE
      6.1.3.1 ECPAFIFO-ECP Address/RLE FIFO Register (ECP Mode)
      6.1.3.2 PSTAT-Status Register (ECP Mode)
      6.1.3.3 PCON-Control Register (ECP Mode)
      6.1.3.4 SDFIFO-Standard Parallel Port Data FIFO
      6.1.3.5 DFIFO-Data FIFO (ECP Mode)
      6.1.3.6 TFIFO-ECP Test FIFO Register (ECP Mode)
      6.1.3.7 ECPCFGA-ECP Configuration A Register (ECP Mode)
      6.1.3.8 ECPCFGB-ECP Configuration B Register (ECP Mode)
      6.1.3.9 ECR-ECP Extended Control Register (ECP Mode)
  6.2 Parallel Port Operations
    6.2.1 ISA-COMPATIBLE AND PS/2-COMPATIBLE MODES
    6.2.2 EPP MODE
    6.2.3 ECP MODE
      6.2.3.1 FIFO Operations
      6.2.3.2 DMA Transfers
      6.2.3.3 Reset FIFO and DMA Terminal Count Interrupt
      6.2.3.4 Programmed I/O Transfers
      6.2.3.5 Data Compression
    6.2.4 PARALLEL PORT EXTERNAL BUFFER CONTROL
    6.2.5 PARALLEL PORT SUMMARY
7.0 SERIAL PORT
  7.1 Register Description
    7.1.1 THR(A,B)-TRANSMITTER HOLDING REGISTER
    7.1.2 RBR(A,B)-RECEIVER BUFFER REGISTER
    7.1.3 DLL(A,B), DLM(A,B)-DIVISOR LATCHES (LSB AND MSB) REGISTERS
    7.1.4 IER(A,B)-INTERRUPT ENABLE REGISTER
    7.1.5 IIR(A,B)-INTERRUPT IDENTIFICATION REGISTER
    7.1.6 FCR(A,B)-FIFO CONTROL REGISTER
    7.1.7 LCR(A,B)-LINE CONTROL REGISTER
    7.1.8 MCR(A,B)-MODEM CONTROL REGISTER
    7.1.9 LSR(A,B)-LINE STATUS REGISTER
    7.1.10 MSR(A,B)-MODEM STATUS REGISTER
    7.1.11 SCR(A,B)-SCRATCHPAD REGISTER
  7.2 FIFO Operations
    7.2.1 FIFO INTERRUPT MODE OPERATION
    7.2.2 FIFO POLLED MODE OPERATION
8.0 FLOPPY DISK CONTROLLER
  8.1 Floppy Disk Controller Registers
    8.1.1 SRB-STATUS REGISTER B (EREG EN = 1)
    8.1.2 DOR-DIGITAL OUTPUT REGISTER
    8.1.3 TDR-ENHANCED TAPE DRIVE REGISTER
    8.1.4 MSR-MAIN STATUS REGISTER
    8.1.5 DSR-DATA RATE SELECT REGISTER
    8.1.6 FDCFIFO-FDC FIFO (DATA)
    8.1.7 DIR-DIGITAL INPUT REGISTER
    8.1.8 CCR-CONFIGURATION CONTROL REGISTER
  8.2 Reset
    8.2.1 HARD RESET AND CONFIGURATION REGISTER RESET
    8.2.2 DOR RESET vs DSR RESET
  8.3 DMA Transfers
  8.4 Controller Phases
    8.4.1 COMMAND PHASE
    8.4.2 EXECUTION PHASE
      8.4.2.1 Non-DMA Mode Transfers from the FIFO to the Host
      8.4.2.2 Non-DMA Mode Transfers from the Host to the FIFO
      8.4.2.3 DMA Mode Transfers from the FIFO to the Host
      8.4.2.4 DMA Mode Transfers from the Host to the FIFO
    8.4.3 DATA TRANSFER TERMINATION
  8.5 Command Set/Descriptions
    8.5.1 STATUS REGISTER ENCODING
      8.5.1.1 Status Register 0
      8.5.1.2 Status Register 1
      8.5.1.3 Status Register 2
      8.5.1.4 Status Register 3
    8.5.2 DATA TRANSFER COMMANDS
      8.5.2.1 Read Data
      8.5.2.2 Read Deleted Data
      8.5.2.3 Read Track
      8.5.2.4 Write Data
      8.5.2.5 Verify
      8.5.2.6 Format Track
      8.5.2.7 Format Field
    8.5.3 CONTROL COMMANDS
      8.5.3.1 READ ID Command
      8.5.3.2 RECALIBRATE Command
      8.5.3.3 DRIVE SPECIFICATION Command
      8.5.3.4 SEEK Command
      8.5.3.5 SENSE INTERRUPT STATUS Command
      8.5.3.6 SENSE DRIVE STATUS Command
      8.5.3.7 SPECIFY Command
      8.5.3.8 CONFIGURE Command
      8.5.3.9 VERSION Command
      8.5.3.10 RELATIVE SEEK Command
      8.5.3.11 DUMPREG Command
      8.5.3.12 PERPENDICULAR MODE Command
      8.5.3.13 POWERDOWN MODE Command
      8.5.3.14 PART ID Command
      8.5.3.15 OPTION Command
      8.5.3.16 SAVE Command
      8.5.3.17 RESTORE Command
      8.5.3.18 FORMAT AND WRITE Command
9.0 IDE INTERFACE
  9.1 IDE Registers
  9.2 IDE Interface Operation
10.0 POWER MANAGEMENT
  10.1 Power Management Registers
  10.2 Clock Power Management
  10.3 FDC Power Management
  10.4 Serial Port Power Management
  10.5 Parallel Port Power Management
11.0 ELECTRICAL CHARACTERISTICS
  11.1 Absolute Maximum Ratings
  11.2 DC Characteristics
  11.3 Oscillator
  11.4 AC Characteristics
    11.4.1 CLOCK TIMINGS
    11.4.2 HOST TIMINGS
    11.4.3 FDC TIMINGS
    11.4.4 PARALLEL PORT TIMINGS
    11.4.5 IDE TIMINGS
    11.4.6 GAME PORT TIMINGS
    11.4.7 SERIAL PORT TIMINGS
12.0 PINOUT AND PACKAGE INFORMATION
  12.1 Pin Assignment
  12.2 Package Characteristics
13.0 DATA SEPARATOR CHARACTERISTICS FOR FLOPPY DISK MODE
  13.1 Write Data Timing
  13.2 Drive Control
  13.3 Internal PLL
APPENDIX A-FDC FOUR DRIVE SUPPORT
  A.1 Floppy Disk Controller Interface Signals
  A.2 DOR-Digital Output Register
  A.3 TDR-Enhanced Tape Drive Register
  A.4 MSR-Main Status Register

1.0. OVERVIEW

The major functions of the 82091AA are shown in Figure 1. A brief description of each of these functions is presented in this section.

Host Interface

The 82091AA host interface is an 8-bit direct-drive (24 mA) ISA Bus/X-Bus interface that permits the CPU to access its registers through read/write operations in I/O space. These registers may be accessed by programmed I/O and/or DMA bus cycles. With the exception of the IDE Interface, all functions on the 82091AA require only 8-bit data accesses. The 16-bit access required for the IDE Interface is supported through the appropriate chip selects and data buffer enables from the 82091AA.

Figure 2 shows an example system implementation with the 82091AA located on an ISA Bus add-in card. This add-in card could also be used in a PCI-based system as shown in Figure 3. For motherboard implementations, the 82091AA can be located on the X-Bus as shown in Figure 4.

Figure 2. Block Diagram of the 82091AA on the ISA Bus

Figure 3. Block Diagram of the 82091AA in a PCI System

Figure 4. Block Diagram of the 82091AA on the X-Bus

Floppy Disk Controller

The 82091AA's enhanced floppy disk controller (FDC) incorporates several new features allowing for easy implementation in both the portable and desktop markets. It provides a low cost, small form factor solution targeted for 5.0V and 3.3V platforms. The FDC supports up to four drives. The 82091AA's FDC implements these new features while remaining functionally compatible with 82078/82077SL/82077AA/8272A floppy disk controllers.
Together with a 24-MHz crystal, a resistor package, and a device chip select, these devices allow for the most integrated solution available. The integrated analog PLL data separator has better performance than most board level discrete PLL implementations and can be operated at 1 Mbps/500 Kbps/300 Kbps/250 Kbps. A 16-byte FIFO substantially improves system performance and is ideal for multimaster systems (e.g., EISA).

Serial Ports

The 82091AA contains two independent serial ports that provide asynchronous communications that are equivalent to two 16550 UARTs. The serial ports have identical circuitry and provide the serial communication interface to a peripheral device or modem via Serial Port A and Serial Port B. Each serial port can be configured for one of eight address assignments. The standard PC/AT compatible logical address assignments for COM1, COM2, COM3, and COM4 are supported.

The serial ports perform serial-to-parallel conversion on data characters received from a peripheral device or modem, and parallel-to-serial conversion on data characters received from the host. The serial ports can operate in either FIFO mode or non-FIFO mode. In FIFO mode, a 16-byte transmit FIFO holds data from the host to be transmitted on the serial link and a 16-byte receive FIFO buffers data from the serial link until read by the host.

The serial ports contain programmable baud rate generators that divide the internal reference clock by divisors of 1 to (2^16 - 1), and produce a 16x clock for driving the transmitter and receiver logic. The internal reference clock can be programmed to support MIDI. The serial ports have complete modem-control capability and a prioritized interrupt system.

Parallel Port

The 82091AA provides a multi-function parallel port that transfers information between the host and peripheral device (e.g., printer). The parallel port interface contains nine control/status lines and an 8-bit data bus.
The standard PC/AT compatible logical address assignments for LPT1, LPT2, and LPT3 are supported. The parallel port can be configured for one of four modes and supports the following IEEE Standard 1284 parallel interface protocol standards:

Parallel Port Mode        Parallel Interface Protocol
ISA-Compatible Mode       Compatibility, Nibble
PS/2-Compatible Mode      Byte
EPP Mode                  EPP
ECP Mode                  ECP

For ISA-Compatible and PS/2-Compatible modes, software controls the handshake signals on the parallel port interface to transfer data between the host and peripheral device. Status and Control registers permit software to monitor the state of the peripheral device and generate handshake sequences.

The EPP parallel port interface protocol increases throughput by specifying an automatic handshake sequence. In EPP mode, the 82091AA parallel port automatically generates this handshake sequence in hardware to transfer data between the host and peripheral device.

In addition to a hardware handshake on the parallel port interface, the ECP protocol specification also defines DMA and FIFO capability. To minimize processor overhead for data transfers to/from a peripheral device, the 82091AA parallel port, in ECP mode, provides a 16-byte FIFO with DMA capability.

IDE Interface

The 82091AA supports the IDE (Integrated Drive Electronics) interface by providing chip selects and lower data byte control. Two chip selects are used to access registers on the IDE device. Separate lower and upper byte data control signals are provided. With these control signals, minimal external logic is needed to implement 16-bit IDE I/O and DMA interfaces.

Game Port

The 82091AA provides a game port chip select signal for use when the 82091AA is in an add-in card application. This function is assigned to I/O address location 201h. Note that when the 82091AA is located on the motherboard, this feature is not available.
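
The Serial Ports description in this overview states that the baud rate generator divides an internal reference clock by a divisor of 1 to (2^16 - 1) and produces a 16x clock. As a rough illustration of that arithmetic only, the Python sketch below computes a divisor for a target baud rate. The 1.8432 MHz reference clock is an assumption (the conventional PC UART value; this overview does not state the 82091AA's internal reference frequency), and the function and constant names are hypothetical.

```python
# Assumption: conventional PC UART reference clock. The overview above does
# not give the 82091AA's internal reference frequency.
ASSUMED_REF_CLOCK_HZ = 1_843_200

MIN_DIVISOR = 1           # lowest divisor named in the overview
MAX_DIVISOR = 2**16 - 1   # highest divisor named in the overview


def divisor_for_baud(baud, ref_clock_hz=ASSUMED_REF_CLOCK_HZ):
    """Return the divisor whose 16x clock is closest to the target baud rate."""
    if baud <= 0:
        raise ValueError("baud rate must be positive")
    # The generator output is a 16x clock, so baud = ref_clock / (16 * divisor).
    divisor = round(ref_clock_hz / (16 * baud))
    if not MIN_DIVISOR <= divisor <= MAX_DIVISOR:
        raise ValueError("baud rate is outside the divisor range")
    return divisor


def actual_baud(divisor, ref_clock_hz=ASSUMED_REF_CLOCK_HZ):
    """Return the baud rate produced by a given divisor."""
    if not MIN_DIVISOR <= divisor <= MAX_DIVISOR:
        raise ValueError("divisor must be in the range 1 to 2**16 - 1")
    return ref_clock_hz / (16 * divisor)


if __name__ == "__main__":
    for target in (9600, 19200, 115200):
        d = divisor_for_baud(target)
        print(target, d, actual_baud(d))
```

With a different reference clock (for example, one selected for MIDI support), the same relationship would apply with a different value passed as ref_clock_hz.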
Power Management

82091AA power management provides a mechanism for saving power when the device or a portion of the device is not being used. By programming the appropriate 82091AA registers, software can invoke power management to the entire 82091AA or selected modules within the 82091AA (e.g., floppy disk controller, serial port, or parallel port). There are two methods for applying power management: direct powerdown or auto powerdown. Direct powerdown turns off the clock to a particular module immediately, placing that module into a powerdown state. This method removes the clock regardless of the activity or status of the module. When auto powerdown is invoked, the module enters a powerdown state (clock is turned off) after certain conditions are met and the module is in an idle state.

1.1. 3.3V/5V Operating Modes

The 82091AA can operate at a power supply of 3.3V, 5V, or a mix of 3.3V and 5V. The mixed power supply mode provides 5V interfaces for the floppy disk controller and parallel port while all other 82091AA interfaces and internal logic (including the floppy disk controller and parallel port internal circuitry) operate at 3.3V. The mixed mode permits 5V floppy disk drives and parallel port peripherals to be used in a 3.3V system without external buffering.

NOTE: 3.3V operation is available only in the 82091AA.

2.0. SIGNAL DESCRIPTION

This section describes the 82091AA signals. The interface signals are shown in Figure 5 and described in the following tables. Signal descriptions are organized by functional group. Note that the "#" symbol at the end of a signal name indicates the active, or asserted, state occurs when the signal is at a low voltage level. When "#" is not present after the signal name, the signal is asserted when at the high voltage level. The terms assertion and negation are used extensively. This is done to avoid confusion when working with a mixture of "active-low" and "active-high" signals.
The term assert, or assertion, indicates that a signal is active, independent of whether that level is represented by a high or low voltage. The term negate, or negation, indicates that a signal is inactive.

The following notations are used to describe pin types:

I    Input Pin
O    Output Pin
I/O  Bi-Directional Pin

Figure 5. 82091AA Signals (diagram of the signal groups, including Host, Floppy Disk, Game Port Chip Select, and IDE)

2.1 Host Interface Signals

Signal Name  Type  Description

ISA SIGNALS

SA[10:0]  I  SYSTEM ADDRESS BUS: The 82091AA decodes the standard ISA I/O address space using SA[9:0]. SA10 is used along with SA[9:0] to decode the extended register set of the ECP parallel port. SA[10:0] connects directly to the ISA system address bus.
SD[7:0]  I/O  SYSTEM DATA BUS: SD[7:0] is a bi-directional data bus. Data is written to and read from the 82091AA on these signal lines. SD[7:0] connect directly to the ISA system data bus.
IORC#  I  I/O READ COMMAND STROBE: IORC# is an I/O access read control signal. When a valid internal address is decoded by the 82091AA and IORC# is asserted, data at the decoded address location is driven onto the SD[7:0] signal lines.
IOWC#  I  I/O WRITE COMMAND STROBE: IOWC# is an I/O access write control signal. When a valid internal address is decoded by the 82091AA and IOWC# is asserted, data on the SD[7:0] signal lines is written into the decoded address location at the rising edge of IOWC#.
NOWS#  O  NO WAIT-STATES: End data transfer signal. The 82091AA asserts NOWS# when a valid internal address is decoded by the 82091AA and the IORC# or IOWC# signal is asserted.
This reduces the total bus cycle time by eliminating the wait-states associated with the default 8-bit I/O cycles. NOWS# is not asserted for IDE accesses or DMA accesses. This is an open drain output pin.
IOCHRDY  O  I/O CHANNEL READY: The 82091AA uses this signal for parallel port data transfers when the parallel port is in EPP mode. In this case, the 82091AA negates IOCHRDY to extend the cycle to allow for completion of transfers to/from the peripheral attached to the parallel port. When the parallel port is in EPP mode, the 82091AA negates IOCHRDY to lengthen the ISA Bus cycle if the parallel port BUSY signal is asserted. The 82091AA also uses IOCHRDY during hardware configuration time (see Section 4.0, AIP Configuration). If IOWC#/IORC# is asserted to the 82091AA during hardware configuration time, the 82091AA negates IOCHRDY until hardware configuration time is completed. This is an open drain output pin.
AEN  I  ADDRESS ENABLE: AEN is used during DMA cycles to prevent the 82091AA from misinterpreting DMA cycles as valid I/O cycles. When negated, AEN indicates that the 82091AA may respond to address and I/O commands addressed to the 82091AA. When asserted, AEN informs the 82091AA that a DMA transfer is occurring. When AEN is asserted and an xDACK# signal is asserted, the 82091AA responds to the cycle as a DMA cycle.
RSTDRV  I  RESET DRIVE: RSTDRV forces the 82091AA to a known state. All 82091AA registers are set to their default state.
X1/OSC  I  CRYSTAL 1/OSCILLATOR: The main clock input signal can be a 24 MHz crystal connected across X1 and X2 or a 24 MHz TTL level clock input connected to X1.
X2  I  CRYSTAL 2: This signal pin is connected to one side of the crystal when a crystal oscillator is used to provide the main clock. If an external oscillator/clock is connected to X1, this pin is not used and is left unconnected.
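The "#" naming convention from Section 2.0 and the AEN/xDACK# rule in the table above can be modeled in a few lines. This is a minimal sketch under stated assumptions: the function names and the returned labels are hypothetical, and treating a cycle with AEN asserted but no xDACK# asserted as "ignored" is an inference from the AEN description, not a statement from the datasheet.

```python
# Sketch of the assert/negate convention and the AEN-based cycle decode.
# All names here are illustrative; none come from the 82091AA datasheet.


def is_asserted(signal_name, level_is_high):
    """Return True if the signal is asserted at the given voltage level.

    A trailing "#" marks an active-low signal (asserted when low); any
    other signal is asserted when high.
    """
    active_low = signal_name.strip().endswith("#")
    return (not level_is_high) if active_low else level_is_high


def classify_cycle(aen_is_high, dack_n_is_high):
    """Classify a bus cycle from the AEN and xDACK# pin levels."""
    aen = is_asserted("AEN", aen_is_high)
    dack = is_asserted("FDDACK#", dack_n_is_high)
    if not aen:
        return "io"        # AEN negated: device may respond to I/O commands
    if dack:
        return "dma"       # AEN and xDACK# asserted: respond as a DMA cycle
    return "ignored"       # ASSUMPTION: DMA cycle meant for another device


if __name__ == "__main__":
    print(classify_cycle(aen_is_high=False, dack_n_is_high=True))  # io
    print(classify_cycle(aen_is_high=True, dack_n_is_high=False))  # dma
```

Keeping the polarity rule in one helper means the rest of the model can be written in terms of assertion and negation, mirroring the way the signal tables are worded.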
DMA SIGNALS

FDDREQ  O  FLOPPY DISK CONTROLLER DMA REQUEST: The 82091AA asserts FDDREQ to request service from a DMA controller for the FDC module. This signal is enabled/disabled by bit 3 of the Digital Output Register (DOR). When disabled, FDDREQ is tri-stated.
FDDACK#  I  FLOPPY DISK CONTROLLER DMA ACKNOWLEDGE: The DMA controller asserts this signal to acknowledge the FDC DMA request. When asserted, the IORC# and IOWC# inputs are enabled during DMA transfers. This signal is enabled/disabled by bit 3 of the DOR.
PPDREQ  O  PARALLEL PORT DMA REQUEST: Parallel port DMA service request to the system DMA controller. This signal is only used when the parallel port is in ECP hardware mode and is always negated when the parallel port is not in this mode. In ECP hardware mode, DMA requests are enabled/disabled by bit 3 of the ECP Extended Control Register (ECR). When disabled, PPDREQ is tri-stated.
PPDACK#  I  PARALLEL PORT DMA ACKNOWLEDGE: The DMA controller asserts this signal to acknowledge the parallel port DMA request. When asserted, the IORC# and IOWC# inputs are enabled during DMA transfers. This signal is enabled/disabled by bit 3 of the ECR Register.
TC  I  TERMINAL COUNT: The system DMA controller asserts TC to indicate it has reached the last programmed data transfer. TC is accepted only when FDDACK# or PPDACK# is asserted.

INTERRUPT SIGNALS

IRQ3, IRQ4  O  INTERRUPT 3 AND 4: IRQ3 and IRQ4 are associated with the serial ports and can be programmed (via the AIPCFG2 Register) to be either active high or active low. These signals can be configured for a particular serial channel via hardware configuration (at powerup) or by software configuration. Under Hardware Configuration, IRQ3 is used as a serial port interrupt if the serial port is configured at address locations 2F8h-2FFh or 2E8h-2EFh.
IRQ4 is used as a serial port interrupt if the serial port is configured at address locations 3F8h-3FFh or 3E8h-3EFh. Under Software configuration, IRQ3 and IRQ4 are independently configured (i.e., the IRQ does not automatically track the communication port address assignment). These interrupts are enabled/disabled globally via bit 3 of the serial port Modem Control Register (MCR) and for specific conditions via the Interrupt Enable Register (IER). IRQ3 and IRQ4 are tri-stated when not enabled.
IRQ5, IRQ7  O  INTERRUPT REQUEST 5 AND 7: IRQ5 and IRQ7 are associated with the parallel port and can be programmed (via the AIPCFG2 Register) to be either active high or active low. Either IRQ5 or IRQ7 is enabled/disabled via the PCFG1 Register to signal a parallel port interrupt. The interrupt not selected is disabled and tri-stated. During hardware configuration (see Section 4.0, AIP Configuration), IRQ5 is used if the parallel port is assigned to 278h-27Fh and IRQ7 is used if the parallel port is assigned to either 3BCh-3BFh or 378h-37Fh.
IRQ6  O  INTERRUPT REQUEST 6: IRQ6 is associated with the floppy disk controller and can be programmed (via the AIPCFG2 Register) to be either active high or active low. In non-DMA mode, this signal is asserted to signal when a data transfer is ready. IRQ6 is also asserted to signal the completion of the execution phase for certain FDC commands. This signal is enabled/disabled by the DMAGATE bit in the Digital Output Register of the FDC. The signal is tri-stated when disabled.

2.2 Floppy Disk Controller Interface

Signal Name  Type  Description

RDDATA#  I  READ DATA: Serial data from the disk drive.
WRDATA#  O  WRITE DATA: MFM serial data to the disk drive. Precompensation value is selectable through software.
HDSEL  O  HEAD SELECT: Selects which side of a disk is to be accessed. When asserted (low), side 1 is selected.
When negated (high), side 0 is selected.
STEP#  O  STEP: STEP# supplies step pulses (asserted) to the drive to move the head between the tracks during a seek operation.
DIR#  O  DIRECTION: Controls the direction the head moves when a step signal is present. The head moves toward the center when DIR# is asserted and away from the center when negated.
WE#  O  WRITE ENABLE: WE# is a disk drive control signal. When asserted, WE# enables the head to write to the disk.
TRK0#  I  TRACK 0: The disk drive asserts this signal to indicate that the head is on track 0.
INDX#  I  INDEX: The disk drive asserts this signal to indicate the beginning of the track.
WP#  I  WRITE PROTECT: The disk drive asserts this signal to indicate that the disk drive is write-protected.
DSKCHG  I  DISK CHANGE: The disk drive asserts this signal to indicate that the drive door has been opened. The state of this signal input is available in the Digital Input Register (DIR).
DRIVDEN0, DRIVDEN1  O  DRIVE DENSITY: These signals are used by the disk drive to configure the drive for the appropriate media density. These signals are controlled by the FDC's Drive Specification Command.
FDME1#/DSEN#(1)  O  FLOPPY DRIVE MOTOR ENABLE 1, IDLE, OR DRIVE SELECT ENABLE: This signal pin has two functions(1). FDME1# is the motor enable for drive 1. FDME1# is directly controlled via the Digital Output Register (DOR) and is a function of the mapping based on the BOOTSEL bits in the Tape Drive Register (TDR). The Drive Select Enable (DSEN#) function is only used in a four floppy drive system (see Appendix A, FDC Four Drive Support).
FDS1#/MDS1(1)  O  FLOPPY DRIVE SELECT 1, POWERDOWN, OR MOTOR DRIVE SELECT 1: This signal pin has two functions(1). FDS1# is the floppy drive select for drive 1. FDS1# is controlled by the select bits in the DOR and is a function of the mapping based on the BOOTSEL bits in the TDR.
The Motor Drive Select 1 (MDS1) function is only used in a four floppy drive system (see Appendix A, FDC Four Drive Support).
FDME0#/MEEN#(1)  O  FLOPPY DRIVE MOTOR ENABLE 0 OR MOTOR ENABLE ENABLE: This signal pin has two functions(1). FDME0# is the motor enable for drive 0. FDME0# is directly controlled via the Digital Output Register (DOR) and is a function of the mapping based on the BOOTSEL bits in the Tape Drive Register (TDR). The Motor Enable Enable (MEEN#) function is only used in a four floppy drive system (see Appendix A, FDC Four Drive Support).
FDS0#/MDS0(1)  O  FLOPPY DRIVE SELECT 0 OR MOTOR DRIVE SELECT 0: This signal pin has two functions(1). FDS0# is the floppy drive select for drive 0. This output is controlled by the drive select bits in the DOR and is a function of the mapping based on the BOOTSEL bits in the TDR. The Motor Drive Select 0 (MDS0) function is only used in a four floppy drive system (see Appendix A, FDC Four Drive Support).

NOTE:
1. The function selected for these pins is based on the FDDQTY bit in the FCFG1 Register as shown in the following table.

Signal Pin      2 Drive System (FDDQTY = 0)   4 Drive System (FDDQTY = 1)
FDME1#/DSEN#    FDME1#                        DSEN#
FDS1#/MDS1      FDS1#                         MDS1
FDME0#/MEEN#    FDME0#                        MEEN#
FDS0#/MDS0      FDS0#                         MDS0

When FDDQTY = 1, these signal pins are used to control an external decoder for a four floppy disk drive system as described in Appendix A, FDC Four Drive Support.

2.3 Serial Port Interface

Serial Port A signal names end in the letter A and Serial Port B signal names end in the letter B; Serial Port A and B signals have the same functionality.

Signal Name  Type  Description

CTSA#, CTSB#  I  CLEAR TO SEND: When asserted, this signal indicates that the modem or data set is ready to exchange data. The CTS# signal is a modem status input whose condition the CPU can determine by reading the CTS bit in the Modem Status Register (MSR) for the appropriate serial port. The CTS bit is the complement of the CTS# signal. The DCTS bit in the MSR indicates whether the CTS# input has changed state since the previous reading of the MSR. CTS# has no effect on the transmitter.
DCDA#, DCDB#  I  DATA CARRIER DETECT: When asserted, this signal indicates that the data carrier has been detected by the modem or data set. The DCD# signal is a modem status input whose condition the CPU can determine by reading the DCD bit in the MSR for the appropriate serial port. The DCD bit is the complement of the DCD# signal. The DDCD bit in the MSR indicates whether the DCD# input has changed state since the previous reading of the MSR. DCD# has no effect on the transmitter.
DSRA#, DSRB#  I  DATA SET READY: When asserted, this signal indicates that the modem or data set is ready to establish the communications link with the serial port module. The DSR# signal is a modem status input whose condition the CPU can determine by reading the DSR bit in the MSR for the appropriate serial channel. The DSR bit is the complement of the DSR# signal. The DDSR bit in the MSR indicates whether the DSR# input has changed state since the previous reading of the MSR. DSR# has no effect on the transmitter.
DTRA#, DTRB#  I/O  DATA TERMINAL READY: DTRA#/DTRB# are outputs during normal system operations. When asserted, this signal indicates to the modem or data set that the serial port module is ready to establish a communications link. The DTR# signal can be asserted via the Modem Control Register (MCR). A hard reset negates this signal. Hardware Configuration: These signals are only inputs during hardware configuration time (RSTDRV asserted and for a short time after RSTDRV is negated). (See Section 4.0, AIP Configuration.)
RIA#, RIB#  I  RING INDICATOR: When asserted, this signal indicates that a telephone ringing signal has been received by the modem or data set. The RI# signal is a modem status input whose condition the CPU can determine by reading the RI bit in the MSR for the appropriate serial channel.
The RI bit is the complement of the RI# signal. The TERI bit in the MSR indicates whether the RI# input has changed from low to high since the previous reading of the MSR.
RTSA#, RTSB#  I/O  REQUEST TO SEND: RTSA#/RTSB# are outputs during normal system operations. When asserted, this signal informs the modem or data set that the serial port module is ready to exchange data. The RTS# signal can be asserted via the RTS bit in the Modem Control Register. A hard reset negates this signal. Hardware Configuration: These signals are only inputs during hardware configuration time (RSTDRV asserted and for a short time after RSTDRV is negated). (See Section 4.0, AIP Configuration.)
SINA, SINB  I  SERIAL INPUT: Serial data input from the communications link (peripheral device, modem, or data set).
SOUTA, SOUTB  I/O  SERIAL OUTPUT: SOUTA/SOUTB are serial data outputs to the communications link (peripheral device, modem, or data set) during normal system operations. The SOUT signal is set to a marking state (logic 1) after a hard reset. Test Mode: In test mode (selected via the SACFG2 or SBCFG2 Registers), the baudout from the baud rate generator is output on SOUTx. Hardware Configuration: These signals are only inputs during hardware configuration time (RSTDRV asserted and for a short time after RSTDRV is negated). (See Section 4.0, AIP Configuration.)

2.4 IDE Interface

Signal Name  Type  Description

IO16#  I  16-BIT I/O: This signal is driven by I/O devices on the ISA Bus to indicate support for 16-bit I/O bus cycles. The IDE interface asserts this signal to the 82091AA to indicate support for 16-bit transfers. For IDE transfers, the 82091AA asserts HEN# when IO16# is asserted.
IDECS[1:0]#  I/O  IDE CHIP SELECT: IDECS[1:0]# are outputs during normal system operation and are chip selects for the IDE interface.
IDECS[1:0]# select the Command Block Registers of the IDE device and are decoded from SA[9:3] and AEN. Hardware Configuration: These signals are only inputs during hardware configuration time (RSTDRV asserted). (See Section 4.0, AIP Configuration.)
DEN#  I/O  DATA ENABLE: DEN# is an output during normal system operations and is a data enable for an external data buffer for all 82091AA and IDE accesses. The SD[7:0] signals can be connected directly to the ISA Bus. In this case, the DEN# signal is not used. However, an external buffer can be used to isolate the SD[7:0] signals from the 240 pF loading of the ISA Bus. With an external buffer implementation, DEN# controls the external buffers for transfers to/from the ISA Bus. Hardware Configuration: This signal is only an input during hardware configuration time (RSTDRV asserted). (See Section 4.0, AIP Configuration.)
HEN#  I/O  IDE UPPER DATA TRANSCEIVER ENABLE: HEN# is an output during normal system operations and is a high byte data transceiver enable signal for the IDE hard disk drive interface. HEN# is asserted for I/O accesses to the IDE data register when the drive asserts IO16#. Hardware Configuration: This signal is only an input during hardware configuration time (RSTDRV asserted). (See Section 4.0, AIP Configuration.)

2.5 Parallel Port External Buffer Control/Game Port

Signal Name  Type  Description

PPDIR/GCS#  I/O  PARALLEL PORT DIRECTION (PPDIR) or GAME PORT CHIP SELECT (GCS#): This signal is an output during normal operations and provides the PPDIR and GCS# functions as follows:
PPDIR: This signal pin functions as a parallel port direction control output when the 82091AA is configured for software motherboard mode (SWMB). For configuration details, see Section 4.0, AIP Configuration. If external buffers are used on PD[7:0], PPDIR can be used to control the buffer direction.
The 82091AA drives this signal low when PD[7:0] are outputs and high when PD[7:0] are inputs. Note that if a configuration mode other than SWMB is selected, this signal pin is a game port chip select and does not track the PD[7:0] signal direction.
GCS#: This signal pin functions as a game port chip select output when the 82091AA configuration is set for Software Add-In (SWAI), Hardware Basic (HWB), or Hardware Extended (HWE) modes. When the host accesses I/O address 201h, GCS# is asserted.
Hardware Configuration: This signal is only an input during hardware configuration time (RSTDRV asserted). (See Section 4.0, AIP Configuration.)

2.6 Parallel Port Interface

The 82091AA parallel port is a multi-function interface that can be configured for one of four hardware modes (see Section 4.0, AIP Configuration). The hardware modes are ISA-Compatible, PS/2-Compatible, EPP, and ECP modes. These parallel port modes support the compatibility, nibble, byte, EPP, and ECP parallel interface protocols described in the IEEE 1284 standard. The operation and use of the interface signal pins are a function of the parallel port hardware mode selected and the protocol used. Table 1 shows a matrix of the 82091AA parallel port signal names and corresponding signal names for each of the protocols. Sections 2.6.1-2.6.5 provide a signal description for the five interface protocols. Note that the 82091AA hardware operations are the same for Compatibility and Nibble protocols. The signals, however, are controlled and used differently via software and the peripheral device.

Table 1.
Parallel Port Signal Name Cross Reference

82091AA    Compatibility  Nibble       Byte         EPP          ECP
Signal     Protocol       Protocol     Protocol     Protocol     Protocol
STROBE#    Strobe#        -            HostClk      Write#       HostClk
BUSY       Busy           PtrBusy      PtrBusy      Wait#        PeriphAck
ACK#       Ack#           PtrClk       PtrClk       Intr         PeriphClk#
SELECT     Select         Xflag        Xflag        Xflag        Xflag
PERROR     PError         AckDataReq   AckDataReq   AckDataReq   AckReverse#
FAULT#     Fault#         DataAvail#   DataAvail#   DataAvail#   PeriphRequest#
INIT#      Init#          Init#        Init#        Init#        ReverseRequest#
AUTOFD#    AutoFd#        HostBusy     HostBusy     DStrb#       HostAck
PD[7:0]    Data[8:1]      -            Data[8:1]    Data[8:1]    Data[8:1]
SELECTIN#  SelectIn#      -            -            AStrb#       ECPMode

NOTE: Not all parallel port signal pins are used for certain parallel port interface protocols. These signals are labeled "-".

2.6.1 COMPATIBILITY PROTOCOL SIGNAL DESCRIPTION

Except for the data bus, the 82091AA and compatibility protocol signal names are the same. For the data bus, the 82091AA signal names PD[7:0] correspond to the compatibility protocol signal names Data[8:1].

82091AA Signal Name  Type  Compatibility Protocol Signal Name and Description

STROBE#  O  STROBE: The host asserts STROBE# to latch data into the peripheral device's input latch. This signal is controlled via the PCON Register.
BUSY  I  BUSY: BUSY is asserted by the peripheral to indicate that the peripheral device is not ready to receive data. The status of this signal line is reported in the PSTAT Register.
ACK#  I  ACKNOWLEDGE: The printer asserts this signal to indicate that it has received the data and is ready for new data. The status of this signal line is reported in the PSTAT Register.
SELECT  I  SELECT: SELECT is asserted by the peripheral device to indicate that the device is on line. The status of this signal line is reported in the PSTAT Register.
PERROR  I  PAPER ERROR: The peripheral device asserts PERROR to indicate that it has encountered an error in the paper path.
The exact meaning varies from peripheral device to peripheral device. The status of this signal line is reported in the PSTAT Register.
FAULT#  I  FAULT: FAULT# is asserted by the peripheral device to indicate that an error has occurred. The status of this signal line is reported in the PSTAT Register.
INIT#  O  INITIALIZE: The host asserts INIT# to issue a hardware reset to the peripheral device. This signal is controlled via the PCON Register.
AUTOFD#  O  AUTO FEED: AUTOFD# is asserted by the host to put the peripheral device into auto-line feed mode. This means that when software asserts this signal, the printer is instructed to advance the paper one line for each carriage return encountered. This signal is controlled via the PCON Register.
PD[7:0]  O  DATA: Forward channel data.
SELECTIN#  O  SELECT INPUT: SELECTIN# is asserted by the host to select a peripheral device. This signal is controlled via the PCON Register.

2.6.2 NIBBLE PROTOCOL SIGNAL DESCRIPTION

The Nibble protocol assigns the following signal operation to the parallel port pins. The name in bold at the beginning of the signal description column is the Nibble protocol signal name. The terms assert and negate are used in accordance with the 82091AA signal name as described at the beginning of Section 2.0. For example, AUTOFD# (HostBusy) asserted refers to AUTOFD# (HostBusy) at a low level.

82091AA Signal Name  Type  Nibble Protocol Signal Name and Description

STROBE#  O  STROBE: The host controls this signal via the PCON Register and STROBE# should be held negated by the host.
BUSY  I  PRINTER BUSY (PtrBusy): The peripheral drives this signal to transfer data bits 3 and 7 sequentially. The status of this signal line is reported in the PSTAT Register.
ACK#  I  PRINTER CLOCK (PtrClk): The peripheral device asserts ACK# (PtrClk) to indicate to the host that data is available. The signal is subsequently asserted to qualify data being sent to the host.
The status of this signal line is reported in the PSTAT Register. If interrupts are enabled via the PCON Register, the assertion of this signal causes a host interrupt to be generated.
SELECT  I  XFLAG: The peripheral device drives this signal to transfer data bits 1 and 5 sequentially. The status of this signal line is reported in the PSTAT Register.
PERROR  I  ACKNOWLEDGE DATA REQUEST (AckDataReq): This signal is initially high. The peripheral device drives this signal low to acknowledge HostBusy assertion. PERROR is subsequently used to transfer data bits 2 and 6 sequentially. The status of this signal line is reported in the PSTAT Register.
FAULT#  I  DATA AVAILABLE (DataAvail): The peripheral device asserts FAULT# (DataAvail) to indicate data availability. It is subsequently used to transfer data bits 0 and 4 sequentially. The status of this signal line is reported in the PSTAT Register.
INIT#  O  INITIALIZE: The host controls this signal via the PCON Register.
AUTOFD#  O  HOST BUSY (HostBusy): The host negates AUTOFD# (HostBusy) in response to ACK# being asserted. This signal is subsequently driven low to enable the peripheral to transfer data to the host. AUTOFD# is then driven high to acknowledge receipt of byte data. This signal is controlled via the PCON Register.
PD[7:0]  O  DATA: This is the 8-bit output data path to the peripheral. Host data is written to the peripheral attached to the parallel port interface on these signal lines.
SELECTIN#  O  SELECT INPUT: This signal is controlled by the PCON Register.

2.6.3 BYTE MODE SIGNAL DESCRIPTION

The Byte protocol assigns the following signal operation to the parallel port pins. The name in bold at the beginning of the signal description column is the Byte protocol signal name. The terms assert and negate are used in accordance with the 82091AA signal name as described at the beginning of Section 2.0. For example, STROBE# (HostClk) asserted refers to STROBE# (HostClk) at a low level.
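For comparison with the Byte protocol, the Nibble protocol bit assignments in Section 2.6.2 (FAULT# carries bits 0 and 4, SELECT bits 1 and 5, PERROR bits 2 and 6, BUSY bits 3 and 7) can be sketched as a byte reassembly routine. This is a minimal sketch under stated assumptions: the function names are hypothetical, each argument is taken to be the logical data bit already recovered from the corresponding line, and the low nibble is assumed to arrive first. Reading the lines through the PSTAT Register and driving the HostBusy handshake are not modeled.

```python
# Sketch of reassembling one byte from two Nibble protocol transfers.
# ASSUMPTIONS: arguments are logical data bits (0 or 1) already recovered
# from the status lines, and the low nibble is transferred first.


def pack_nibble(fault_bit, select_bit, perror_bit, busy_bit):
    """Combine the four status-line data bits into a 4-bit value.

    FAULT# -> bit 0, SELECT -> bit 1, PERROR -> bit 2, BUSY -> bit 3.
    """
    for bit in (fault_bit, select_bit, perror_bit, busy_bit):
        if bit not in (0, 1):
            raise ValueError("each line must contribute a single bit")
    return fault_bit | (select_bit << 1) | (perror_bit << 2) | (busy_bit << 3)


def assemble_byte(first_transfer, second_transfer):
    """Build a byte from two (fault, select, perror, busy) tuples.

    The first transfer supplies bits 0-3 and the second supplies bits 4-7.
    """
    low = pack_nibble(*first_transfer)
    high = pack_nibble(*second_transfer)
    return low | (high << 4)


if __name__ == "__main__":
    # Bits 0-3 = 0101b, bits 4-7 = 1010b
    value = assemble_byte((1, 0, 1, 0), (0, 1, 0, 1))
    print(hex(value))
```

The Byte protocol avoids this two-step reassembly because PD[7:0] is bi-directional and carries the whole byte in one transfer.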
82091AA Signal Name  Type  Byte Protocol Signal Name and Description

STROBE#  O  HOST CLOCK (HostClk): This signal is strobed low by the host to acknowledge receipt of data. Note that the peripheral must not interpret this as a latch strobe for forward channel data.
BUSY  I  PRINTER BUSY (PtrBusy): The peripheral device asserts BUSY (PtrBusy) to provide forward channel peripheral busy status. The status of this signal line is reported in the PSTAT Register.
ACK#  I  PRINTER CLOCK (PtrClk): The peripheral device asserts ACK# (PtrClk) to indicate to the host that data is available. The signal is subsequently asserted to qualify data being sent to the host. The status of this signal line is reported in the PSTAT Register. If interrupts are enabled via the PCON Register, the assertion of this signal causes a host interrupt to be generated.
SELECT  I  XFLAG: SELECT (Xflag) is asserted by the peripheral device to indicate that the device is on line. The status of this signal line is reported in the PSTAT Register.
PERROR  I  ACKNOWLEDGE DATA REQUEST (AckDataReq): This signal is initially high. The peripheral device drives this signal low to acknowledge HostBusy assertion. The status of this signal line is reported in the PSTAT Register.
FAULT#  I  DATA AVAILABILITY (DataAvail): The peripheral device asserts FAULT# (DataAvail) to indicate data availability. The status of this signal line is reported in the PSTAT Register.
INIT#  O  INITIALIZE: The host controls this signal via the PCON Register and INIT# should be held in the negated state.
AUTOFD#  O  HOST BUSY (HostBusy): The host negates AUTOFD# (HostBusy) in response to ACK# being asserted. The signal is subsequently driven low to enable the peripheral to transfer data to the host. AUTOFD# is then driven high to acknowledge receipt of nibble data. This signal is controlled via the PCON Register.
PD[7:0]  I/O  DATA: This 8-bit data bus is used for bi-directional data transfer.
SELECTIN#  O  SELECT INPUT: This signal is controlled by the PCON Register.

2.6.4 ENHANCED PARALLEL PORT (EPP) PROTOCOL SIGNAL DESCRIPTION

EPP protocol assigns the following signal operation to the parallel port pins. The name in bold at the beginning of the signal description column is the EPP mode signal name. The terms assert and negate are used in accordance with the 82091AA signal name as described at the beginning of Section 2.0. For example, BUSY (Wait#) asserted refers to BUSY (Wait#) being high.

82091AA Signal Name  Type  EPP Protocol Signal Name and Description

STROBE#  O  WRITE (Write#): STROBE# (Write#) indicates an address or data read/write operation to the peripheral. The 82091AA drives this signal low for a write and high for a read.
BUSY  I  WAIT (Wait#): The peripheral sets BUSY (Wait#) low to indicate that the device is not ready. When the BUSY signal is low, the 82091AA negates IOCHRDY on the ISA Bus to lengthen the I/O cycles. The peripheral device sets BUSY (Wait#) high to indicate that transfer of data or address is completed.
ACK#  I  INTERRUPT REQUEST (Intr): The peripheral asserts ACK# (Intr) to generate an interrupt to the host. When this signal is low and interrupts are enabled via bit 4 of the PCON Register, the 82091AA generates an interrupt request (via either IRQ5 or IRQ7) to the host.
SELECT  I  SELECT: SELECT is asserted by the peripheral device to indicate that the device is on line. The status of this signal line is reported in the PSTAT Register.
PERROR  I  PAPER ERROR: The peripheral device asserts PERROR to indicate that it has encountered an error in the paper path. The exact meaning varies from peripheral device to peripheral device. The status of this signal line is reported in the PSTAT Register.
FAULT#  I  FAULT: FAULT# is asserted by the peripheral device to indicate that an error has occurred. The status of this signal line is reported in the PSTAT Register.
INIT#  O  INITIALIZE: The host asserts INIT# to issue a hardware reset to the peripheral device. This signal is controlled via the PCON Register.
AUTOFD#  O  DATA STROBE (DStrb#): The 82091AA asserts AUTOFD# (DStrb#) to indicate that valid data is present on PD[7:0] and is used by the peripheral to latch data during write cycles. For reads, the 82091AA reads in data from PD[7:0] when this signal is asserted.
PD[7:0]  I/O  DATA: This 8-bit bi-directional bus provides addresses or data during the write cycles and supplies addresses or data to the 82091AA during the read cycles.
SELECTIN#  O  ADDRESS STROBE (AStrb#): The 82091AA asserts SELECTIN# (AStrb#) to indicate that a valid address is present on PD[7:0] and is used by the peripheral to latch addresses during write cycles. For reads, the 82091AA reads in an address from PD[7:0] when this signal is asserted.

2.6.5 EXTENDED CAPABILITIES PORT (ECP) PROTOCOL SIGNAL DESCRIPTION

ECP protocol assigns the following signal operation to the parallel port pins. The name in bold at the beginning of the signal description column is the ECP protocol signal name. The terms assert and negate are used in accordance with the 82091AA signal name as described at the beginning of Section 2.0. For example, STROBE# (HostClk) asserted refers to STROBE# (HostClk) being low.

82091AA Signal Name  Type  ECP Protocol Signal Name and Description

STROBE#  O  HOST CLOCK (HostClk): In the forward direction, the 82091AA asserts STROBE# (HostClk) to instruct the peripheral to latch the data on PD[7:0]. During write operations, the peripheral should latch data on the rising edge of STROBE# (HostClk). STROBE# (HostClk) handshakes with BUSY (PeriphAck) during write operations and is negated after the 82091AA detects BUSY (PeriphAck) asserted. STROBE# (HostClk) is not asserted by the 82091AA again until BUSY (PeriphAck) is detected negated. For read operations (reverse direction), STROBE# (HostClk) is not used.
BUSY   I   PERIPHERAL ACKNOWLEDGE (PeriphAck): The peripheral device asserts this signal during a host write operation to acknowledge receipt of data. The peripheral device then negates the signal after STROBE# is detected high to terminate the transfer. For host write operations (forward direction), this signal handshakes with STROBE# (HostClk). During a host read operation (reverse direction), BUSY (PeriphAck) is normally low and is driven high by the peripheral to identify Run Length Encoded (RLE) data.

ACK#   I   PERIPHERAL CLOCK (PeriphClk): During a peripheral to host transfer (reverse direction), ACK# (PeriphClk) is asserted by the peripheral to indicate data is valid on the data bus and then negated after AUTOFD# is detected high. This signal handshakes with AUTOFD# to transfer data.

SELECT   I   XFLAG (Xflag): This signal is asserted by the peripheral to indicate that it is on line. The status of this signal line is reported in the PSTAT Register.

PERROR   I   ACKNOWLEDGE REVERSE (AckReverse#): PERROR (AckReverse#) is driven low by the peripheral to acknowledge a reverse transfer request by the host. This signal handshakes with INIT# (ReverseRequest#). The status of this signal line is reported in the PSTAT Register.

FAULT#   I   PERIPHERAL REQUEST (PeriphRequest#): The peripheral asserts FAULT# (PeriphRequest#) to request a reverse transfer. The status of this signal line is reported in the PSTAT Register.

INIT#   O   REVERSE REQUEST (ReverseRequest#): The host controls this signal via the PCON Register to indicate the transfer direction. The host asserts this signal to request a reverse transfer direction and negates the signal for a forward transfer direction.

AUTOFD#   O   HOST ACKNOWLEDGE (HostAck): The 82091AA asserts AUTOFD# (HostAck) to request data from the peripheral (reverse direction). This signal handshakes with ACK# (PeriphClk). AUTOFD# (HostAck) is negated when the peripheral indicates a valid state of the data bus (i.e., ACK# is detected asserted).
In the forward direction, AUTOFD# (HostAck) indicates whether PD[7:0] contain an address/RLE or data. The 82091AA asserts this signal to identify an address/RLE transfer and negates it to identify a data transfer.

PD[7:0]   I/O   DATA: PD[7:0] is a bi-directional data bus that transfers data, addresses, or RLE data.

SELECTIN#   O   ECP MODE (ECPmode): The host (via the PCON Register) negates this signal during ECP mode operation.

2.7 Hard Reset Signal Conditions

Table 2 shows the state of all 82091AA output and bi-directional signals during hard reset (RSTDRV asserted). The strapping options described in Section 4.0, AIP Configuration, are sampled when the 82091AA is hard reset.

Table 2. Output and I/O Signal States During a Hard Reset

Signal Name ACK# AEN AUTOFD# BUSY CTS[A,B]# DCD[A,B]# State Tri-state Signal Name State Signal Name HDSEL High RSTDRV HEN# High(1) RTS[A,B]# IDECS[1:0]# High(1) SA[10:0] - - SD[7:0] Tri-state SELECT - - INDX# High IO16# INIT# Low - SELECTIN# DEN# High(1) IOCHRDY DIR# High IORC# IOWC# - SOUT[A,B] Low DRVDEN[1:0] Tri-state(2) SIN[A,B] STEP# DSKCHG# - IRQ[7:3] Tri-state STROBE# DTR[A,B]# High(1) NOWS# Tri-state TC - PD[7:0] Low FAULT# FDDACK# FDDREQ Tri-state PERROR PPDACK# FDME0#/MEEN# High PPDREQ FDME1#/DSEN# High PPDIR/GCS# FDS0#/MDS0 High RDDATA FDS1#/MDS1 High RI[A,B]# TRK0# - WE# - WP# Tri-state High(1) - WRDATA# X1/OSC X2 State High(1) Tri-state High(1) High Tri-state High High -

NOTES:
1. During and immediately after a hard reset, this signal is an input for hardware configuration. After the hardware configuration time, these signals go to the state specified in the table.
2. If IORC# or IOWC# is asserted, IOCHRDY will be asserted by the 82091AA.
3. Dashes represent input signals.

2.8 Power And Ground

Signal Name   Type   Description

Vss   I   GROUND: The ground reference for the 82091AA.
Vcc   I   POWER: The 5V/3.3V(1) modes are selected via strapping options at powerup (see Section 4.2, Hardware Configuration). When the strapping options (VSEL) are set to 5V, the Vcc pins must be connected to 5V. When the strapping options are set to 3.3V, the Vcc pins must be connected to 3.3V.

VccF   I   POWER: The 5V/3.3V(1) power supply for the 82091AA. In 5V or 3.3V power supply modes (non-mixed mode), the voltage applied to VccF is the same voltage as applied to Vcc. For mixed mode operations, 5V is applied to VccF. This voltage provides the 5V reference for the parallel port and floppy disk controller interfaces. Note that in mixed mode, 3.3V is applied to Vcc.

NOTE:
1. 3.3V operation is available only in the 82091AA.

3.0 I/O ADDRESS ASSIGNMENTS

The 82091AA assigns CPU I/O address locations to its game port chip select, IDE interface, serial ports, parallel port, floppy disk controller, and the 82091AA configuration registers as indicated in Table 3. Except for the game port chip select (address 201h), address assignments are configurable. For example, the serial port can be assigned to one of eight address blocks. The parallel port can be assigned to one of three address blocks, and the IDE interface and floppy disk controller can be assigned to one of two address blocks. These address assignments are made during 82091AA configuration (either hardware configuration at powerup or a hard reset, or software configuration by programming the 82091AA configuration registers). In addition, the 82091AA configuration registers can be located at one of two address blocks during hardware configuration.

All of the 82091AA address locations are located in the host I/O address space. The address block assignments are shown in Table 3. The first hex address in the Address Block column represents the base address for that particular block.

Table 3. AIP Address Assignments

Address Block (ISA Bus)   Assignment
170-177h   IDE Interface - Secondary Address Block
1F0-1F7h   IDE Interface - Primary Address Block
201h       Game Port Chip Select
220-227h   Serial Port
228-22Fh   Serial Port
238-23Fh   Serial Port
26E-26Fh   82091AA Configuration Registers - Primary Address Block (022-023h on X-Bus)
278-27Fh   Parallel Port
2E8-2EFh   Serial Port
2F8-2FFh   Serial Port
338-33Fh   Serial Port
370-377h   Floppy Disk Controller - Secondary Address Block (376h and 377h are shared with the IDE Drive Interface Secondary Address)
378-37Fh   Parallel Port
398-399h   82091AA Configuration Registers - Secondary Address Block (024-025h on X-Bus)
3BC-3BFh   Parallel Port (All Modes Except EPP)
3E8-3EFh   Serial Port
3F0-3F7h   Floppy Disk Controller - Primary Address Block (3F6h and 3F7h are shared with the IDE Drive Interface Primary Address)
3F8-3FFh   Serial Port
678-67Ah   Parallel Port (ECP Mode Peripheral Interface Protocol)
778-77Ah   Parallel Port (ECP Mode Peripheral Interface Protocol)
7BC-7BEh   Parallel Port (ECP Mode Peripheral Interface Protocol)

NOTES:
1. The 82091AA does not contain IDE registers. However, the 82091AA provides the address block assignments for accessing the IDE registers that are located in the IDE device.
2. The standard PC/AT* compatible logical I/O address assignments are supported. For example, COM1 (3F8-3FFh) and COM2 (2F8-2FFh) are part of the serial port assignments and LPT1 (3BC-3BFh), LPT2 (378-37Fh), and LPT3 (278-27Fh) are part of the parallel port assignments.

*Other brands and names are the property of their respective owners.

4.0 AIP CONFIGURATION

82091AA configuration consists of setting up overall device operations along with certain functions pertaining to the individual 82091AA modules (parallel port, serial ports, floppy disk controller, and IDE interface). Overall device operations include selecting the clock frequency, power supply voltage, and address assignment for the configuration registers.
Overall device operations also enable/disable access to the configuration registers and provide interrupt signal level control. For the individual modules, 82091AA configuration includes module address assignment, interrupt control, module enable/disable, powerdown control, test mode control, module reset, and certain functions specific to each module. The remainder of the functions unique to each module are handled via the individual module registers.

Two methods are provided for configuring the 82091AA: hardware configuration via strapping options at powerup (or whenever RSTDRV is asserted) and software configuration by programming the configuration registers. (For information on hardware configuration, see Section 4.2, Hardware Configuration. For information on software configuration, see Section 4.1, Configuration Registers.)

NOTES:
1. There are four hardware configuration modes: SWMB (Software Motherboard), SWAI (Software Add-In), HWB (Hardware Basic), and HWE (Hardware Extended). Some of these modes can be used without the need for programming the 82091AA configuration registers. Other modes use both hardware configuration strapping options and programming of the configuration registers to set up the 82091AA.
2. The 82091AA's operating power supply voltage level, 82091AA clock frequency, and address assignment for the 82091AA configuration registers can only be configured by hardware configuration.

4.1 Configuration Registers

82091AA Configuration Space contains 13 configuration registers. Four of the registers (Product and Revision Identification Registers and the 82091AA Configuration 1 and 2 Registers) provide control and status information for the entire chip. In addition, two registers each for the floppy disk controller, parallel port, serial port A, and serial port B and one register for the IDE interface provide certain module status and control information.
The 82091AA configuration registers are indirectly addressed by first writing to the 82091AA Configuration Index Register as described in Section 4.1.1. Thus, the 13 configuration registers occupy two address locations in the host's I/O address space: one for indirectly selecting the specific configuration register and the other for transferring register data. All 82091AA configuration registers are 8 bits wide and are accessed as byte quantities.

Some of the 82091AA configuration registers described in this section contain reserved bits. These bits are labeled "R". Software must deal correctly with fields that are reserved. On reads, software must use appropriate masks to extract the defined bits and not rely on reserved bits being any particular value. On writes, software must ensure that the values of reserved bit positions are preserved. That is, the value of reserved bit positions must first be read, merged with the new values for other bit positions, and then written back.

In addition to reserved bits within a register, the 82091AA configuration space contains address locations that are labeled "Reserved" (Table 5). While the 82091AA responds to accesses to these I/O addresses by completing the host cycle, writing to a reserved I/O address can result in unintended device operations. To permit future expansion and upgrades, values read from a reserved I/O address should not be used.

During a hard reset (RSTDRV asserted), the 82091AA sets its configuration registers to pre-determined default states. The default values are indicated in the individual register descriptions.

The following nomenclature is used for register access attributes:

RO    Read Only. If a register is read only, writes have no effect.
R/W   Read/Write. A register with this attribute can be read and written. Note that individual bits in some read/write registers may be read only.
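The reserved-bit rules above amount to a masked read on input and a read-modify-write on output. The following is a minimal sketch of that bit arithmetic in Python. The helper names and the idea of passing a "defined bits" mask are illustrative assumptions, not part of the 82091AA documentation. The example mask corresponds to the AIPCFG2 Register (index 03h), whose bits 2:0 are reserved.

```python
# Sketch only: pure bit arithmetic, no real port I/O is performed.
# Function names and the mask parameter are hypothetical.

def extract_defined(value: int, defined_mask: int) -> int:
    """On reads: keep only the defined bits, ignore reserved bits."""
    return value & defined_mask & 0xFF


def merge_preserving_reserved(current: int, new_bits: int, defined_mask: int) -> int:
    """On writes: take defined bits from new_bits, reserved bits from current."""
    for name, v in (("current", current), ("new_bits", new_bits),
                    ("defined_mask", defined_mask)):
        if not 0 <= v <= 0xFF:
            raise ValueError(f"{name} must be an 8-bit value")
    reserved_mask = ~defined_mask & 0xFF
    return (current & reserved_mask) | (new_bits & defined_mask)


# AIPCFG2 (index 03h): bits 7:3 are defined, bits 2:0 are reserved.
AIPCFG2_DEFINED = 0xF8

# Suppose the register reads back with some reserved bits set.
current = 0b0000_0101
# Request active-low mode for IRQ7 (bit 7) and IRQ3 (bit 3).
value_to_write = merge_preserving_reserved(current, 0b1000_1000, AIPCFG2_DEFINED)
```

In a real driver, `current` would come from a read of the target port and `value_to_write` would be written back through the same port.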
4.1.1 CFGINDX, CFGTRGT-CONFIGURATION INDEX REGISTER AND TARGET PORT

I/O Address:     Hardware Configurable (see Table 4)
Default Value:   00h
Attribute:       Read/Write
Size:            8 bits

CFGINDX and CFGTRGT are used to access 82091AA configuration space where all of the 82091AA configuration registers are located. CFGINDX and CFGTRGT are located in the host I/O address space and the address locations are hardware configurable as shown in Table 4. CFGINDX is an 8-bit register that contains the address index of the 82091AA configuration register to be accessed. CFGTRGT is a port for reading data from or writing data to the configuration register whose index address matches the address stored in the CFGINDX Register. Thus, to access a configuration register, CFGINDX must first be programmed with the index address. A software example is provided in this section demonstrating how to access the configuration registers.

Table 4. Configuration Register Access Addresses

Address Selection    X-Bus Implementation    ISA Bus Implementation
                     Index    Target         Index    Target
Primary Address      22h      23h            26Eh     26Fh
Secondary Address    24h      25h            398h     399h

Table 5 summarizes the 82091AA configuration space. Following the table is a detailed description of each register. The register descriptions are arranged in the order that they appear in Table 5.

Bit   Description
7:0   82091AA Configuration Register Address Index: Bits[7:0] correspond to SD[7:0].

Software Configuration Access Addresses for the Software Configuration Modes:

Mode and Address Selection                          Index    Target
SWMB Mode, Primary Address                          22h      23h
SWMB Mode, Secondary Address                        24h      25h
SWAI, HWE, and HWB Modes, Primary Address           26Eh     26Fh
SWAI, HWE, and HWB Modes, Secondary Address         398h     399h

The following pseudo code sequence could be used to access the configuration registers under the SWMB primary address:

Configuration register write:
    OUT 22h, ConfigRegAddr
    OUT 23h, ConfigRegData

Configuration register read:
    OUT 22h, ConfigRegAddr
    IN 23h

Table 5. AIP Configuration Registers

82091AA Configuration Address Index   Abbreviation   Register Name                                  Access
00h        AIPID      Product Identification                         RO
01h        AIPREV     Revision Identification                        RO
02h        AIPCFG1    82091AA Configuration 1                        R/W
03h        AIPCFG2    82091AA Configuration 2                        R/W
04-0Fh     -          Reserved                                       -
10h        FCFG1      FDC Configuration                              R/W
11h        FCFG2      FDC Power Management and Status                R/W
12-1Fh     -          Reserved                                       -
20h        PCFG1      Parallel Port Configuration                    R/W
21h        PCFG2      Parallel Port Power Management and Status      R/W
22-2Fh     -          Reserved                                       -
30h        SACFG1     Serial Port A Configuration                    R/W
31h        SACFG2     Serial Port A Power Management and Status      R/W
32-3Fh     -          Reserved                                       -
40h        SBCFG1     Serial Port B Configuration                    R/W
41h        SBCFG2     Serial Port B Power Management and Status      R/W
42-4Fh     -          Reserved                                       -
50h        ICFG       IDE Configuration                              R/W
51-FFh     -          Reserved                                       -

NOTE: Writing to a reserved I/O address should not be attempted and can result in unintended device operations.

4.1.2 AIPID-AIP IDENTIFICATION REGISTER

Index Address:   00h
Default Value:   A0h
Attribute:       Read Only
Size:            8 bits

Bit   Description
7:0   AIP IDENTIFICATION (AIPID): A value of A0h is assigned to the 82091AA. This 8-bit register combined with the 82091AA Revision Identification Register uniquely identifies the device.

4.1.3 AIPREV-AIP REVISION IDENTIFICATION

Index Address:   01h
Default Value:   00h
Attribute:       Read Only
Size:            8 bits

This register contains two fields that identify the revision of the 82091AA device. The revision number will be incremented for every stepping, even if the change is invisible to software.

Figure 6. AIP Revision Identification Register (bits 7:4 Stepping#, bits 3:0 Dash#; Revision Identification Number, RO)

Bit   Description
7:4   STEP NUMBER: Contains the hexadecimal representation of the device stepping.
3:0   DASH NUMBER: Contains the hexadecimal representation of the dash number of the device stepping.
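The index/target access sequence and the AIPREV field layout described above can be sketched in Python. This is only an illustration under stated assumptions: `FakeAip` is a hypothetical in-memory stand-in for the two I/O ports (it performs no real port I/O), its behavior for unimplemented indexes (reads return 00h) is an assumption made for the model, and the helper names are invented for this example. The port numbers default to the SWMB primary address pair (22h/23h) from Table 4.

```python
class FakeAip:
    """Hypothetical in-memory model of the CFGINDX/CFGTRGT port pair."""

    def __init__(self, index_port=0x22, target_port=0x23):
        self.index_port = index_port
        self.target_port = target_port
        self._index = 0x00
        # Defaults from the register descriptions: AIPID = A0h, AIPREV = 00h.
        self._regs = {0x00: 0xA0, 0x01: 0x00}
        self._read_only = {0x00, 0x01}

    def outb(self, port, value):
        if port == self.index_port:
            self._index = value & 0xFF          # select a configuration register
        elif port == self.target_port:
            if self._index not in self._read_only:
                self._regs[self._index] = value & 0xFF
        else:
            raise ValueError(f"unmodeled port {port:#x}")

    def inb(self, port):
        if port == self.index_port:
            return self._index
        if port == self.target_port:
            return self._regs.get(self._index, 0x00)  # model assumption
        raise ValueError(f"unmodeled port {port:#x}")


def read_config(dev, index):
    """OUT index port, index; IN target port."""
    dev.outb(dev.index_port, index)
    return dev.inb(dev.target_port)


def write_config(dev, index, value):
    """OUT index port, index; OUT target port, value."""
    dev.outb(dev.index_port, index)
    dev.outb(dev.target_port, value)


def decode_aiprev(value):
    """Split AIPREV into stepping (bits 7:4) and dash number (bits 3:0)."""
    return {"stepping": (value >> 4) & 0x0F, "dash": value & 0x0F}


dev = FakeAip()
chip_id = read_config(dev, 0x00)
revision = decode_aiprev(read_config(dev, 0x01))
```

On real hardware the `outb`/`inb` calls would be replaced by whatever port I/O mechanism the platform provides; the two-step sequence (write index, then access target) is the part taken from the text.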
4.1.4 AIPCFG1-AIP CONFIGURATION 1 REGISTER

Index Address:   02h
Default Value:   Depends upon hardware strap
Attribute:       Read/Write
Size:            8 bits

The AIPCFG1 Register enables/disables master clock circuitry for power management, enables/disables access to the configuration registers, and selects the 82091AA configuration mode. This register provides status for certain hardware configuration selections: the 82091AA clock frequency, power supply voltage, and address assignment for the configuration registers (address locations of the INDEX and TARGET Registers).

Figure 7. AIP Configuration 1 Register

NOTES:
*3.3V operation is available only in the 82091AA.
X = Value is determined by hardware strapping options as described in Section 4.2, Hardware Configuration.

Bit   Description
7     NOT USED: Always write 0.
6     VOLTAGE SELECT (VSEL): This bit indicates whether 3.3V or 5V has been selected for the operating power supply voltage during hardware configuration. A 1 indicates that 3.3V is selected and a 0 indicates that 5V is selected. This bit is read only and writes have no effect. NOTE: 3.3V operation is available only in the 82091AA.
5:4   CONFIGURATION MODE SELECT (CFGMOD): These bits indicate the configuration mode for the 82091AA. After a hard reset, these bits reflect the mode selected by hardware configuration. If configuration register access is not locked out during hardware configuration, software can change the configuration mode by writing to this field.
For configuration mode details, see Section 4.2, Hardware Configuration.

      Bits[5:4]   Configuration Mode
      00          Software Motherboard (SWMB)
      01          Software Add-in (SWAI)
      10          Extended Hardware (HWE)
      11          Basic Hardware (HWB)

3     CONFIGURATION ADDRESS SELECT (CFGADS): This read only bit indicates the address assignment for the 82091AA configuration registers as selected by hardware configuration. Hardware configuration selects between primary addresses (22h/23h and 26Eh/26Fh) and secondary addresses (24h/25h and 398h/399h) for accessing the 82091AA configuration registers. When CFGADS = 0, the primary addresses are selected and when CFGADS = 1, the secondary addresses are selected.
2     RESERVED
1     RESERVED
0     CLOCK OFF (CLKOFF): The CLKOFF bit is used to implement clock circuitry power management. When CLKOFF = 0, the main clock circuitry is powered on. When CLKOFF = 1, the main clock circuitry is powered off. This capability is independent of the 82091AA's powerdown state. Note that auto powerdown mode and powerdown have no effect on the power state of the clock circuitry.

4.1.5 AIPCFG2-AIP CONFIGURATION 2 REGISTER

Index Address:   03h
Default Value:   00000RRR
Attribute:       Read/Write
Size:            8 bits

This register selects the active signal level for IRQ[7:3]. The interrupt signals can be individually programmed for either active high or active low drive characteristics. The active high mode is ISA (non-share) compatible and has a tri-state drive characteristic. The active low mode is EISA (sharable) compatible and has an open collector drive characteristic.
Figure 8. AIP Configuration 2 Register

Bit   Description
7     IRQ7 MODE SELECT (IRQ7MOD): When IRQ7MOD = 0, IRQ7 is an active high tri-state drive signal. When IRQ7MOD = 1, IRQ7 is an active low open collector drive signal.
6     IRQ6 MODE SELECT (IRQ6MOD): When IRQ6MOD = 0, IRQ6 is an active high tri-state drive signal. When IRQ6MOD = 1, IRQ6 is an active low open collector drive signal.
5     IRQ5 MODE SELECT (IRQ5MOD): When IRQ5MOD = 0, IRQ5 is an active high tri-state drive signal. When IRQ5MOD = 1, IRQ5 is an active low open collector drive signal.
4     IRQ4 MODE SELECT (IRQ4MOD): When IRQ4MOD = 0, IRQ4 is an active high tri-state drive signal. When IRQ4MOD = 1, IRQ4 is an active low open collector drive signal.
3     IRQ3 MODE SELECT (IRQ3MOD): When IRQ3MOD = 0, IRQ3 is an active high tri-state drive signal. When IRQ3MOD = 1, IRQ3 is an active low open collector drive signal.
2:0   RESERVED

4.1.6 FCFG1-FDC CONFIGURATION REGISTER

Index Address:   10h
Default Value:   0RRR RR01
Attribute:       Read/Write
Size:            8 bits

This register selects between a 2 and 4 floppy drive system, selects the primary/secondary ISA address range for the FDC, and enables/disables the FDC. All bits in this register are read/write.
Figure 9. FDC Configuration Register

NOTES:
*Default shown is for SWMB, SWAI, and HWB hardware configuration modes. For HWE, the default is determined by hardware strapping options as described in Section 4.2, Hardware Configuration.
**Default shown is for SWMB and SWAI configuration modes. For HWB and HWE configuration modes, the default is determined by hardware strapping options as described in Section 4.2, Hardware Configuration.

Bit   Description
7     FLOPPY DISK DRIVE QUANTITY (FDDQTY): This bit selects between two and four floppy disk drive capability. When FDDQTY = 0, the 82091AA can control two floppy disk drives directly without an external decoder. When FDDQTY = 1, the 82091AA can control four floppy disk drives with an external decoder. When FDDQTY = 1, the POEN feature in the powerdown command is disabled. For further details, see Appendix A, FDC Four Drive Support. This bit can be configured by hardware extended configuration (HWE) at powerup. For all other hardware configuration modes (SWMB, SWAI, and HWB), the floppy disk drive quantity is not configurable by hardware strapping options and defaults to 2 drives.
6:2   RESERVED
1     FLOPPY DISK CONTROLLER ADDRESS SELECT (FADS): When FADS = 0, the primary FDC address (3F0-3F7) is selected. When FADS = 1, the secondary FDC address (370-377) is selected. For SWMB and SWAI configuration modes, the default is 0 (primary address). For HWB and HWE hardware configuration modes, the default is determined by signal pin strapping options.
0     FLOPPY DISK CONTROLLER ENABLE (FEN): This bit enables/disables the FDC. When FEN = 1, the FDC is enabled. When FEN = 0, the FDC module is disabled.
For SWMB and SWAI configuration modes, the default is 1 (enabled). For HWB and HWE hardware configuration modes, the default is determined by signal pin strapping options. Note that, when the FDC is disabled, IRQ6 and FDDREQ are tri-stated.

4.1.7 FCFG2-FDC POWER MANAGEMENT AND STATUS REGISTER

Index Address:   11h
Default Value:   RRRR 0000
Attribute:       Read/Write
Size:            8 bits

This register enables/disables FDC auto powerdown and can place the FDC into direct powerdown. The register also provides FDC idle status and FDC reset control.

Figure 10. FDC Power Management and Status Register

Bit   Description
7:4   RESERVED
3     FLOPPY DISK AUTO POWERDOWN ENABLE (FAPDN): This bit is used to enable/disable auto powerdown for the FDC. When FAPDN = 1, the FDC will enter auto powerdown when the required conditions are met. When FAPDN = 0, FDC auto powerdown is disabled.
2     FLOPPY DISK CONTROLLER RESET (FRESET): FRESET is a reset for the FDC. When FRESET = 1, the FDC is reset (i.e., all programming and current state information is lost). FRESET = 1 has the same effect on the FDC as a hard reset (asserting the RSTDRV signal). When resetting the FDC via this configuration bit, the software must toggle this bit and ensure that the minimum reset active time (FRESET = 1) of 1.13 µs is met.
1     FLOPPY DISK CONTROLLER IDLE STATUS (FIDLE): When the FDC is in the idle state, this bit is set to 1 by the 82091AA hardware. In the idle state the FDC's Main Status Register (MSR) = 80h, IRQ6 = inactive, and the head unload timer has expired. When the FDC exits its idle state, this bit is set to 0. This bit is read only.
0     FLOPPY DISK CONTROLLER POWERDOWN (FDPDN): When FDPDN is set to 1, the FDC is placed in direct powerdown. Once in powerdown, the following procedure should be used to bring the FDC out of powerdown:
      • Write this bit low.
      • Apply a hardware reset (via bit 2 of this register) or a software reset (via either bit 2 of the FDC's DOR or bit 7 of the FDC's DSR).
      NOTE: A hard reset via the RSTDRV pin also removes the FDC powerdown.

4.1.8 PCFG1-PARALLEL PORT CONFIGURATION REGISTER

Index Address:   20h
Default Value:   000R 0000
Attribute:       Read/Write
Size:            8 bits

The PCFG1 Register enables/disables the parallel port, selects the parallel port address, and selects the parallel port interrupt. This register also selects the hardware operation mode for the parallel port.

Figure 11. Parallel Port Configuration Register

NOTES:
*Default shown is for SWMB and SWAI configuration modes. For HWB and HWE modes, the default is determined by hardware configuration options as described in Section 4.2, Hardware Configuration.
**Default shown is for SWMB, SWAI, and HWB configuration modes. For HWE mode, the default is determined by hardware configuration options as described in Section 4.2, Hardware Configuration.

Bit   Description
7     PARALLEL PORT FIFO THRESHOLD SELECT (PTHRSEL): This bit controls the FIFO threshold and only affects parallel port operations when the parallel port is in ECP mode or ISA-Compatible FIFO mode. When PTHRSEL = 1, the FIFO threshold is 1 in the forward direction and 15 in the reverse direction. When PTHRSEL = 0, the FIFO threshold is 8 in both directions.
This bit can only be programmed when the parallel port is in ISA-Compatible or PS/2-Compatible mode. These modes can be selected via bits[6:5] of this register or the ECP Extended Control Register (ECR).
      NOTE: In the reverse direction, a threshold of 15/8 means that a request (DMA or Interrupt is enabled) is generated when 15/8 bytes are in the FIFO. In the forward direction, a threshold of 1/8 means that a request is generated when 1/8 byte locations are available.
6:5   PARALLEL PORT HARDWARE MODE SELECT (PPHMOD): This field selects the parallel port hardware mode. The ISA-Compatible mode is for compatibility and nibble mode peripheral interface protocols. The PS/2-Compatible mode is for the byte mode peripheral interface protocol. The EPP and ECP modes are for the EPP and ECP mode peripheral interface protocols, respectively. This field can be configured by strapping options at powerup for hardware extended configuration (HWE) mode only. For all other hardware configuration modes (SWMB, SWAI, and HWB), the default is 00 (ISA-Compatible).

      Bits[6:5]   Read               Write
      00          ISA-Compatible     ISA-Compatible(1)
      01          PS/2-Compatible    PS/2-Compatible(1)
      10          EPP                EPP(1,3)
      11          ECP(2)             Reserved; do not write(2)

      NOTES:
      1. ISA-Compatible, PS/2-Compatible, and EPP modes are selected via this field or hardware configuration. In addition, ISA-Compatible and PS/2-Compatible modes can be selected via the ECP Extended Control Register (ECR). When the ECR is programmed for one of these two modes (ECR[7:5] = 000, 001), this field is updated to match the selected mode.
      2. ECP mode cannot be entered by programming this field. ECP mode can only be selected through the ECR. When the ECR is programmed for ECP mode, the 82091AA sets this field to 11.
      3. Parallel port interface signals controlled by the PCON Register (SELECTIN#, INIT#, AUTOFD#, and STROBE#) should be negated before entering EPP mode.
4     RESERVED
3     PARALLEL PORT IRQ SELECT (PIRQSEL): When PIRQSEL = 1, IRQ7 is selected as the parallel port interrupt. When PIRQSEL = 0, IRQ5 is selected as the parallel port interrupt. This field can be configured by strapping options at powerup for HWB and HWE modes only. For all other hardware configuration modes (SWMB and SWAI), the default is 0 (IRQ5).
2:1   PARALLEL PORT ADDRESS SELECT (PADS): This field selects the address for the parallel port as follows:

      Bits[2:1]   Address     Parallel Port Hardware Mode
      00          378-37F     All
      01          278-27F     All
      10          3BC-3BE     All except EPP
      11          Reserved    None, do not write

      This field can be configured by strapping options at powerup for HWB and HWE modes only. For all other hardware configuration modes (SWMB and SWAI), the default is 00 (378h-37Fh). Note that the SWMB and SWAI default settings for PIRQSEL (bit 3) and PADS (bits[2:1]) do not match a standard PC/AT* combination for address assignment and interrupt setting. However, for SWMB and SWAI, the parallel port defaults to a disabled condition and this register must be programmed to enable the parallel port (i.e., bit 0 set to 1). At this time, the selections for interrupt and address assignments should be made.
0     PARALLEL PORT ENABLE (PEN): When PEN = 0, the parallel port is disabled. When PEN = 1, the parallel port is enabled. This bit can be configured by hardware strapping options at powerup for HWB and HWE modes only. For all other hardware configuration modes (SWMB and SWAI), the default is 0 (disabled). Note that when the parallel port is disabled, IRQ[7,5] and PPDREQ are tri-stated.

4.1.9 PCFG2-PARALLEL PORT POWER MANAGEMENT AND STATUS REGISTER

Index Address:   21h
Default Value:   RR0R 0000
Attribute:       Read/Write
Size:            8 bits

This register enables/disables parallel port auto powerdown and can place the parallel port into a powerdown mode directly.
The register also provides parallel port idle status, resets the parallel port, and reports FIFO underrun or overrun errors.

*Other brands and names are the property of their respective owners.

Figure 12. Parallel Port Power Management and Status Register

Bit   Description
7:6   RESERVED
5     PARALLEL PORT FIFO ERROR STATUS (PFERR): When PFERR = 1, a FIFO underrun or overrun condition has occurred. This bit is read only. Setting PRESET to 1 clears this bit to 0.
4     RESERVED
3     PARALLEL PORT AUTO POWERDOWN ENABLE (PAPDN): When PAPDN = 1, the parallel port can enter auto powerdown if the required auto powerdown conditions are met. When PAPDN = 0, auto powerdown is disabled.
2     PARALLEL PORT RESET (PRESET): When PRESET is set to 1, the parallel port is reset (i.e., all programming and current state information is lost). This is the same state the module would be in after a hard reset (RSTDRV asserted) to the 82091AA. When resetting the parallel port via this configuration bit, the software must toggle this bit and ensure that the minimum reset active time (PRESET = 1) of 1.13 µs is met.
1     PARALLEL PORT IDLE STATUS (PIDLE): This bit reflects the idle state of the parallel port. When the parallel port is in an idle state (i.e., when the same conditions are met that apply to entering auto powerdown), the 82091AA sets this bit to 1. The parallel port idle state is defined as the FIFO empty and no activity on the parallel port interface. This bit is read only.
0     PARALLEL PORT DIRECT POWERDOWN (PDPDN): When PDPDN is set to 1, the parallel port enters direct powerdown. When PDPDN is set to 0, the parallel port is not in direct powerdown.
Note that a parallel port module reset (PRESET bit in this register) also brings the parallel port out of the direct powerdown state.

4.1.10 SACFG1-SERIAL PORT A CONFIGURATION REGISTER

Index Address:  30h
Default Value:  0RR0 0000
Attribute:      Read/Write
Size:           8 bits

The SACFG1 register enables/disables Serial Port A, selects the Serial Port A address range, and selects between IRQ3 and IRQ4 as the Serial Port A interrupt. This register also selects the appropriate clock frequency for use with MIDI.

NOTES:
1. Through programming of this register and the SBCFG1 Register, the 82091AA permits serial ports A and B to be configured for the same interrupt assignment. However, software must take care in responding to interrupts correctly.
2. It is possible to enable and assign both serial ports to the same address through software. In this configuration, the 82091AA disables serial port B, but does not set serial port B into its powerdown condition. Although this is a safe configuration for the 82091AA, it is not power conservative and is not recommended.

Figure 13. Serial Port A Configuration Register
     Bit 7      MIDI Clock Enable for Serial Port A (R/W): 1=2 MHz Clock for Serial Port A (used for generating MIDI baud rate), 0=1.8462 MHz Clock for Serial Port A
     Bits 6:5   Reserved
     Bit 4      Serial Port A IRQ Select (R/W): 1=IRQ4, 0=IRQ3
     Bits 3:1   Serial Port A Address Select (R/W) (ISA Address Range): 000=3F8-3FFh, 001=2F8-2FFh, 010=220-227h, 011=228-22Fh, 100=238-23Fh, 101=2E8-2EFh, 110=338-33Fh, 111=3E8-3EFh
     Bit 0      Serial Port A Enable (R/W): 1=Enable, 0=Disable

NOTE: *Default shown is for SWMB and SWAI hardware configuration modes. For HWB and HWE modes, the default is determined by hardware strapping options as described in Section 4.2, Hardware Configuration.

Bit  Description
7    MIDI CLOCK FOR SERIAL PORT A ENABLE (SAMIDI): When SAMIDI = 1, the clock into Serial Port A is changed from 1.8462 MHz to 2 MHz. The 2 MHz clock is needed to generate the MIDI baud rate. When SAMIDI = 0, the clock frequency is 1.8462 MHz.
6:5  RESERVED
4    SERIAL PORT A IRQ SELECT (SAIRQSEL): When SAIRQSEL = 0, IRQ3 is selected for the Serial Port A interrupt. When SAIRQSEL = 1, IRQ4 is selected for the Serial Port A interrupt. This bit can be configured by strapping options at powerup for HWB and HWE modes only. For SWMB and SWAI hardware configuration modes, the default is 0 (IRQ3). Note that, while the default address and IRQ assignments for SWMB and SWAI modes are the same for both serial ports, the serial ports are disabled and programming of this register is required for operation.
3:1  SERIAL PORT A ADDRESS SELECT (SAADS): This field selects the ISA address range for Serial Port A as follows:

     Bits[3:1]   ISA Address Range
     000         3F8-3FFh
     001         2F8-2FFh
     010         220-227h
     011         228-22Fh
     100         238-23Fh
     101         2E8-2EFh
     110         338-33Fh
     111         3E8-3EFh

     This field can be configured by strapping options at powerup for HWB and HWE modes only. For SWMB and SWAI hardware configuration modes, the default is 000 (3F8-3FFh). Note that, while the default address and IRQ assignments for SWMB and SWAI modes are the same for both serial ports, the serial ports are disabled and programming of this register is required for operation.
0    SERIAL PORT A ENABLE (SAEN): When SAEN = 1, Serial Port A is enabled. When SAEN = 0, Serial Port A is disabled. This bit can be configured by strapping options at powerup for HWB and HWE modes only. For SWMB and SWAI hardware configuration modes, the default is 0 (disabled).
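As a reading aid, the SACFG1 bit fields described above can be modeled in a few lines of Python. This is only a sketch: the names `decode_sacfg1`, `encode_sacfg1`, and `SAADS_RANGES` are hypothetical and not part of any Intel software, the code only packs and unpacks an 8-bit value without touching hardware, and writing 0 to the reserved bits 6:5 is an assumption.

```python
# Illustrative model of the SACFG1 register layout (index 30h).
# All names here are hypothetical; bit positions follow the bit
# description table for SACFG1.

# SAADS field (bits 3:1) -> (first, last) ISA I/O address of the range.
SAADS_RANGES = {
    0b000: (0x3F8, 0x3FF),
    0b001: (0x2F8, 0x2FF),
    0b010: (0x220, 0x227),
    0b011: (0x228, 0x22F),
    0b100: (0x238, 0x23F),
    0b101: (0x2E8, 0x2EF),
    0b110: (0x338, 0x33F),
    0b111: (0x3E8, 0x3EF),
}


def decode_sacfg1(value):
    """Split an 8-bit SACFG1 value into its documented fields."""
    return {
        "midi_clock": bool(value & 0x80),                    # bit 7, SAMIDI
        "irq": 4 if value & 0x10 else 3,                     # bit 4, SAIRQSEL
        "address_range": SAADS_RANGES[(value >> 1) & 0x07],  # bits 3:1, SAADS
        "enabled": bool(value & 0x01),                       # bit 0, SAEN
    }


def encode_sacfg1(enabled, address_select, irq, midi_clock=False):
    """Build an SACFG1 value; reserved bits 6:5 are left at 0 (assumption)."""
    if address_select not in SAADS_RANGES:
        raise ValueError("address_select must be in the range 0..7")
    if irq not in (3, 4):
        raise ValueError("Serial Port A can use only IRQ3 or IRQ4")
    value = (address_select << 1) | (0x01 if enabled else 0x00)
    if irq == 4:
        value |= 0x10
    if midi_clock:
        value |= 0x80
    return value
```

For example, `encode_sacfg1(True, 0b001, 4)` yields 13h, which corresponds to Serial Port A enabled at 2F8-2FFh on IRQ4. The same layout applies to SBCFG1 at index 40h.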
4.1.11 SACFG2-SERIAL PORT A POWER MANAGEMENT AND STATUS REGISTER

Index Address:  31h
Default Value:  RRR0 00U0
Attribute:      Read/Write
Size:           8 bits

This register enables/disables the Serial Port A module auto powerdown and can place the module into a direct powerdown mode. The register also provides Serial Port A idle status, resets the Serial Port A module, and places Serial Port A into test mode.

Figure 14. Serial Port A Power Management and Status Register
     Bits 7:5   Reserved
     Bit 4      Serial Port A Test Mode (R/W): 1=Enable, 0=Disable
     Bit 3      Serial Port A Auto Powerdown Enable (R/W): 1=Enable, 0=Disable
     Bit 2      Serial Port A Reset (R/W): 1=Reset Active, 0=Reset Inactive
     Bit 1      Serial Port A Idle Status (RO): 1=Idle, 0=Active
     Bit 0      Serial Port A Direct Powerdown (R/W): 1=Enable, 0=Disable
NOTE: U = Undefined

Bit  Description
7:5  RESERVED
4    SERIAL PORT A TEST MODE (SATEST): The serial port test mode provides user access to the output of the baud out generator. When SATEST = 1 (and the DLAB bit is 1 in the LCR), the Serial Port A test mode is enabled and the baud rate clock is output on the SOUTA pin (Figure 15). When SATEST = 0, the Serial Port A test mode is disabled.

Figure 15. Test Mode Output (SOUTA and SOUTB)
[Timing diagram: CLK and the BAUDOUT waveforms for divisors ÷1, ÷2, ÷3, and ÷N (N>3)]

3    SERIAL PORT A AUTO POWER DOWN ENABLE (SAAPDN): This bit enables/disables auto powerdown. When SAAPDN = 1, Serial Port A can enter auto powerdown if the required conditions are met. The required conditions are that the transmit and receive FIFOs are empty and the timeout counter has expired. When SAAPDN = 0, auto powerdown is disabled.
2    SERIAL PORT A RESET (SARESET): When SARESET = 1, the Serial Port A module is reset (i.e., all programming and current state information is lost).
This is the same state the module would be in after a hard reset (RSTDRV asserted). When resetting the serial port via this configuration bit, the software must toggle this bit and ensure the reset active time (SARESET = 1) of 1.13 µs minimum is met.
1    SERIAL PORT A IDLE STATUS (SAIDLE): When Serial Port A is in an idle state the 82091AA sets this bit to 1. Serial Port A is in the idle state when the transmit and receive FIFOs are empty and the timeout counter has expired. Note that these are the same conditions that apply to entering auto powerdown. When serial port A is not in an idle state, the 82091AA sets this bit to 0. Direct powerdown does not affect this bit and in auto powerdown SAIDLE is only set to a 1 if the receive and transmit FIFOs are empty. This bit is read only. During a hard reset (RSTDRV asserted), the 82091AA sets SAIDLE to 0. However, because the serial port is typically initialized by software before the idle conditions are met, the default state is shown as undefined.
0    SERIAL PORT A DIRECT POWERDOWN (SADPDN): When SADPDN = 1, Serial Port A is placed in direct powerdown mode. Setting this bit to 0 brings Serial Port A out of direct powerdown mode. Setting bit 2 (SARESET) of this register to 1 will also bring Serial Port A out of the direct powerdown mode.
     NOTE: Direct powerdown resets the receiver and transmitter portions of the serial port including the receive and transmit FIFOs. To ensure that the resetting of the FIFOs does not cause data loss, the SAIDLE bit should be 1 before placing the serial port into direct powerdown.

4.1.12 SBCFG1-SERIAL PORT B CONFIGURATION REGISTER

Index Address:  40h
Default Value:  0RR0 0000
Attribute:      Read/Write
Size:           8 bits

The SBCFG1 register enables/disables Serial Port B, selects the Serial Port B address range, and selects between IRQ3 and IRQ4 as the Serial Port B interrupt. This register also selects the appropriate clock frequency for use with MIDI.

NOTES: 1.
Through programming of this register and the SACFG1 Register, the 82091AA permits serial ports A and B to be configured for the same interrupt assignment. However, software must take care in responding to interrupts correctly.
2. It is possible to enable and assign both serial ports to the same address through software. In this configuration, the 82091AA disables serial port B, but does not set serial port B into its powerdown condition. Although this is a safe configuration for the 82091AA, it is not power conservative and is not recommended.

Figure 16. Serial Port B Configuration Register
     Bit 7      MIDI Clock Enable for Serial Port B (R/W): 1=2 MHz Clock for Serial Port B (used for generating MIDI baud rate), 0=1.8462 MHz Clock for Serial Port B
     Bits 6:5   Reserved
     Bit 4      Serial Port B IRQ Select (R/W): 1=IRQ4, 0=IRQ3
     Bits 3:1   Serial Port B Address Select (R/W) (ISA Address Range): 000=3F8-3FFh, 001=2F8-2FFh, 010=220-227h, 011=228-22Fh, 100=238-23Fh, 101=2E8-2EFh, 110=338-33Fh, 111=3E8-3EFh
     Bit 0      Serial Port B Enable (R/W): 1=Enable, 0=Disable

NOTE: *Default shown is for SWMB and SWAI hardware configuration modes. For HWB and HWE modes, the default is determined by hardware strapping options as described in Section 4.2, Hardware Configuration.

Bit  Description
7    MIDI CLOCK FOR SERIAL PORT B ENABLE (SBMIDI): When SBMIDI = 1, the clock into Serial Port B is changed from 1.8462 MHz to 2 MHz. The 2 MHz clock is needed to generate the MIDI baud rate. When SBMIDI = 0, the clock frequency is 1.8462 MHz. The default value is 0.
6:5  RESERVED
4    SERIAL PORT B IRQ SELECT (SBIRQSEL): When SBIRQSEL = 0, IRQ3 is selected for the Serial Port B interrupt. When SBIRQSEL = 1, IRQ4 is selected for the Serial Port B interrupt. The default value is 0. This bit can be configured by strapping options at powerup for HWB and HWE modes only. For SWMB and SWAI configuration modes, the default is 0 (IRQ3).
Note that, while the default address and IRQ assignments for SWMB and SWAI modes are the same for both serial ports, the serial ports are disabled and programming of this register is required for operation.
3:1  SERIAL PORT B ADDRESS SELECT (SBADS): This field selects the ISA address range for Serial Port B as follows:

     Bits[3:1]   ISA Address Range
     000         3F8-3FFh
     001         2F8-2FFh
     010         220-227h
     011         228-22Fh
     100         238-23Fh
     101         2E8-2EFh
     110         338-33Fh
     111         3E8-3EFh

     This field can be configured by strapping options at powerup for HWB and HWE modes only. For SWMB and SWAI configuration modes, the default is 000 (3F8-3FFh). Note that, while the default address and IRQ assignments for SWMB and SWAI modes are the same for both serial ports, the serial ports are disabled and programming of this register is required for operation.
0    SERIAL PORT B ENABLE (SBEN): When SBEN = 1, Serial Port B is enabled. When SBEN = 0, Serial Port B is disabled. This bit can be configured by strapping options at powerup for HWB and HWE modes only. For SWMB and SWAI configuration modes, the default is 0 (disabled).

4.1.13 SBCFG2-SERIAL PORT B POWER MANAGEMENT AND STATUS REGISTER

Index Address:  41h
Default Value:  RRR0 00U0
Attribute:      Read/Write
Size:           8 bits

This register enables/disables the Serial Port B module auto powerdown and can place the module into a powerdown mode directly. The register also provides Serial Port B idle status, resets the Serial Port B module, and enables/disables Serial Port B test mode.

Figure 17. Serial Port B Power Management and Status Register
     Bits 7:5   Reserved
     Bit 4      Serial Port B Test Mode (R/W): 1=Enable, 0=Disable
     Bit 3      Serial Port B Auto Powerdown Enable (R/W): 1=Enable, 0=Disable
     Bit 2      Serial Port B Reset (R/W): 1=Reset Active, 0=Reset Inactive
     Bit 1      Serial Port B Idle Status (RO): 1=Idle, 0=Active
     Bit 0      Serial Port B Direct Powerdown (R/W): 1=Enable, 0=Disable
NOTE: U = Undefined

Bit  Description
7:5  RESERVED
4    SERIAL PORT B TEST MODE (SBTEST): The serial port test mode provides user access to the output of the baud out generator. When SBTEST = 1 (and the DLAB bit is 1 in the LCR), the Serial Port B test mode is enabled and the baud rate clock is output on the SOUTB pin (Figure 15). When SBTEST = 0, the Serial Port B test mode is disabled.
3    SERIAL PORT B AUTO POWERDOWN ENABLE (SBAPDN): This bit enables/disables auto powerdown. When SBAPDN = 1, Serial Port B can enter auto powerdown if the required conditions are met. The required conditions are that the transmit and receive FIFOs are empty and the timeout counter has expired. When SBAPDN = 0, auto powerdown is disabled.
2    SERIAL PORT B RESET (SBRESET): When SBRESET = 1, Serial Port B is reset (i.e., all programming and current state information is lost). This is the same state the module would be in after a hard reset (RSTDRV asserted). When resetting the serial port via this configuration bit, the software must toggle this bit and ensure the reset active time (SBRESET = 1) of 1.13 µs minimum is met.
1    SERIAL PORT B IDLE STATUS (SBIDLE): When Serial Port B is in an idle state the 82091AA sets this bit to 1. Serial Port B is in the idle state when the transmit and receive FIFOs are empty and the timeout counter has expired. Note that these are the same conditions that apply to entering auto powerdown. When serial port B is not in an idle state, the 82091AA sets this bit to 0. Direct powerdown does not affect this bit and in auto powerdown, this bit is only set to a 1 if the receive and transmit FIFOs are empty. This bit is read only. During a hard reset (RSTDRV asserted), the 82091AA sets this bit to 0. However, because the serial port is typically initialized by software before the idle conditions are met, the default state is shown as undefined.
0    SERIAL PORT B DIRECT POWER DOWN (SBDPDN): When SBDPDN = 1, Serial Port B is placed in direct powerdown mode. Setting this bit to 0 brings the module out of direct powerdown mode. Setting bit 2 (SBRESET) of this register to 1 will also bring Serial Port B out of the direct powerdown mode.
     NOTE: Direct powerdown resets the receiver and transmitter portions of the serial port including the receive and transmit FIFOs. To ensure that the resetting of the FIFOs does not cause data loss, the SBIDLE bit should be 1 before placing the serial port into direct powerdown.

4.1.14 IDECFG-IDE CONFIGURATION REGISTER

Index Address:  50h
Default Value:  RRRR R001
Attribute:      Read/Write
Size:           8 bits

The IDECFG Register sets up the 82091AA IDE interface. This register enables the IDE interface and selects the address for accessing the IDE.

Figure 18. IDE Configuration Register
     Bits 7:3   Reserved
     Bit 2      IDE Dual Interface Select (R/W): 1=Primary and Secondary Addresses Selected, 0=Dual Interface Disabled
     Bit 1      IDE Address Select (R/W): 1=Secondary IDE Address (170-177, 376, 377), 0=Primary IDE Address (1F0-1F7, 3F6, 3F7)
     Bit 0      IDE Interface Enable (R/W): 1=Enable, 0=Disable

NOTES:
*  Default shown is for SWMB and SWAI configuration modes. For HWB and HWE hardware configuration modes, the default is determined by hardware strapping options as described in Section 4.2, Hardware Configuration.
** Not hardware configurable.

Bit  Description
7:3  RESERVED
2    IDE DUAL SELECT (IDUAL): When IDUAL = 0, the IDE address selection is determined by the IADS bit. When IDUAL = 1, both the primary and secondary IDE addresses are selected and the setting of the IADS bit does not affect IDE address selection.
1    IDE ADDRESS SELECT (IADS): When IADS = 0, the primary IDE address is selected (1F0h-1F7h, 3F6h, 3F7h). When IADS = 1, the secondary IDE address is selected (170h-177h, 376h, 377h).
For all hardware configuration modes (SWMB, SWAI, HWB, and HWE), the default is determined by signal pin strapping options.
0    IDE INTERFACE ENABLE (IEN): When IEN = 0, the IDE interface is disabled (i.e., the IDE chip selects (IDECS[1:0]#), DEN#, and HEN# are negated (remain inactive) for accesses to the IDE primary and secondary addresses). When IEN = 1, the IDE interface is enabled. For all hardware configuration modes (SWMB, SWAI, HWB, and HWE), the default is determined by signal pin strapping options.

4.2 Hardware Configuration

Hardware configuration provides a mechanism for configuring certain 82091AA operations at powerup. Four hardware configuration modes provide different levels of configuration depending on the type of application and the degree of hardware/software configuration desired. The hardware configuration modes are:
• Software Motherboard (SWMB)
• Software Add-In (SWAI)
• Hardware Extended (HWE)
• Hardware Basic (HWB)

These modes support a variety of system implementations. For example, with Hardware Basic (HWB) and Hardware Extended (HWE) modes, an extensive set of 82091AA configuration options is available for setting up the 82091AA at powerup. This permits the 82091AA to be used in systems without 82091AA software drivers. For many of these systems, access to the 82091AA configuration registers may not be necessary. As such, access to these registers can be disabled via hardware configuration. This option could be used to prevent software from inadvertently re-configuring the 82091AA.

NOTE: If the 82091AA is configured in HWB or HWE configuration mode at powerup, and reconfiguration with software is desired, the 82091AA configuration mode must first be changed to SWAI configuration mode by writing the AIPCFG1 register. The 82091AA can then remain in SWAI configuration mode to accommodate software programmable configuration changes as desired.
Software Motherboard (SWMB) and Software Add-In (SWAI) modes provide a minimum hardware configuration in systems where software/firmware drivers are used for configuration. Because access to the 82091AA configuration registers after powerup/hardware configuration is needed, the SWMB and SWAI modes do not provide a way to disable access to these registers (i.e., the strapping of the HEN# signal has no effect).

The desired hardware configuration mode and options within the mode are selected by strapping certain 82091AA signal pins at powerup. These signal pins are sampled when the 82091AA receives a hard reset (via RSTDRV). This section describes how to select the configuration mode and options within the mode. The section also provides example hardware connection diagrams for the different modes.

4.2.1 SELECTING THE HARDWARE CONFIGURATION MODE

During powerup or a hard reset, four signal pins (DEN#, PPDIR/GCS#, DTRA#, and HEN#) select the hardware configuration mode, I/O address assignment for the 82091AA configuration registers, and whether software access to these configuration registers is permitted. The following mnemonics and signal pins are assigned for these functions:

CFGMOD[1,0]  Hardware Configuration Mode. The 82091AA samples the CFGMOD0 (DEN#) and CFGMOD1 (PPDIR/GCS#) signal pins to select one of the four hardware configuration modes as shown in Table 6.
CFGADS       82091AA Configuration Register Address Assignment. The 82091AA samples the DTRA# signal (CFGADS function) to determine the address assignment of the 82091AA configuration registers as shown in Table 6. CFGADS works in conjunction with CFGDIS. Note that the 82091AA configuration register address assignment for Hardware Basic mode is not selectable.
CFGDIS       82091AA Configuration Register Disable. The 82091AA samples CFGDIS (HEN# signal) to enable/disable access to the 82091AA configuration registers as shown in Table 6. Note that CFGDIS only affects the HWE and HWB modes.
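The way these three mnemonics combine can be sketched as a small lookup, shown here in Python purely as a reading aid for Table 6. The function name `decode_config_straps` is hypothetical, the arguments are the sampled logic values in the sense Table 6 lists them (not electrical pin levels), and the SWMB address pair 22h/23h for CFGADS = 0 is taken from Section 4.2.7.

```python
# Illustrative decode of the Table 6 strapping combinations.
# The function name is hypothetical; inputs are the sampled logic values
# (0 or 1) as Table 6 lists them.

def decode_config_straps(cfgdis, cfgmod1, cfgmod0, cfgads):
    """Return (mode, (index_addr, target_addr)), or (mode, None) when
    access to the configuration registers is disabled."""
    if cfgmod1 == 0:
        # Software modes: CFGDIS is a don't-care, access is always enabled.
        if cfgmod0 == 0:
            return ("SWMB", (0x24, 0x25) if cfgads else (0x22, 0x23))
        return ("SWAI", (0x398, 0x399) if cfgads else (0x26E, 0x26F))
    if cfgmod0 == 0:
        # Hardware Extended: CFGDIS can disable register access.
        if cfgdis:
            return ("HWE", None)
        return ("HWE", (0x398, 0x399) if cfgads else (0x26E, 0x26F))
    # Hardware Basic: the address is not selectable, CFGADS does not apply.
    return ("HWB", None if cfgdis else (0x398, 0x399))
```

For instance, `decode_config_straps(0, 1, 0, 1)` returns `("HWE", (0x398, 0x399))`, that is, Hardware Extended mode with the INDEX/TARGET pair at 398h/399h.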
NOTE: For Extended Hardware Configuration, the time immediately following the RSTDRV pulse is required to complete configuration. If IORC#/IOWC# are asserted during this time, IOCHRDY will be negated (wait-states inserted) until the 82091AA configuration time expires.

Table 6. AIP Configuration Mode Register Address Assignment

CFGDIS    CFGMOD1    CFGMOD0    CFGADS     Configuration    Configuration Register
(HEN#)    (PPDIR)    (DEN#)     (DTRA#)    Mode             ISA Address (INDEX/TARGET)
X         0          0          0          SWMB             22h/23h
X         0          0          1          SWMB             24h/25h
X         0          1          0          SWAI             26Eh/26Fh
X         0          1          1          SWAI             398h/399h
0         1          0          0          HWE              26Eh/26Fh
0         1          0          1          HWE              398h/399h
1         1          0          X          HWE              Access Disabled
0         1          1          n/a        HWB              398h/399h
1         1          1          n/a        HWB              Access Disabled

4.2.2 SELECTING HARDWARE CONFIGURATION MODE OPTIONS

Within each hardware configuration mode, a number of options are available. For the HWB and HWE hardware configuration modes, the user can enable/disable the floppy disk controller and the IDE interface via the IDE chip select pins (see Table 7). If enabled, these signal pins also select the address assignment. For SWMB and SWAI configuration modes, these signal pins have no effect.

Table 7. FDC and IDE Enable/Disable

DDCFG1       DDCFG0       Floppy Disk Controller            IDE
(IDECS1#)    (IDECS0#)
0            0            Disable                           Disable
0            1            Enabled (3F0-3F7h; Primary)       Disable
1            0            Enabled (370-377h; Secondary)     Enabled (170-177h; Secondary)
1            1            Enabled (3F0-3F7h; Primary)       Enabled (1F0-1F7h; Primary)

The 82091AA provides additional hardware configuration options through the SOUTA, SOUTB, RTSA#, RTSB#, DTRA#, and DTRB# signal pins as shown in Table 8. In the case of the Hardware Extended Mode, the 82091AA samples the signal pins at two different times (once for HWEa options and again for HWEb options). The timing for signal sampling is discussed in Section 4.2.3, Hardware Configuration Timing Relationships.
The options provide configuration of the serial ports, floppy disk controller, parallel port, IDE interface, 82091AA operating power supply voltage, 82091AA clock frequency, and address assignment for the 82091AA configuration registers. Table 8 provides a matrix of the options available for each hardware configuration mode. The configuration options are selected as shown in Table 8 through Table 14.

Note that for the SWAI and SWMB modes, the selection of the operating frequency (CLKSEL), power supply voltage level (VSEL), and 82091AA configuration register address assignment (CFGADS) are the only hardware configuration options (Table 8). In these modes, software/firmware provides the remainder of the 82091AA configuration by programming the 82091AA configuration registers (see Section 4.1, Configuration Registers). For the SWAI and SWMB modes, the 82091AA modules are placed in the following states after powerup or a hard reset:
• Serial ports disabled
• Parallel port disabled
• FDC enabled for two drives (primary address)
• IDE enabled (primary address)

Table 8. Hardware Configuration Mode Option Matrix

               Basic Hardware    Extended Hardware          Software Add-In    Software Motherboard
               Configuration     Configuration              Configuration      Configuration
Signal Name    HWB               HWEa         HWEb          SWAI               SWMB
SOUTA          SPCFG0            CLKSEL(3)    SPCFG0        CLKSEL(3)          CLKSEL(3)
SOUTB          SPCFG1            PPMOD0       SPCFG1        -                  -
RTSA#          SPCFG2            PPMOD1       SPCFG2        -                  -
RTSB#          SPCFG3            FDDQTY       SPCFG3        -                  -
DTRA#          PPCFG0            CFGADS       PPCFG0        CFGADS             CFGADS
DTRB#          PPCFG1            VSEL         PPCFG1        VSEL               VSEL

NOTES:
1. HWEa and HWEb reference the switching banks shown in Figure 22.
2. The following mnemonics are used in the table: SPCFGx = serial port configuration, PPCFGx = parallel port configuration, CLKSEL = clock select, PPMODx = parallel port hardware mode, FDDQTY = floppy disk drive quantity, VSEL = power supply voltage select, CFGADS = 82091AA configuration register address assignment select.
3. Always tie this signal low with a 10K resistor.

Table 9. Serial Port Address and Interrupt Assignments
SPCFG3 (RTSB#) SPCFG2 (RTSA#) SPCFG1 (SOUTB) SPCFG0 (SOUTA) Serial Port B Address Interrupt Assignment. Assignment Serial Port A Address. Assignment 0 0 0 0 Disable - Disable 0 0 0 1 Disable 0 0 1 0 Disable 0 0 1 1 - 3F8-3FFh 2F8-2FFh 3E8-3EFh 0 1 0 0 0 1 0 1 0 1 1 0 Disable 3F8-3FFh 3E8-3EFh 3F8-3FFh 3F8-3FFh 2F8-2FFh 2F8-2FFh 0 1 1 1 1 0 0 0 1 0 0 1 1 0 1 0 Disable 1 0 1 1 1 1 0 0 1 1 0 1 1 1 1 0 1 1 1 1 2F8-2FFh 2E8-2EFh 2E8-2EFh 2E8-2EFh 2E8-2EFh IRQ4 IRQ4 IRQ4 IRQ4(1) IRQ3 IRQ3 IRQ3 IRQ3 IRQ3 IRQ3(1) IRQ3 Interrupt Assignment IRQ4 IRQ3 IRQ4 - Disable Disable IRQ3 IRQ4(1) 2F8-2FFh 3E8-3EFh - Disable IRQ4 IRQ3 IRQ4 3F8-3FFh 2E8-2EFh 3E8-3EFh - Disable IRQ4 IRQ3(1) 3F8-3FFh 2F8-2FFh 3E8-3EFh IRQ4

NOTE:
1. In this configuration, the two serial ports share the same interrupt line. Responding correctly to interrupts generated in this configuration is the exclusive responsibility of software.

Table 10. Parallel Port Address and Interrupt Assignments

PPCFG1      PPCFG0      Parallel Port         Parallel Port
(DTRB#)     (DTRA#)     Address Assignment    Interrupt Assignment
0           0           Disable               -
0           1           378-37Fh              IRQ7
1           0           278-27Fh              IRQ5
1           1           3BC-3BFh              IRQ7

Table 11. Parallel Port Hardware Mode Select

PPMOD1      PPMOD0      Mode
(RTSA#)     (SOUTB)
0           0           ISA-Compatible
0           1           PS/2-Compatible
1           0           EPP
1           1           Reserved

NOTES:
1. PPMODx hardware configuration is effective in HWE mode only.
2. ECP mode is not selectable via hardware configuration.
3. For EPP mode, address assignment must be either 278h or 378h.

Table 12. AIP Clock Select

CLKSEL (SOUTA)
0

NOTE: Always tie this low.

Table 13. AIP Power Supply Voltage

VSEL (DTRB#)    Power Supply Voltage
0               5.0V Operation
1               3.3V Operation

Table 14. Floppy Drive Quantity Select

FDDQTY (RTSB#)    Number of Supported Floppy Drives
0                 2 Floppy Drives
1                 4 Floppy Drives

NOTES:
1. FDDQTY hardware configuration is effective in HWE mode only.
2. Four floppy drive support requires external logic to decode.
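The parallel port strapping in Tables 10 and 11 lends itself to a short worked example. The Python sketch below is illustrative only: `decode_parallel_straps` and the two lookup tables are hypothetical names, the last address row is read as 3BCh-3BFh (an assumption that matches the 3BC address option of the PCFG1 address select field), and the EPP check simply restates note 3 of Table 11.

```python
# Illustrative decode of the parallel port strapping options.
# Names are hypothetical; values follow Tables 10 and 11.

# (PPCFG1, PPCFG0) -> (first address, last address, IRQ); None = disabled.
PPCFG_TABLE = {
    (0, 0): None,
    (0, 1): (0x378, 0x37F, 7),
    (1, 0): (0x278, 0x27F, 5),
    (1, 1): (0x3BC, 0x3BF, 7),  # assumed reading of a damaged table entry
}

# (PPMOD1, PPMOD0) -> hardware mode; None marks the reserved encoding.
PPMOD_TABLE = {
    (0, 0): "ISA-Compatible",
    (0, 1): "PS/2-Compatible",
    (1, 0): "EPP",
    (1, 1): None,
}


def decode_parallel_straps(ppcfg1, ppcfg0, ppmod1=0, ppmod0=0):
    """Return the parallel port setup implied by the strapping values."""
    mode = PPMOD_TABLE[(ppmod1, ppmod0)]
    if mode is None:
        raise ValueError("PPMOD[1:0] = 11 is reserved")
    assignment = PPCFG_TABLE[(ppcfg1, ppcfg0)]
    if assignment is None:
        return {"enabled": False, "mode": mode}
    base, last, irq = assignment
    if mode == "EPP" and base not in (0x278, 0x378):
        # Table 11, note 3: EPP requires address 278h or 378h.
        raise ValueError("EPP mode requires address 278h or 378h")
    return {"enabled": True, "base": base, "last": last,
            "irq": irq, "mode": mode}
```

Since PPMODx is effective in HWE mode only, the two mode arguments default to 0 (ISA-Compatible), which is also the fixed parallel port mode in HWB.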
NOTES (Table 13):
1. VSEL hardware configuration is not available in HWB mode.
2. To operate the 82091AA and all of the interfaces at 5V or 3.3V, both VCC and VCCF are connected to 5V or 3.3V power supplies, respectively. However, in the mixed mode, hardware configuration (VSEL) is set to 3.3V, VCC is connected to 3.3V, and VCCF is connected to 5V.
3. 3.3V operation is available only in the 82091AA.

4.2.3 HARDWARE CONFIGURATION TIMING RELATIONSHIPS

The 82091AA samples all of the hardware configuration signals on the high-to-low transition of RSTDRV. For the HWB, SWMB, and SWAI modes, the 82091AA completes hardware configuration on this sampling (Figure 19). For HWE mode, the 82091AA samples some of the signals twice (Figure 20). The first sampling occurs on the high-to-low transition of RSTDRV. As Figure 22 shows (see Section 4.2.5, Extended Hardware Configuration Mode), the HC367 tri-states its outputs when RSTDRV is negated. This permits the strapping options from the HWEb block to be sampled. A short time after RSTDRV is negated (the time is specified in Section 11.0, Electrical Characteristics), the 82091AA samples the SOUTA, RTSA#, DTRA#, SOUTB, RTSB#, and DTRB# signals.

Figure 19. HWB, SWMB, and SWAI Hardware Configuration Mode Timing
[Timing diagram: RSTDRV and the hardware configuration signals (driven by pullup/pulldown); all hardware configuration signals are sampled, after which the AIP is operational.]

Figure 20. HWE Hardware Configuration Mode Timing
[Timing diagram: RSTDRV and the hardware configuration signals, first driven by the HWE configuration buffer and then by pullup/pulldown, after which the AIP is operational. Signal group labels in the figure: HWEa, CFGMOD[1,0], VSEL, CFGDIS, CLKSEL, DDCFG[1,0], CFGADS, HWEb, SPCFG[3:0], PPCFG[1,0], FDDQTY.]

4.2.4 HARDWARE BASIC CONFIGURATION

The Hardware Basic configuration mode permits the user to assign addresses to the serial ports and parallel ports. This is achieved by sampling several of the serial port connections at the end of a hardware reset. The PPDIR/GCS# signal defaults to game port chip select output (GCS#). The 82091AA power supply voltage is not selectable in this mode and is fixed at 5V. The parallel port mode is set to ISA-Compatible. In addition, the FDC floppy drive support is set at two floppy drives. If configuration register access is enabled, the access address is fixed at 398h/399h. To reconfigure the 82091AA using software, the 82091AA configuration mode must be changed to SWAI mode (refer to AIPCFG1 register). Figure 21 shows the implementation of a basic hardware configuration.

Figure 21. Hardware Basic Configuration
[Connection diagram: AIP pins SOUTA, RTSA#, DTRA# (Serial Port A connector), SOUTB, RTSB#, DTRB# (Serial Port B connector), IDECS0#, IDECS1# (IDE signals), DEN#, PPDIR/GCS#, and HEN# strapped through 10K resistors; hard wiring to VCC/VSS is optional. RSTDRV is driven by Reset.]

4.2.5 HARDWARE EXTENDED CONFIGURATION MODE

The Hardware Extended configuration mode provides all of the features of the Hardware Basic configuration mode. Additional features in Hardware Extended configuration permit the user to select the quantity of floppy drives (either 2 or 4 floppy drive support). The 82091AA operating voltage is selectable between 3.3V* and 5V. In addition, the parallel port can be configured to operate in ISA-Compatible, PS/2-Compatible, or EPP modes. Hardware Extended configuration provides these additional hardware configuration options by sampling the pins on the serial ports at two different times. When RSTDRV is asserted, the HC367 drives the values on SOUTA, RTSA#, DTRA#, SOUTB, RTSB#, and DTRB# (Figure 22).
When RSTDRV is negated, the HC367 is disabled and these serial port signals are driven by HWEb pullup/pulldown resistors. The PPDIR/GCS# signal defaults to a game port chip select (GCS#). To reconfigure the 82091AA using software, the 82091AA configuration mode must be changed to SWAI mode (refer to AIPCFG1 register).

NOTE: *3.3V operation is only available in the 82091AA.

Figure 22. Hardware Extended Configuration
[Connection diagram: AIP pins SOUTA, RTSA#, DTRA# (Serial Port A connector), SOUTB, RTSB#, DTRB# (Serial Port B connector), IDECS0#, IDECS1# (IDE signals), DEN#, PPDIR/GCS#, and HEN#. The HWEb bank straps the serial port pins through 10K resistors; the HWEa bank drives them through an HC367 buffer with 4.7K resistors. Hard wiring to VCC/VSS is optional. RSTDRV is driven by Reset.]

4.2.6 SOFTWARE ADD-IN CONFIGURATION

The Software Add-In configuration mode permits the user to assign the address for the 82091AA configuration registers, and select the power supply voltage for the 82091AA. The 82091AA configuration registers are accessible. The registers are located in the ISA Bus I/O address space and can be selected to be at either 398h/399h or 26Eh/26Fh. The PPDIR/GCS# signal defaults to a game port chip select (GCS#).

Figure 23. Software Add-In Configuration
[Connection diagram: AIP pins SOUTA, RTSA#, DTRA# (Serial Port A connector), SOUTB, RTSB#, DTRB# (Serial Port B connector), IDECS0#, IDECS1# (IDE signals), DEN#, PPDIR/GCS#, and HEN# with 10K strapping resistors; hard wiring to VCC/VSS is optional. RSTDRV is driven by Reset.]

4.2.7 SOFTWARE MOTHERBOARD CONFIGURATION

The Software Motherboard configuration mode permits the 82091AA to be located on the motherboard. In this mode, the 82091AA configuration registers are accessible via the X-Bus I/O address space and can be selected to be at either 22h/23h or 24h/25h. In addition, the user selects the power supply voltage for the 82091AA.
The PPDIR/GCS# signal defaults to a Parallel Port Direction Control Output (PPDIR).

Figure 24. Software Motherboard Hardware Configuration
[Connection diagram: AIP pins SOUTA, RTSA#, DTRA# (Serial Port A connector), SOUTB, RTSB#, DTRB# (Serial Port B connector), IDECS0#, IDECS1# (IDE signals), DEN#, PPDIR/GCS#, and HEN# with 10K strapping resistors; hard wiring to VCC/VSS is optional. RSTDRV is driven by Reset.]

5.0 HOST INTERFACE

The 82091AA host interface is an 8-bit direct-drive (24 mA) ISA Bus/X-Bus interface that permits the CPU to access its registers through read/write operations in I/O space. These registers may be accessed by programmed I/O and/or DMA bus cycles. With the exception of the IDE Interface, all functions on the 82091AA require only 8-bit data accesses. The 16-bit access required for the IDE Interface is supported through the appropriate chip selects and data buffer enables from the 82091AA. The 82091AA does not participate in 16-bit IDE DMA transfers.

Although the 82091AA has an ISA/X-Bus host interface, there are a few features that differentiate it from conventional ISA/X-Bus peripherals. These features are as follows:
• Internal, Configurable Chip Select Decode Logic. SA[9:0] allow full decoding of the ISA I/O address space such that the functional modules contained in the 82091AA can be relocated to the desired I/O address. This feature can be used to resolve potential system configuration conflicts.
• IOCHRDY for ISA Cycle Extension. During certain I/O cycles to the parallel port controller in the 82091AA, it is necessary to extend the current bus cycle to match the access time of the device connected to the Parallel Port. The IOCHRDY signal is used by the 82091AA to extend ISA Bus cycles, as needed, according to the ISA protocol. IOCHRDY overrides all other strobes that attempt to shorten the bus cycle.
• NOWS# for 3 BCLK I/O Cycles. All programmed I/O accesses to 82091AA registers can be completed in a total of 3 BCLK cycles. This is possible because the 82091AA register access times have been minimized to allow data transfers to occur with shortened read/write control strobes. As a result, the 82091AA is well suited for use in embedded control designs that use an asynchronous microprocessor interface without any particular reference to ISA cycle timings.
• DMA Transfers. The 82091AA supports DMA compatible, type A, type B, and type F DMA cycles. Some newer system DMA controllers are capable of generating fast DMA cycles (type F) on all DMA channels. If such a controller is used in conjunction with the 82091AA, it will be possible to accomplish a DMA transfer in 2 BCLKs.

The 82091AA ISA data lines (SD[7:0]) can be connected directly to the ISA Bus. If external buffers are used to isolate the SD[7:0] signals from the 240 pF loading of the ISA Bus, the DEN# signal can be used to control the external buffers as shown in Figure 25.

Figure 25. ISA Interface (with Optional Data Buffer)
[Connection diagram: AIP signals SA[9:0], IORC#, IOWC#, RSTDRV, IO16#, DACK#, and DREQ# connected to the corresponding ISA Bus signals (xDACK#, xDREQ# for the DMA pair); SD[7:0] connected to the ISA Bus through an optional buffer. IDECS1#, IDECS0#, DEN#, and HEN# are also shown on the AIP.]

6.0 PARALLEL PORT

The 82091AA parallel port can be configured for four parallel port modes. These parallel port modes and the associated parallel interface protocols are:

Parallel Port Mode       Parallel Interface Protocol
ISA-Compatible Mode      Compatibility, Nibble
PS/2-Compatible Mode     Byte
EPP Mode                 EPP
ECP                      ECP

ISA-Compatible, PS/2-Compatible, and EPP modes are selected through 82091AA configuration (see Section 4.0, AIP Configuration). ECP is selected by programming the ECP Extended Control Register (ECR).

In ISA-Compatible mode, the parallel port exactly emulates a standard ISA-style parallel port. The parallel port data bus (PD[7:0]) is uni-directional. The compatibility protocol transfers data to the peripheral device via PD[7:0] (forward direction).
Note that the Nibble protocol permits data transfers from the peripheral device (reverse direction) by using four peripheral status signal lines to transfer 4 bits of data at a time.

PS/2-Compatible mode differs from ISA-Compatible mode by providing bi-directional transfers on PD[7:0]. A bit is added to the PCON Register to allow software control of the data transfer direction.

For both the ISA-Compatible and PS/2-Compatible modes, the actual data transfer over the parallel port interface is accomplished by software handshake (i.e., automatic hardware handshake is not used). Software controls data transfer by monitoring handshake signal status from the peripheral device via the PSTAT Register and controlling handshake signals to the peripheral device via the PCON Register.

EPP mode provides bi-directional transfers on PD[7:0]. The 82091AA automatically generates the address and data strobes in hardware.

ECP is a high performance peripheral interface mode. This mode uses an asynchronous automatic handshake to transfer data over the parallel port interface. In addition, the parallel port contains a FIFO for transferring data in ECP mode. The ECP register set contains an Extended Control Register (ECR) that provides a wide range of functions including the ability to operate the parallel port in either ECP, ISA-Compatible, or PS/2-Compatible modes.

NOTE: In general, this document describes parallel port operations and functions in terms of how the 82091AA parallel port hardware operates. Detailed descriptions of the parallel interface protocols are beyond the scope of this document. Readers should refer to the proposed IEEE Standard 1284 for detailed descriptions of the Compatibility, Nibble, Byte, EPP, and ECP protocols.

Special circuitry on the 82091AA prevents it from being powered up or being damaged while a parallel port peripheral is powered on and the 82091AA is powered off.
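The software handshake used in ISA-Compatible mode can be sketched as follows. This is a minimal illustration under stated assumptions, not a required sequence: the `inb`/`outb` callables, the `FakePort` test double, and the polling limit are hypothetical names introduced here, and the timing rules (strobe width, data setup) are defined by IEEE 1284 rather than by this sketch. The register offsets and bit polarities are the ones given in the PSTAT and PCON register descriptions (Base + 1h and Base + 2h).

```python
PDATA, PSTAT, PCON = 0, 1, 2   # register offsets from the parallel port base
BUSYS = 0x80                   # PSTAT bit 7: reads 1 when BUSY is negated
STROBEC = 0x01                 # PCON bit 0: 1 asserts STROBE#


def send_byte(inb, outb, base, value, max_polls=1000):
    """Send one byte with a software-driven handshake (illustrative only)."""
    for _ in range(max_polls):
        if inb(base + PSTAT) & BUSYS:      # peripheral is not busy
            break
    else:
        return False                       # peripheral stayed busy
    outb(base + PDATA, value & 0xFF)       # drive the byte onto PD[7:0]
    pcon = inb(base + PCON)                # keep the other PCON bits as read
    outb(base + PCON, pcon | STROBEC)      # assert STROBE#
    outb(base + PCON, pcon & ~STROBEC & 0xFF)  # negate STROBE#
    return True


class FakePort:
    """Hypothetical stand-in for the port plus an attached peripheral."""

    def __init__(self, base):
        self.base = base
        self.regs = [0x00, 0x84, 0x04]     # PDATA, PSTAT (not busy), PCON
        self.received = []

    def inb(self, port):
        return self.regs[port - self.base]

    def outb(self, port, value):
        offset = port - self.base
        rising = (value & STROBEC) and not (self.regs[PCON] & STROBEC)
        if offset == PCON and rising:
            self.received.append(self.regs[PDATA])  # latch data on strobe
        if offset != PSTAT:                # PSTAT is read only
            self.regs[offset] = value


port = FakePort(0x378)
send_byte(port.inb, port.outb, 0x378, 0x41)
```

On real hardware the two callables would wrap whatever port I/O primitive the platform provides; passing them in keeps the handshake logic testable without hardware.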
6.1 Parallel Port Registers

This section is organized into three sub-sections: ISA-Compatible and PS/2-Compatible Modes, EPP Mode, and ECP Mode. Since the register sets are similar for ISA-Compatible and PS/2-Compatible modes (differing by a direction control bit in the PCON Register), the register set descriptions are combined. The EPP mode and ECP mode register sets are described separately. Each register set description contains the I/O address assignment and a complete description of the registers and register bits. Note that the PSTAT and PCON Registers are common to all modes and for completeness are repeated in each sub-section. Any difference in bit operations for a particular mode is noted in that particular register description.

The registers provide parallel port control/status information and data paths for transferring data between the parallel port interface and the 8-bit host interface. All registers are accessed as byte quantities. The base address is determined by hardware configuration at powerup (or a hard reset) or via software configuration by programming the 82091AA configuration registers as described in Section 4.0, AIP Configuration. The parallel port can be disabled or configured for a base address of 378h (all modes), 278h (all modes), or 3BCh (all modes except EPP and ECP). This provides the system designer with the option of using additional parallel ports on add-in cards that have fixed address decoding.

Some of the parallel port registers described in this section contain reserved bits. These bits are labeled "R". Software must deal correctly with fields that are reserved. On reads, software must use appropriate masks to extract the defined bits and not rely on reserved bits being any particular value. On writes, software must ensure that the values of reserved bit positions are preserved. That is, the value of reserved bit positions must first be read, merged with the new values for other bit positions, and then written back.
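The reserved-bit rules above amount to a mask on reads and a read-merge-write on writes. The following sketch shows both steps for an 8-bit register; the function names and the example mask values are illustrative assumptions, not part of the 82091AA documentation.

```python
def extract_defined(value, defined_mask):
    """Mask a register read so reserved bits never influence software."""
    return value & defined_mask


def merge_for_write(current, new_bits, defined_mask):
    """Build a write value: reserved bits come from `current` (as read),
    defined bits come from `new_bits`."""
    preserved = current & ~defined_mask & 0xFF
    return preserved | (new_bits & defined_mask)


# Example: a register whose bits 7:6 are reserved (defined mask 3Fh).
as_read = 0b10000100
to_write = merge_for_write(as_read, 0b00010101, 0x3F)
```

Here `to_write` keeps bit 7 exactly as it was read and takes bits 5:0 from the new value.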
During a hard reset (RSTDRV asserted), the 82091AA registers are set to pre-determined default states. The default values are indicated in the individual register descriptions.

The following nomenclature is used for register access attributes:

RO    Read Only. Note that for registers with read only attributes, writes to the I/O address have no effect on parallel port operations.
R/W   Read/Write. A register with this attribute can be read and written. Note that individual bits in some read/write registers may be read only.

6.1.1 ISA-COMPATIBLE AND PS/2-COMPATIBLE MODES

This section contains the registers used in ISA-Compatible and PS/2-Compatible modes. The I/O address assignment for this register set is shown in Table 15 and the register descriptions are presented in the order that they appear in the table.

Table 15. Parallel Port Registers (ISA-Compatible and PS/2-Compatible)

Address (AEN = 0)   Abbreviation   Register Name      Access
Base + 0h           PDATA          Data Register      R/W
Base + 1h           PSTAT          Status Register    RO
Base + 2h           PCON           Control Register   R/W

NOTE: Parallel port base addresses are 278h, 378h and 3BCh.

6.1.1.1 PDATA-Parallel Port Data Register (ISA-Compatible and PS/2-Compatible Modes)

I/O Address:     Base + 00h
Default Value:   00h
Attribute:       Read/Write
Size:            8 bits

ISA-Compatible Mode

The PDATA Register is a uni-directional data port that transfers 8-bit data from the host to the peripheral device (forward transfer). A write to this register drives the written data onto PD[7:0]. Reads of this register should not be performed in ISA-Compatible mode. For a host read of this address location, the 82091AA completes the handshake on the ISA Bus and the value returned is the last value stored in the PDATA Register.

PS/2-Compatible Mode

The PDATA Register is a bi-directional data port that transfers 8-bit data between the peripheral device and host. The direction of transfer is determined by the DIR# bit in the PCON Register. If DIR# = 0 (forward direction) and the host writes to this register, the data is stored in the PDATA Register and driven onto PD[7:0]. If DIR# = 1 (reverse direction), a host read of this register returns the data on PD[7:0]. Note that read data is not stored in the PDATA Register.

Bit   Description
7:0   PARALLEL PORT DATA: Bits[7:0] correspond to parallel port data lines PD[7:0] and ISA Bus data lines SD[7:0].

6.1.1.2 PSTAT-Status Register (ISA-Compatible and PS/2-Compatible Modes)

I/O Address:     Base + 01h
Default Value:   XXXX X1RR
Attribute:       Read Only
Size:            8 bits

The PSTAT Register provides the status of certain parallel port signals and whether a CPU interrupt has been generated by the parallel port. This register indicates the current state of the BUSY, ACK#, PERROR, SELECT, and FAULT# signals.

Figure 26. Status Register (ISA-Compatible and PS/2-Compatible Modes) (290486-26)

Bit   Default   Field
7     X         BUSY Signal Status (RO): 0=Asserted, 1=Negated
6     X         ACK# Signal Status (RO): 1=Negated, 0=Asserted
5     X         PERROR Signal Status (RO): 1=Asserted, 0=Negated
4     X         SELECT Signal Status (RO): 1=Asserted, 0=Negated
3     X         FAULT# Signal Status (RO): 1=Negated, 0=Asserted
2     1         Not Used (ISA-Compatible Mode) (RO): always 1. IRQ Signal Status (PS/2-Compatible Mode) (RO): 1=Asserted, 0=Negated
1:0   R         Reserved

NOTE: X = Default value is determined by signal state at reset.

Bit   Description
7     BUSY STATUS (BUSYS): This bit indicates the state of the parallel port interface BUSY signal. When BUSY is asserted, BUSYS = 0. When BUSY is negated, BUSYS = 1. This bit is an inverted version of the parallel port BUSY signal.
6     ACK# STATUS (ACKS): This bit indicates the state of the parallel port interface ACK# signal. This bit indicates when the peripheral has received a data byte and is ready for another. When ACK# is asserted, ACKS = 0. When ACK# is negated, ACKS = 1. Note that if interrupts are enabled (via bit 4 of the PCON Register), the assertion of the ACK# signal generates an interrupt to the CPU.
5     PERROR STATUS (PERRS): This bit indicates the state of the parallel port interface PERROR signal. This bit indicates when an error has occurred in the peripheral paper path (e.g., out of paper). When PERROR is asserted, PERRS = 1. When PERROR is negated, PERRS = 0.
4     SELECT STATUS (SELS): This bit indicates the state of the parallel port interface SELECT signal. When the SELECT signal is asserted, SELS = 1. When the SELECT signal is negated, SELS = 0.
3     FAULT# STATUS (FAULTS): This bit indicates the state of the parallel port interface FAULT# signal being driven by the peripheral device. When the FAULT# signal is asserted, FAULTS = 0. When the FAULT# signal is negated, FAULTS = 1.
2     PARALLEL PORT INTERRUPT STATUS (PIRQ): This bit indicates a CPU interrupt by the parallel port. PIRQ indicates that the printer has accepted the previous character and is ready for another. In ISA-Compatible mode, interrupt status is not reported in this register and this bit is always 1. In PS/2-Compatible mode, if interrupts are enabled via the PCON Register and the ACK# signal is asserted (low-to-high transition), PIRQ is set to a 0 (and an IRQ generated to the CPU). The 82091AA sets PIRQ to 1 when this register is read or by a hard reset. If interrupts are disabled via the PCON Register, this bit is never set to 0.
1:0   RESERVED

6.1.1.3 PCON-Control Register (ISA-Compatible and PS/2-Compatible Modes)

I/O Address:     Base + 02h
Default Value:   RR00 0000
Attribute:       Read/Write
Size:            8 bits

The PCON Register controls certain parallel port interface signals and enables/disables parallel port interrupts. This register permits software to control the STROBE#, AUTOFD#, INIT#, and SELECTIN# signals. For PS/2-Compatible mode, this register also controls the direction of transfer on PD[7:0].

Figure 27. Control Register (ISA-Compatible and PS/2-Compatible Modes) (290486-27)

Bit   Default   Field
7:6   R         Reserved
5     0         Reserved (ISA-Compatible Mode) (R/W). Direction (PS/2-Compatible Mode) (R/W): 1=Reverse, 0=Forward
4     0         PP Interrupt Enable (ACK#) (R/W): 1=Enable, 0=Disable
3     0         SELECTIN# Signal (R/W): 1=Assert, 0=Negate
2     0         INIT# Signal (R/W): 1=Negate, 0=Assert
1     0         AUTOFD# Signal (R/W): 1=Assert, 0=Negate
0     0         STROBE# Signal (R/W): 1=Assert, 0=Negate

Bit   Description
7:6   RESERVED
5     RESERVED (ISA-COMPATIBLE MODE): Not used and undefined when read. Writes have no effect on parallel port operations.
      DIRECTION (DIR#) (PS/2-COMPATIBLE MODE): This bit is used to control the direction of data transfer on the parallel port data bus (PD[7:0]). When DIR# = 0, PD[7:0] are outputs. When DIR# = 1, PD[7:0] are inputs.
4     ACK# INTERRUPT ENABLE (ACKINTEN): ACKINTEN enables CPU interrupts (via either IRQ5 or IRQ7) to be generated when the ACK# signal on the parallel port interface is asserted. When ACKINTEN = 1, a CPU interrupt is generated when ACK# is asserted. When ACKINTEN = 0, the ACK# interrupt is disabled.
3     SELECTIN# CONTROL (SELINC): This bit controls the SELECTIN# signal. SELINC is set to 1 to select the printer. When SELINC = 1, the SELECTIN# signal is asserted. When SELINC = 0, the SELECTIN# signal is negated.
2     INIT# CONTROL (INITC): This bit controls the INIT# signal. When INITC = 1, the INIT# signal is negated. When INITC = 0, the INIT# signal is asserted.
1     AUTOFD# CONTROL (AUTOFDC): This bit controls the AUTOFD# signal. AUTOFDC is set to 1 to instruct the printer to advance the paper one line each time a carriage return is received. When AUTOFDC = 1, the AUTOFD# signal is asserted. When AUTOFDC = 0, the AUTOFD# signal is negated.
0     STROBE# CONTROL (STROBEC): This bit controls the STROBE# signal. The STROBE# signal is set active to instruct the printer to accept the character being presented on the data lines. When STROBEC = 1, the STROBE# signal is asserted. When STROBEC = 0, the STROBE# signal is negated.

6.1.2 EPP MODE

This section contains the registers used in EPP mode.
The I/O address assignment for this register set is shown in Table 16 and the register descriptions are presented in the order that they appear in the table.

Table 16. Parallel Port Registers (EPP Mode)

Address (AEN = 0)   Abbreviation   Register Name             Access
Base + 0h           PDATA          Data Register             R/W
Base + 1h           PSTAT          Status Register           RO
Base + 2h           PCON           Control Register          R/W
Base + 3h           ADDSTR         Address Strobe Register   R/W
Base + 4h-7h        DATASTR        Data Strobe Registers     R/W

NOTE: Parallel port base addresses are 278h (LPT2) and 378h (LPT1). Base address 3BCh is not available in EPP or ECP modes.

6.1.2.1 PDATA-Parallel Port Data Register (EPP Mode)

I/O Address:     Base + 00h
Default Value:   00h
Attribute:       Read/Write
Size:            8 bits

The PDATA Register is a bi-directional data port that transfers 8-bit data between the peripheral device and host. The direction of transfer is determined by the DIR# bit in the PCON Register. If DIR# = 0 (forward direction) and the host writes to this register, the data is stored in the PDATA Register and driven onto PD[7:0]. If DIR# = 1 (reverse direction), a host read of this register returns the data on PD[7:0]. However, read data is not stored in the PDATA Register.

Bit   Description
7:0   PARALLEL PORT DATA: Bits[7:0] correspond to parallel port data lines PD[7:0] and ISA Bus data lines.

6.1.2.2 PSTAT-Status Register (EPP Mode)

I/O Address:     Base + 01h
Default Value:   XXXX X1RR
Attribute:       Read Only
Size:            8 bits

The PSTAT Register provides the status of certain parallel port signals. It also indicates whether a CPU interrupt has been generated by the parallel port. This register indicates the current state of the BUSY, ACK#, PERROR, SELECT, and FAULT# signals.

Figure 28. Status Register (EPP Mode) (290486-28)

Bit   Default   Field
7     X         BUSY Signal Status (RO): 0=Asserted, 1=Negated
6     X         ACK# Signal Status (RO): 1=Negated, 0=Asserted
5     X         PERROR Signal Status (RO): 1=Asserted, 0=Negated
4     X         SELECT Signal Status (RO): 1=Asserted, 0=Negated
3     X         FAULT# Signal Status (RO): 1=Negated, 0=Asserted
2     1         IRQ Signal Status (RO): not used in EPP mode, always 1
1:0   R         Reserved

NOTE: X = Default value is determined by signal state at reset.

Bit   Description
7     BUSY STATUS (BUSYS): This bit indicates the state of the parallel port interface BUSY signal. When BUSY is asserted, BUSYS = 0. When BUSY is negated, BUSYS = 1. This bit is an inverted version of the parallel port BUSY signal.
6     ACK# STATUS (ACKS): This bit indicates the state of the parallel port interface ACK# signal. This bit indicates when the peripheral has received a data byte and is ready for another. When ACK# is asserted, ACKS = 0. When ACK# is negated, ACKS = 1. Note that if interrupts are enabled (via bit 4 of the PCON Register), the assertion of the ACK# signal generates an interrupt to the CPU.
5     PERROR STATUS (PERRS): This bit indicates the state of the parallel port interface PERROR signal. This bit indicates when an error has occurred in the peripheral paper path (e.g., out of paper). When PERROR is asserted, PERRS = 1. When PERROR is negated, PERRS = 0.
4     SELECT STATUS (SELS): This bit indicates the state of the parallel port interface SELECT signal. When the SELECT signal is asserted, SELS = 1. When the SELECT signal is negated, SELS = 0.
3     FAULT# STATUS (FAULTS): This bit indicates the state of the parallel port interface FAULT# signal being driven by the peripheral device. When the FAULT# signal is asserted, FAULTS = 0. When the FAULT# signal is negated, FAULTS = 1.
2     PARALLEL PORT INTERRUPT (PIRQ): In EPP mode, interrupt status is not reported in this register and this bit is always 1.
1:0   RESERVED

6.1.2.3 PCON-Control Register (EPP Mode)

I/O Address:     Base + 02h
Default Value:   RR00 0000
Attribute:       Read/Write
Size:            8 bits

The PCON Register controls certain parallel port interface signals, enables/disables parallel port interrupts, and selects the direction of data transfer on PD[7:0]. This register permits software to control the INIT# signal. Note that in the EPP parallel interface protocol, the STROBE#, AUTOFD#, and SELECTIN# signals are automatically generated by the parallel port and are not controlled by software.

Figure 29. Control Register (EPP Mode) (290486-29)

Bit   Default   Field
7:6   R         Reserved
5     0         Direction (R/W): 1=Reverse, 0=Forward
4     0         ACK# Interrupt Enable (R/W): 1=Enable, 0=Disable
3     0         SELECTIN# Signal (R/W): write to 0
2     0         INIT# Signal (R/W): 1=Negate, 0=Assert
1     0         AUTOFD# Signal (R/W): write to 0
0     0         STROBE# Signal (R/W): write to 0

Bit   Description
7:6   RESERVED
5     DIRECTION (DIR#): This bit is used to control the direction of data transfer on the parallel port data bus (PD[7:0]). When DIR# = 0 (forward direction), PD[7:0] are outputs. When DIR# = 1 (reverse direction), PD[7:0] are inputs.
4     ACK# INTERRUPT ENABLE (ACKINTEN): ACKINTEN enables CPU interrupts (via IRQ5 or IRQ7) to be generated when the ACK# signal on the parallel port interface is asserted. When ACKINTEN = 1, a CPU interrupt is generated when ACK# is asserted. When ACKINTEN = 0, the ACK# interrupt is disabled.
3     SELECTIN# CONTROL (SELINC): Write to 0 when programming this register. This bit must be 0 for the parallel port handshake to operate properly.
2     INIT# CONTROL (INITC): This bit controls the INIT# signal. When INITC = 1, the INIT# signal is negated. When INITC = 0, the INIT# signal is asserted.
1     AUTOFD# CONTROL (AUTOFDC): Write to 0 when programming this register.
0     STROBE# CONTROL (STROBEC): Write to 0 when programming this register. This bit must be 0 for the parallel port handshake to operate properly.
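Because several PSTAT bits are inverted relative to their signals, and the EPP-mode PCON Register requires bits 3, 1, and 0 to be written as 0, a small helper for each register can reduce polarity mistakes. The sketch below is illustrative only: the function and parameter names are assumptions introduced here, while the bit positions and polarities are taken from the EPP-mode PSTAT and PCON descriptions above.

```python
def decode_pstat(pstat):
    """Map a PSTAT read to signal states (True = signal asserted)."""
    return {
        "BUSY": not (pstat & 0x80),     # BUSYS = 0 when BUSY is asserted
        "ACK#": not (pstat & 0x40),     # ACKS = 0 when ACK# is asserted
        "PERROR": bool(pstat & 0x20),   # PERRS = 1 when PERROR is asserted
        "SELECT": bool(pstat & 0x10),   # SELS = 1 when SELECT is asserted
        "FAULT#": not (pstat & 0x08),   # FAULTS = 0 when FAULT# is asserted
    }


def epp_pcon(current, reverse=False, ack_irq=False, init_asserted=False):
    """Build an EPP-mode PCON write value from the value last read."""
    value = current & 0xC0              # preserve reserved bits 7:6
    if reverse:
        value |= 0x20                   # DIR# = 1: PD[7:0] are inputs
    if ack_irq:
        value |= 0x10                   # ACKINTEN = 1
    if not init_asserted:
        value |= 0x04                   # INITC = 1 negates INIT#
    return value                        # bits 3, 1, 0 are left at 0


status = decode_pstat(0xDC)
control = epp_pcon(0x00, reverse=True)
```

Passing the value last read from PCON as `current` keeps the reserved bits unchanged, in line with the reserved-bit rule in Section 6.1.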
6.1.2.4 ADDSTR-EPP Auto Address Strobe Register (EPP Mode)

I/O Address:     Base + 03h
Default Value:   00h
Attribute:       Read/Write
Size:            8 bits

The ADDSTR Register provides a peripheral address to the peripheral (via PD[7:0]) during a host address write operation and to the host (via PD[7:0]) during a host address read operation. An automatic address strobe is generated on the parallel port interface when data is read from or written to this register.

Bit   Description
7:0   EPP ADDRESS: Bits[7:0] correspond to SD[7:0] and PD[7:0].

6.1.2.5 DATASTR-Auto Data Strobe Register (EPP Mode)

I/O Address:     Base + 04h, 05h, 06h, 07h
Default Value:   00h
Attribute:       Read/Write
Size:            8 bits

The DATASTR Register provides data from the host to the peripheral device (via PD[7:0]) during host write operations and from the peripheral device to the host (via PD[7:0]) during a host read operation. An automatic data strobe is generated on the parallel port interface when data is read from or written to this register. To maintain compatibility with Intel's 82360SL I/O device that has a 32-bit Host Bus interface, four consecutive byte address locations are provided for transferring data.

Bit   Description
7:0   EPP DATA: Bits[7:0] correspond to SD[7:0] and PD[7:0].

6.1.3 ECP MODE

This section contains the registers used in ECP mode. The I/O address assignment for this register set is shown in Table 17 and the register descriptions are presented in the order that they appear in the table. The Extended Control Register (ECR) permits various modes of operation. Note that ECR[7:5] = 000 selects ISA-Compatible mode and ECR[7:5] = 001 selects PS/2-Compatible mode. These modes are discussed in Section 6.1.1, ISA-Compatible and PS/2-Compatible Modes. The other modes selected by ECR[7:5] are discussed in this section.

Table 17. Parallel Port Registers (ECP Mode)

Address (AEN = 0)   Abbreviation   Register Name                     ECR[7:5]   Access
Base + 0h           ECPAFIFO       ECP Address/RLE FIFO              011        R/W
Base + 1h           PSTAT          Status Register                   All        RO
Base + 2h           PCON           Control Register                  All        R/W
Base + 400h         SDFIFO         Standard Parallel Port Data FIFO  010        R/W
Base + 400h         ECPDFIFO       ECP Data FIFO                     011        R/W
Base + 400h         TFIFO          Test FIFO                         110        R/W
Base + 400h         ECPCFGA        ECP Configuration A               111        R/W
Base + 401h         ECPCFGB        ECP Configuration B               111        R/W
Base + 402h         ECR            Extended Control Register         All        R/W

NOTES:
1. Parallel port base addresses are 278h, 378h, and 3BCh.
2. A register is accessible when the ECR[7:5] field contains the value specified in the ECR[7:5] column. The register is not accessible if the ECR[7:5] field does not match the value specified in this column. The term "All" means that the register is accessible in all modes selected by ECR[7:5].

6.1.3.1 ECPAFIFO-ECP Address/RLE FIFO Register (ECP Mode)

I/O Address:     Base + 00h
Default Value:   UUUU UUUU (undefined)
Attribute:       Read/Write
Size:            8 bits

The ECPAFIFO Register provides a channel address or a Run Length Count (RLE) to the peripheral, depending on the state of bit 7. This I/O address location is only used in ECP mode (ECR bits[7:5] = 011). In this mode, bytes written to this register are placed in the parallel port FIFO and transmitted over PD[7:0] using ECP protocol.

Figure 30. ECP Address/RLE FIFO Register (ECP Mode) (290486-30)

16-byte FIFO (Byte 15 through Byte 0)
Bit   Default   Field
7:0   U...U     ECP Address/RLE Value (R/W). Bit 7=1: Bits[6:0] represent a channel address to the peripheral. Bit 7=0: Bits[6:0] represent a run length count to the peripheral.

NOTE: U = Undefined

Bit   Description
7:0   ECP ADDRESS/RLE VALUE: Bits[7:0] correspond to parallel port data lines PD[7:0] and ISA Bus data lines SD[7:0]. The peripheral device should interpret bits[6:0] as a channel address when bit 7 = 1 and as a run length count when bit 7 = 0. Note that this interpretation is performed by the peripheral device and the value of bit 7 has no effect on 82091AA operations. Note that the 82091AA asserts AUTOFD# to indicate that the information on PD[7:0] represents an ECP address/RLE count. The 82091AA negates AUTOFD# (drives high) when PD[7:0] is transferring data.

6.1.3.2 PSTAT-Status Register (ECP Mode)

I/O Address:     Base + 01h
Default Value:   XXXX X1RR
Attribute:       Read Only
Size:            8 bits

The PSTAT Register provides the status of certain parallel port signals and whether a CPU interrupt has been generated by the parallel port. This register indicates the current state of the BUSY, ACK#, PERROR, SELECT, and FAULT# signals.

Figure 31. Status Register (ECP Mode) (290486-31)

Bit   Default   Field
7     X         BUSY Signal Status (RO): 0=Asserted, 1=Negated
6     X         ACK# Signal Status (RO): 1=Negated, 0=Asserted
5     X         PERROR Signal Status (RO): 1=Asserted, 0=Negated
4     X         SELECT Signal Status (RO): 1=Asserted, 0=Negated
3     X         FAULT# Signal Status (RO): 1=Negated, 0=Asserted
2     1         IRQ Signal Status (RO): not used in ECP mode, always 1
1:0   R         Reserved

NOTE: X = Default value is determined by the state of the corresponding signal pin at reset.

Bit   Description
7     BUSY STATUS (BUSYS): This bit indicates the state of the parallel port interface BUSY signal. When BUSY is asserted, BUSYS = 0. When BUSY is negated, BUSYS = 1. This is an inverted version of the parallel port BUSY signal. Refer to Section 6.2.3, ECP Mode, for more detail.
6     ACK# STATUS (ACKS): This bit indicates the state of the parallel port interface ACK# signal. This bit indicates when the peripheral has received a data byte and is ready for another. When ACK# is asserted, ACKS = 0. When ACK# is negated, ACKS = 1. Note that if interrupts are enabled (via bit 4 of the PCON Register), the assertion of the ACK# signal generates an interrupt to the CPU. Refer to Section 6.2.3, ECP Mode, for more detail.
5     PERROR STATUS (PERRS): This bit indicates the state of the parallel port interface PERROR signal. This bit indicates when an error has occurred in the peripheral paper path (e.g., out of paper). When PERROR is asserted, PERRS = 1. When PERROR is negated, PERRS = 0.
4     SELECT STATUS (SELS): This bit is used in all parallel port modes and indicates the state of the parallel port interface SELECT signal. When the SELECT signal is asserted, SELS = 1. When the SELECT signal is negated, SELS = 0.
3     FAULT# STATUS (FAULTS): This bit is used in all parallel port modes and indicates the state of the parallel port interface FAULT# signal being driven by the peripheral device. When the FAULT# signal is asserted, FAULTS = 0. When the FAULT# signal is negated, FAULTS = 1.
2     PARALLEL PORT INTERRUPT (PIRQ): In ECP mode, interrupt status is not reported in this register and this bit is always 1.
1:0   RESERVED

6.1.3.3 PCON-Control Register (ECP Mode)

I/O Address:     Base + 02h
Default Value:   RR00 0000
Attribute:       Read/Write
Size:            8 bits

The PCON Register controls certain parallel port interface signals, enables/disables parallel port interrupts, and selects the direction of data transfer on PD[7:0]. Note that the function of some bits depends on the programming of the ECR.

Figure 32. Control Register (ECP Mode) (290486-32)

Bit   Default   Field
7:6   R         Reserved
5     0         ISA-Compatible Mode (ECR[7:5] = 000, 010) (RO): not used (PD[7:0] are outputs). PS/2-Compatible and ECP Modes (ECR[7:5] = 001, 011) (R/W): 1=Reverse direction (PD[7:0] are inputs), 0=Forward direction (PD[7:0] are outputs)
4     0         ACK# Interrupt Enable (R/W): 1=Enable, 0=Disable
3     0         SELECTIN# Signal (R/W): 1=Assert, 0=Negate
2     0         INIT# Signal (R/W): 1=Negate, 0=Assert
1     0         AUTOFD# Signal (R/W)
0     0         STROBE# Signal (R/W)

Bit   Description
7:6   RESERVED
5     DIRECTION (DIR#): This bit is used to control the direction of data transfer on the parallel port data bus (PD[7:0]). When DIR# = 0 (forward direction), PD[7:0] are outputs. When DIR# = 1 (reverse direction), PD[7:0] are inputs.
4     INTERRUPT ENABLE (ACK#) (IRQEN): IRQEN enables interrupts to the CPU to be generated when the ACK# signal on the parallel port interface is asserted and is used in all parallel port interface modes. When IRQEN = 1, a CPU interrupt is generated when ACK# is asserted. When IRQEN = 0, parallel port interrupts are disabled.
3     SELECTIN# CONTROL (SELINC): This bit controls the SELECTIN# signal. SELINC is set to 1 to select the printer. When SELINC = 1, the SELECTIN# signal is asserted. When SELINC = 0, the SELECTIN# signal is negated.
2     INIT# CONTROL (INITC): This bit controls the INIT# signal. When INITC = 1, the INIT# signal is negated. When INITC = 0, the INIT# signal is asserted.
1     AUTOFD# CONTROL (AUTOFDC): In ECP mode or ISA-Compatible FIFO mode (ECR[7:5] = 011, 010), this bit has no effect. Refer to Section 6.2.3, ECP Mode, for more details.
0     STROBE# CONTROL (STROBEC): In ECP mode or ISA-Compatible FIFO mode (ECR[7:5] = 011, 010), this bit has no effect. Refer to Section 6.2.3, ECP Mode, for more details.

6.1.3.4 SDFIFO-Standard Parallel Port Data FIFO

I/O Address:     Base + 400h and (ECR[7:5] = 010)
Default Value:   UUUU UUUU (undefined)
Attribute:       Read/Write
Size:            8 bits

SDFIFO is used to transfer data from the host to the peripheral when the ECR Register is set for ISA-Compatible FIFO mode (bits[7:5] = 010). Data bytes written or DMAed from the system to this FIFO are transmitted by a hardware handshake to the peripheral using the standard ISA-Compatible protocol. Note that bit 5 in the PCON Register must be set to 0 for a forward transfer direction.

Figure 33. ECP ISA-Compatible Data FIFO (290486-33)

16-byte FIFO (Byte 15 through Byte 0)
Bit   Default   Field
7:0   U...U     ECP Standard Parallel Port FIFO Mode Data (R/W) (ECR[7:5] = 010)

NOTE: U = Undefined

Bit   Description
7:0   ECP STANDARD PARALLEL PORT DATA: Bits[7:0] correspond to SD[7:0] and PD[7:0].
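Several ECP-mode registers share the address Base + 400h and are distinguished only by ECR[7:5], as listed in Table 17. The sketch below shows that selection as a lookup, together with the bit 7 convention of the ECPAFIFO Register. It is an illustration under stated assumptions: the function names are hypothetical, and the register names and ECR[7:5] values are copied from Table 17.

```python
# Register visible at Base + 400h for each ECR[7:5] value (from Table 17).
REGISTER_AT_BASE_PLUS_400H = {
    0b010: "SDFIFO",     # ISA-Compatible FIFO mode
    0b011: "ECPDFIFO",   # ECP mode
    0b110: "TFIFO",      # Test FIFO
    0b111: "ECPCFGA",    # ECP Configuration A
}


def register_at_base_plus_400h(ecr):
    """Return the register name selected by ECR[7:5], or None if the
    address is not assigned to a register in that mode."""
    return REGISTER_AT_BASE_PLUS_400H.get((ecr >> 5) & 0x7)


def ecp_afifo_byte(value, channel_address):
    """Compose an ECPAFIFO byte: bit 7 = 1 marks a channel address,
    bit 7 = 0 marks a run length count; bits 6:0 carry the value."""
    if not 0 <= value <= 0x7F:
        raise ValueError("value must fit in bits 6:0")
    return (0x80 if channel_address else 0x00) | value


selected = register_at_base_plus_400h(0x60)   # ECR[7:5] = 011
```

The lookup returns `None` for ECR[7:5] = 000 and 001, the ISA-Compatible and PS/2-Compatible modes, because Table 17 assigns no register to Base + 400h in those modes.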
4-80 82091AA 6.1.3.5 DFIFo-Data FIFO (ECP Mode) I/O Address: Default Value: Attribute: Size: Base + 400h and (ECR [7:5] = 011) UUUU UUUU (undefined) Read/Write 8 bits This 110 address location transfers data between the host and peripheral device when the parallel port is in ECP mode (ECR Bits[7:5] = 011). Transfers use the parallel port FIFO. Data is transferred on PD[7:0] via hardware handshakes on the parallel port interface using ECP parallel port interface handshake protocol. 16 Byte FIFO Byte 15 Byte 1 Byte 0 \ / (-!....:7_ _ _ _ _ _ _ _ _ _ _ _.;.O"""'~ I j U...U [ Bit Default ECP Mode Data (R/W) (ECR[7:5]= # Sectors Per Side Unsuccessful Termination Result Phase Invalid 0 1 SC ::;; # Sectors Remaining AND EDT ::;; # Sectors Per Side Successful Termination Result Phase Valid 0 1 SC > # Sectors Remaining DR EDT> # Sectors Per Side Unsuccessful Termination Result Phase Invalid 1 0 SC=OTL EDT ::;; # Sectors Per Side Successful Termination Result Phase Valid 1 0 SC=OTL EDT > # Sectors Per Side Unsuccessful Termination Result Phase Invalid 1 1 SC ::;; # Sectors Remaining AND EDT ::;; # Sectors Per Side Successful Termination Result Phase Valid 1 1 SC > # Sectors Remaining DR EDT > # Sectors Per Side Unsuccessful Termination Result Phase Invalid MT NOTE: When MT= 1 and the SC value is greater than the number of remaining formatted sectors on Side 0, verification continues on Side 1 of the disk. 4-150 82091AA 8.5.2.6 Format Track The FORMAT TRACK Command allows an entire track to be formatted. After a pulse from the INOEX# pin is detected, the FOC starts writing data on the disk including gaps, address marks, 10 fields and data fields, per the IBM' System 34 (MFM). The particular values written to the gap and data field are controlled by the values programmed into N, SC, GPL, and 0 which are specified by the host during the command phase. The data field of the sector is filled with the data byte specified by O. The 10 field for each sector is supplied by the host. 
That is, four data bytes per sector are needed by the FDC for C, H, R, and N (cylinder, head, sector number, and sector size, respectively). After formatting each sector, the host must send new values for C, H, R, and N to the FDC for the next sector on the track. The R value (sector number) is the only value that must be changed by the host after each sector is formatted. This allows the disk to be formatted with nonsequential sector addresses (interleaving). This incrementing and formatting continues for the whole track until the FDC encounters a pulse on the INDEX# pin again and it terminates the command.

Table 31 contains typical values for gap fields that are dependent on the size of the sector and the number of sectors on each track. Actual values can vary due to drive electronics.

Table 31. Typical PC/AT Values for Formatting

Drive Form   Media     Sector Size   N    SC   GPL1   GPL2
5.25"        1.2 MB    512           02   0F   2A     50
5.25"        360 KB    512           02   09   2A     50
3.5"         2.88 MB   512           02   24   38     53
3.5"         1.44 MB   512           02   18   1B     54
3.5"         720 KB    512           02   09   1B     54

NOTES:
1. All values are in hex, except sector size.
2. Gap3 is programmable during reads, writes, and formats.
3. GPL1 = suggested Gap3 values in read and write commands to avoid splice point between data field and ID field of contiguous sections.
4. GPL2 = suggested Gap3 value in FORMAT TRACK Command.

8.5.2.7 Format Field

[Figure artwork: field layouts for three formats. System 34 (double density): GAP4a (80x 4E), SYNC (12x 00), IAM, GAP1 (50x 4E), SYNC (12x 00), IDAM, C, H, S, N, CRC, GAP2 (22x 4E), SYNC (12x 00), DATA AM, DATA, CRC, GAP3, GAP4b. ISO: begins at GAP1 (32x 4E), remaining fields as in System 34. Perpendicular: as System 34, with GAP2 (41x 4E).]

Figure 64. System 34, ISO and Perpendicular Formats

8.5.3 CONTROL COMMANDS

Control commands differ from the other commands in that no data transfer takes place. Three commands generate an interrupt when complete: READ ID, RECALIBRATE and SEEK. The other control commands do not generate an interrupt.

8.5.3.1 READ ID Command

The READ ID Command is used to find the present position of the recording heads. The FDC stores the values from the first ID field it is able to read into its registers. If the FDC does not find an ID address mark on the diskette after the second occurrence of a pulse on the INDEX# pin, it then sets the IC code in Status Register 0 to 01 (Abnormal termination), sets the MA bit in Status Register 1 to 1, and terminates the command.

The following commands will generate an interrupt upon completion. They do not return any result bytes. It is recommended that control commands be followed by the SENSE INTERRUPT STATUS Command. Otherwise, valuable interrupt status information will be lost.

8.5.3.2 RECALIBRATE Command

This command causes the read/write head within the FDC to retract to the track 0 position. The FDC clears the contents of the PCN counter, and checks the status of the TRK0 pin from the FDD. As long as the TRK0 pin is low, the DIR# pin remains 0 and step pulses are issued. When the TRK0 pin goes high, the SE bit in Status Register 0 is set to 1, and the command is terminated. If the TRK0 pin is still low after 79 step pulses have been issued, the FDC sets the SE and the EC bits of Status Register 0 to 1 and terminates the command. Disks capable of handling more than 80 tracks per side may require more than one RECALIBRATE Command to return the head back to physical Track 0.

The RECALIBRATE Command does not have a result phase. The SENSE INTERRUPT STATUS Command must be issued after the RECALIBRATE Command to effectively terminate it and to provide verification of the head position (PCN).
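The step-pulse limit described above implies that a head parked far from track 0 needs several RECALIBRATE Commands. The sketch below estimates how many, under the assumption that each command issues at most 79 step pulses (the figure quoted in this section) and moves the head one track per pulse. The function name and the `max_steps` parameter are hypothetical, chosen only for illustration.

```python
import math


def recalibrate_commands_needed(current_track: int, max_steps: int = 79) -> int:
    """Estimate how many RECALIBRATE Commands bring the head to track 0.

    Assumption (not from the datasheet): one step pulse moves the head
    exactly one track outward. max_steps defaults to the 79-pulse limit
    quoted in the RECALIBRATE description.
    """
    if current_track < 0:
        raise ValueError("current_track must be non-negative")
    if max_steps <= 0:
        raise ValueError("max_steps must be positive")
    # At least one command is always issued, even when already on track 0.
    return max(1, math.ceil(current_track / max_steps))
```

For a drive with more than 80 tracks per side, software would typically loop on RECALIBRATE followed by SENSE INTERRUPT STATUS until the EC bit is clear, rather than precomputing the count.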
During the command phase of the recalibrate operation, the FDC is in the busy state, but during the execution phase it is in a non-busy state. At this time another RECALIBRATE Command may be issued, and in this manner, parallel RECALIBRATE operations may be done on up to 2 drives simultaneously.

After powerup, software must issue a RECALIBRATE Command to properly initialize all drives and the controller.

8.5.3.3 DRIVE SPECIFICATION Command

The FDC uses two pins, DRVDEN0 and DRVDEN1, to select the density for modern drives. These signals inform the drive of the type of diskette in the drive. The DRIVE SPECIFICATION Command specifies the polarity of the DRVDEN0 and DRVDEN1 pins. It also enables/disables DSR programmed precompensation.

This command removes the need for a hardware work-around to accommodate differing specifications among drives. By programming this command during the BIOS POST routine, the floppy disk controller internally configures the correct values for DRVDEN0 and DRVDEN1 with corresponding precompensation value and data rate table enabled for the particular type of drive.

This command is protected from software resets. After executing the DRIVE SPECIFICATION Command, subsequent software resets will not clear the programmed parameters. Only another DRIVE SPECIFICATION Command or hard reset can reset it to default values. The 6 LSBs of the last byte of this command are reserved for future use.

The DRATE0 and DRATE1 are values as programmed in the DSR register. See Table 32 for pin decoding at different data rates.

Table 32 describes the drives that are supported with the DT0, DT1 bits of the DRIVE SPECIFICATION Command:

Table 32. DRVDENn Polarities

DT1 DT0 Data Rate 1 Mbps 1 0* 0* 500 Kbps 0 0 1 1 1 0 1 DRVDEN1 DRVDEN0 1 1 300 Kbps 1 0 250 Kbps 0 0 1 Mbps 1 0 500 Kbps 0 0 300 Kbps 1 1 250 Kbps 0 1 1 Mbps 1 1 500 Kbps 0 0 300 Kbps 1 0 250 Kbps 0 1 1 Mbps 1 1 500 Kbps 0 0 300 Kbps 0 1 250 Kbps 1 0

NOTE: (*) Denotes the default setting

8.5.3.4 SEEK Command

The read/write head within the drive is moved from track to track under the control of the SEEK Command. The FDC compares the PCN, which is the current head position, with the NCN and performs the following operation if there is a difference:

PCN < NCN: Direction signal to drive set to 1 (step in), and issues step pulses.
PCN > NCN: Direction signal to drive set to 0 (step out), and issues step pulses.

The rate at which step pulses are issued is controlled by SRT (Stepping Rate Time) in the SPECIFY Command. After each step pulse is issued, NCN is compared against PCN, and when NCN = PCN, then the SE bit in Status Register 0 is set to 1, and the command is terminated.

During the command phase of the seek or recalibrate operation, the FDC is in the busy state, but during the execution phase it is in the non-busy state.

Note that if implied seek is not enabled, the read and write commands should be preceded by:
1. SEEK Command; Step to the proper track
2. SENSE INTERRUPT STATUS Command; Terminate the SEEK Command
3. READ ID; Verify head is on proper track
4. Issue READ/WRITE Command.

The SEEK Command does not have a result phase. Therefore, it is highly recommended that the SENSE INTERRUPT STATUS Command be issued after the SEEK Command to terminate it and to provide verification of the head position (PCN). The H bit (Head Address) in ST0 will always return a 0. When exiting DSR Powerdown mode, the FDC clears the PCN value and the status information to zero. Prior to issuing the DSR POWER DOWN Command, it is highly recommended that the user service all pending interrupts through the SENSE INTERRUPT STATUS Command.
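The PCN/NCN comparison in the SEEK Command can be sketched as a small helper. This is a minimal illustration, not controller firmware: the function name and return shape are hypothetical, and it assumes PCN changes by exactly one track per step pulse, which the text implies but does not state outright.

```python
from typing import Optional, Tuple


def plan_seek(pcn: int, ncn: int) -> Tuple[Optional[int], int]:
    """Return (direction_bit, step_pulses) for a SEEK from PCN to NCN.

    direction_bit is 1 for step in (PCN < NCN), 0 for step out
    (PCN > NCN), and None when no movement is needed.
    Assumption: one step pulse moves the head one track.
    """
    for name, value in (("pcn", pcn), ("ncn", ncn)):
        if not 0 <= value <= 255:
            raise ValueError(f"{name} must fit in one byte (0-255)")
    if pcn < ncn:
        return 1, ncn - pcn  # step in toward higher track numbers
    if pcn > ncn:
        return 0, pcn - ncn  # step out toward track 0
    return None, 0  # already on the target track
```

A caller following the four-step sequence above would use the result only to predict the seek, then issue SENSE INTERRUPT STATUS and READ ID to confirm the head position.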
8.5.3.5 SENSE INTERRUPT STATUS Command

An interrupt signal on the INT pin is generated by the FDC for one of the following reasons:

1. Upon entering the Result Phase of:
   a. READ DATA Command
   b. READ TRACK Command
   c. READ ID Command
   d. READ DELETED DATA Command
   e. WRITE DATA Command
   f. FORMAT TRACK Command
   g. WRITE DELETED DATA Command
   h. VERIFY Command
2. End of SEEK, RELATIVE SEEK or RECALIBRATE Command
3. FDC requires a data transfer during the execution phase in the non-DMA Mode

The SENSE INTERRUPT STATUS Command resets the interrupt signal and, via the IC code and SE bit of Status Register 0, identifies the cause of the interrupt. If a SENSE INTERRUPT STATUS Command is issued when no active interrupt condition is present, the status register ST0 will return a value of 80h (invalid command).

Table 33. Interrupt Identification

SE   IC   Interrupt Due To
0    11   Polling
1    00   Normal Termination of SEEK or RECALIBRATE Command
1    01   Abnormal Termination of SEEK or RECALIBRATE Command

The SEEK, RELATIVE SEEK and the RECALIBRATE Commands have no result phase. The SENSE INTERRUPT STATUS Command must be issued immediately after these commands to terminate them and to provide verification of the head position (PCN). The H (Head Address) bit in ST0 will always return a 0. If a SENSE INTERRUPT STATUS is not issued, the drive will continue to be busy and may affect the operation of the next command.

8.5.3.6 SENSE DRIVE STATUS Command

The SENSE DRIVE STATUS Command obtains drive status information. It has no execution phase and goes directly to the result phase from the command phase. STATUS REGISTER 3 contains the drive status information.

8.5.3.7 SPECIFY Command

The SPECIFY Command sets the initial values for each of the three internal timers. The HUT (Head Unload Time) defines the time from the end of the execution phase of one of the read/write commands to the head unload state. The SRT (Step Rate Time) defines the time interval between adjacent step pulses. Note that the spacing between the first and second step pulses may be shorter than the remaining step pulses. The HLT (Head Load Time) defines the time between the command phase to the execution phase of a READ DATA or WRITE DATA Command. The Head Unload Time (HUT) timer goes from the end of the execution phase to the beginning of the result phase of a READ DATA or WRITE DATA Command. The values change with the data rate speed selection and are documented in Table 34.

Table 34. Drive Control Delays (ms)

        HUT                           SRT
        1M    500K   300K   250K      1M    500K   300K   250K
0       128   256    426    512       8.0   16     26.7   32
1       8     16     26.7   32        7.5   15     25     30
..      ..    ..     ..     ..        ..    ..     ..     ..
A       80    160    267    320       3.0   6.0    10.2   12
B       88    176    294    352       2.5   5.0    8.3    10
C       96    192    320    384       2.0   4.0    6.68   8
D       104   208    346    416       1.5   3.0    5.01   6
E       112   224    373    448       1.0   2.0    3.33   4
F       120   240    400    480       0.5   1.0    1.67   2

Table 35. Head Load Time (ms)

HLT     1M    500K   300K   250K
00      128   256    426    512
01      1     2      3.3    4
02      2     4      6.7    8
..      ..    ..     ..     ..
7E      126   252    420    504
7F      127   254    423    508

The choice of DMA or non-DMA operations is made by the ND bit. When ND = 1, the non-DMA mode is selected, and when ND = 0, the DMA mode is selected. In DMA mode, data transfers are signalled by the DRQ pin. Non-DMA mode uses the RQM bit and the IRQ6 pin to signal data transfers.

8.5.3.8 CONFIGURE Command

Issue the CONFIGURE Command to enable features like the programmable FIFO and set the beginning track for precompensation. A CONFIGURE Command need not be issued if the default values of the FDC meet the system requirements.

CONFIGURE DEFAULT VALUES:
EIS       No Implied Seeks
EFIFO     FIFO Disabled
POLL      Polling Enabled
FIFOTHR   FIFO Threshold Set to 1 Byte
PRETRK    Pre-Compensation Set to Track 0

EIS-Enable Implied Seek. When EIS = 1, the FDC will perform a SEEK operation before executing a read/write command. The default value is 0 (no implied seek).

EFIFO-Enable FIFO. When EFIFO = 1, the FIFO is disabled (8272A compatible mode). This means data transfers are asked for on a byte by byte basis. The default value is 1 (FIFO disabled). The threshold defaults to one.

POLL-Disable Polling. When POLL = 1, polling of the drives is disabled. POLL defaults to 0 (polling enabled). When enabled, a single interrupt is generated after a reset. No polling is performed while the drive head is loaded and the head unload delay has not expired.

FIFOTHR-The FIFO threshold in the execution phase of a read/write command. This is programmable from 1 to 16 bytes. FIFOTHR defaults to one byte. A 00 selects one byte and a 0F selects 16 bytes.

PRETRK-Precompensation start track number. Programmable from track 0 to 255. PRETRK defaults to track 0. A 00h selects track 0 and a FFh selects 255.

8.5.3.9 VERSION Command

The VERSION Command checks to see if the controller is an enhanced type (82077, 82077AA, 82077SL) or the older type (8272A/765A). A value of 90h is returned as the result byte, defining that an enhanced FDD controller is in use. No interrupts are generated.

8.5.3.10 RELATIVE SEEK Command

The RELATIVE SEEK Command is coded the same as for the SEEK Command, except for the MSB of the first byte and the DIR# bit.

DIR#   Head Step Direction Control

       DIR#   ACTION
       0      Step Head Out
       1      Step Head In

RCN    Relative Cylinder Number that determines how many tracks to step the head in or out from the current track number.

The RELATIVE SEEK Command differs from the SEEK Command in that it steps the head the absolute number of tracks specified in the command instead of making a comparison against an internal register. The SEEK Command is good for drives that support a maximum of 256 tracks. RELATIVE SEEKs cannot be overlapped with other RELATIVE SEEKs. Only one RELATIVE SEEK can be active at a time. Bit 4 of Status Register 0 (EC) will be set to 1 if RELATIVE SEEK attempts to step outward beyond Track 0.
As an example, assume that a floppy drive has 300 useable tracks and that the host needs to read track 300 and the head is on any track (0-255). If a SEEK Command is issued, the head stops at track 255. If a RELATIVE SEEK Command is issued, the FDC moves the head the specified number of tracks, regardless of the internal cylinder position register (but increments the register). If the head had been on track 40 (D), the maximum track that the FDC could position the head on using RELATIVE SEEK is 296 (D): the initial track + 256 (D). The maximum count that the head can be moved with a single RELATIVE SEEK Command is 256 (D).

The internal register, PCN, would overflow as the cylinder number crossed track 255 and would contain 40 (D). The resulting PCN value is thus (NCN + PCN) mod 256. Functionally, the FDC starts counting from 0 again as the track number goes above 255 (D). It is the user's responsibility to compensate FDC functions (precompensation track number) when accessing tracks greater than 255. The FDC does not keep track that it is working in an "extended track area" (greater than 255). Any command issued uses the current PCN value, except for the RECALIBRATE Command that only looks for the TRACK0 signal. RECALIBRATE returns an error if the head is farther than 79 due to its limitation of issuing a maximum 80 step pulses. The user simply needs to issue a second RECALIBRATE Command. The SEEK Command and implied seeks function correctly within the 44 (D) track (299-255) area of the extended track area. It is the user's responsibility not to issue a new track position that exceeds the maximum track that is present in the extended area.

To return to the standard floppy range (0-255) of tracks, a RELATIVE SEEK is issued to cross the track 255 boundary.

A RELATIVE SEEK Command can be used instead of the normal SEEK Command but the host is required to calculate the difference between the current head location and the new (target) head location.
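The PCN wraparound in the example above can be checked with a short sketch. It models an inward RELATIVE SEEK only; the function name is hypothetical, and tracking the physical position separately from PCN is an illustrative device, since the FDC itself keeps only the 8-bit PCN.

```python
from typing import Tuple


def relative_seek_in(physical_track: int, pcn: int, rcn: int) -> Tuple[int, int]:
    """Model an inward RELATIVE SEEK of rcn tracks.

    Returns (new_physical_track, new_pcn). The physical track is what
    host software must remember; PCN is the FDC's 8-bit register,
    which wraps modulo 256 as described in the text.
    """
    if not 0 <= pcn <= 255:
        raise ValueError("pcn must be 0-255")
    if not 0 <= rcn <= 256:
        raise ValueError("a single RELATIVE SEEK moves at most 256 tracks")
    return physical_track + rcn, (pcn + rcn) % 256


# The datasheet example: head on track 40, maximum relative move of 256.
track, pcn = relative_seek_in(40, 40, 256)
```

Running the example leaves the head on physical track 296 while PCN reads 40 again, which is why host software has to track the extended area itself.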
This may require the host to issue a READ ID Command to ensure that the head is physically on the track that software assumes it to be. Different FDC commands return different cylinder results which may be difficult to keep track of with software without the READ ID Command.

8.5.3.11 DUMPREG Command

The DUMPREG Command is designed to support system run-time diagnostics and application software development and debug. The command returns pertinent information regarding the status of many of the programmed fields in the FDC. This can be used to verify the values initialized in the FDC.

8.5.3.12 PERPENDICULAR MODE Command

An added capability of the FDC is the ability to interface directly to perpendicular recording floppy drives. Perpendicular recording differs from the traditional longitudinal method by orienting the magnetic bits vertically. This scheme packs in more data bits for the same area.

The PERPENDICULAR MODE Command allows the system designers to designate specific drives as Perpendicular recording drives. Data transfers between Conventional and Perpendicular drives are allowed without having to issue PERPENDICULAR MODE Commands between the accesses of the two different drives, nor having to change write precompensation values.

With this command, the length of the Gap2 field and VCO enable timing can be altered to accommodate the unique requirements of these drives. Table 36 describes the effects of the WGATE and GAP bits for the PERPENDICULAR MODE Command.

When both GAP and WGATE equal 0 the PERPENDICULAR MODE Command will have the following effect on the FDC:

1. If any of the new bits D0 and D1 are programmed to 1, the corresponding drive is automatically programmed for Perpendicular mode (i.e., GAP2 being written during a write operation, the programmed Data Rate will determine the length of GAP2), and data will be written with 0 ns write precompensation.
2. If any of the new bits (D0/D1) are programmed to 0, the designated drive is programmed for Conventional Mode and data will be written with the currently programmed write precompensation value.
3. Bits D0 and D1 can only be over-written when the OW bit is 1. The status of these bits can be determined by interpreting the eighth result byte of the DUMPREG Command. (Note: if either the GAP or WGATE bit is 1, bits D0 and D1 are ignored.)

Software and Hardware reset have the following effects on the enhanced PERPENDICULAR MODE Command:

1. A software reset (Reset via DOR or DSR registers) only sets GAP and WGATE bits to 0; D0 and D1 retain their previously programmed values.
2. A hardware reset (Reset via pin 32) sets all bits (GAP, WGATE, D0, and D1) to 0 (All Drives Conventional Mode).

Table 36. Effects of WGATE and GAP Bits

GAP  WGATE  Mode                                                 VCO Low Time after  Length of Gap2  Portion of Gap2 Written   Gap2 VCO Low Time
                                                                 Index Pulse         Format Field    by Write Data Operation   for Read Operations
0    0      Conventional Mode                                    33 Bytes            22 Bytes        0 Bytes                   24 Bytes
0    1      Perpendicular Mode (500 Kbps and Lower Data Rates)   33 Bytes            22 Bytes        19 Bytes                  24 Bytes
1    0      Reserved (Conventional)                              33 Bytes            22 Bytes        0 Bytes                   24 Bytes
1    1      Perpendicular Mode (1 Mbps Data Rate)                18 Bytes            41 Bytes        38 Bytes                  43 Bytes

NOTE: When either GAP or WGATE bit is set, the current value of precompensation in the DSR is used.

8.5.3.13 POWERDOWN MODE Command

The POWERDOWN MODE Command allows the automatic power management and enables the enhanced registers (EREG EN) of the FDC. The use of the command can extend the battery life in portable PC applications. To enable auto powerdown the command may be issued during the BIOS power on self test (POST).

This command includes the ability to configure the FDC into the enhanced mode extending the SRB and TDR registers. These extended registers accommodate bits that give more information about the floppy drive interface, allow for boot drive selection, and identify the values of the PD and IDLE status.

As soon as the command is enabled, a 10 ms or a 0.5 sec minimum powerup timer is initiated, depending on whether the MIN DLY bit is set to 0 or 1. This timer is one of the required conditions that has to be satisfied before the FDC will enter auto powerdown. Any software reset will re-initialize the timer. The timer countdown is also extended by up to 10 ms if the data rate is changed during the timer's countdown. Without this timer, the FDC would have been put to sleep immediately after the FDC is idle. The minimum delay gives software a chance to interact with the FDC without incurring an additional overhead due to recovery time.

The command also allows the output pins of the floppy disk drive interface to be tri-stated or left unaltered during auto powerdown. This is done by the FDI TRI bit. In the default condition (FDI TRI = 0) the output pins of the floppy disk drive are tri-stated. Setting this bit leaves the interface unchanged from the normal state.

The results phase returns the values programmed for MIN DLY, FDI TRI and AUTO PD. The auto powerdown mode is disabled by a hardware reset. Software resets have no effect on the POWERDOWN MODE Command parameters.

8.5.3.14 PART ID Command

This command can be used to identify the floppy disk controller as an enhanced controller. The first stepping of the FDC (all versions) will yield 0x02 in the result phase of this command. Any future enhancements on these parts will be denoted by the 5 LSBs (0x01 to 0x1F).
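As a small illustration of the PART ID result described above, the sketch below masks off the 5 LSBs that the text says denote the part's enhancement level. The function name is hypothetical, and the meaning of the upper three bits is not covered in this section, so they are simply discarded here.

```python
def part_id_enhancement_code(result_byte: int) -> int:
    """Extract the 5 LSBs of the PART ID result byte.

    Assumption: the upper three bits carry no information needed
    here; the text only defines the 5 LSBs (0x01 to 0x1F).
    """
    if not 0 <= result_byte <= 0xFF:
        raise ValueError("result_byte must be a single byte")
    return result_byte & 0x1F
```

For the first stepping, which returns 0x02, the extracted code is 2.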
8.5.3.15 OPTION Command

The standard IBM format includes an index address field consisting of 80 bytes of GAP 4a, 12 bytes of the sync field, four bytes identifying the IAM and 50 bytes of GAP 1. Under the ISO format most of this preamble is not used. The ISO format allows only 32 bytes of GAP 1 after the index mark. The ISO bit in this command allows the FDC to configure the data transfer commands to recognize this format. The MSBs in this command are reserved for any other enhancements made available to the user in the future.

8.5.3.16 SAVE Command

The first byte corresponds to the values programmed in the DSR with the exception of CLKSEL. The DRATE1, DRATE0 used here are unmapped. The second byte is used for configuring the bits from the OPTION Command. All future enhancements to the OPTION Command will be reflected in this byte as well. The next nine result bytes are explained in the Parameter Abbreviations section after the command summary. The 13th byte is the value associated with the POWERDOWN MODE Command. The disk status is used internally by the FDC. There are two reserved bytes at the end of this command for future use.

This command is similar to the DUMPREG Command but it additionally allows the user to read back the precompensation values as well as the programmed data rate. It also allows the user to read the values programmed in the POWERDOWN MODE Command. The precompensation values will be returned as programmed in the DSR register. This command, used in conjunction with the RESTORE Command, should prove very useful for SMM power management. This command reserves the last two bytes for future enhancements.

8.5.3.17 RESTORE Command

Using the RESTORE Command with the SAVE Command allows the SMM power management to restore the FDC to its original state after a system powerdown. It also serves as a succinct way to provide most of the initialization requirements normally handled by the system. The sequence for initializing the FDC after a reset has occurred, assuming a SAVE Command was issued, is as follows:

• Issue the DRIVE SPECIFICATION Command (if the design utilizes this command)
• Issue the RESTORE Command (pass the 16 bytes retrieved previously during SAVE)

The RESTORE Command programs the data rate and precompensation value via the DSR. It then restores the values normally programmed through the CONFIGURE, SPECIFY, and PERPENDICULAR Commands. It also enables the previously selected values for the POWERDOWN MODE Command. The PCN values are restored to their previous values and the user is responsible for issuing the SEEK and RECALIBRATE Commands to restore the head to the proper location. There are some drives that do not recalibrate, in which case the RESTORE Command restores the previous state completely. The PDOSC bit is retrievable using the SAVE Command; however, the system designer must set it correctly. The software must allow at least 20 µs to execute the RESTORE Command. When using the BOOTSEL bits in the TDR, the user must restore or reinitialize these bits to their proper values.

8.5.3.18 FORMAT AND WRITE Command

The FORMAT AND WRITE Command is capable of simultaneously formatting and writing data to the diskette. It is essentially the same as the normal FORMAT Command, except that the execution phase for each sector includes not only the C, H, R, and N values but also the data transfer of N bytes. The D value is ignored. This command formats the entire track. High speed floppy diskette duplication can be done efficiently with this command. The user can format the diskette and put data on it in a single pass. This is very useful for software duplication applications because it reduces the time required to format and copy diskettes.

9.0 IDE INTERFACE

The 82091AA supports the IDE (Integrated Drive Electronics) interface by providing two chip selects, and lower and upper data byte controls.
DMA and 16-bit data transfers are supported. Minimal external logic is required to complete the optional 16-bit IDE I/O and DMA interfaces. With external logic, a fully buffered interface is also supported.

9.1 IDE Registers

The 82091AA does not contain IDE registers. All of the IDE device registers are located in the IDE device, except bit 7 of the Drive Address Register which is the Floppy Controller Disk Change status bit and is driven by the 82091AA. The IDE interface contains two chip selects (IDECS0# and IDECS1#). These signals are asserted for accesses to the Command and Control Block registers located at 01Fxh and 03Fxh, respectively (Table 37).

Table 37. IDE Register Set (Located in IDE Device)

Primary Address   Secondary Address   Chip Select   Registers                         Access
1F0h              170h                IDECS0#       Data Register                     R/W
1F1h              171h                IDECS0#       Error Register                    RO
1F1h              171h                IDECS0#       Write Precomp/Features Register   WO
1F2h              172h                IDECS0#       Sector Count Register             R/W
1F3h              173h                IDECS0#       Sector Number Register            R/W
1F4h              174h                IDECS0#       Cylinder Low Register             R/W
1F5h              175h                IDECS0#       Cylinder High Register            R/W
1F6h              176h                IDECS0#       Drive/Head Register               R/W
1F7h              177h                IDECS0#       Status Register                   RO
1F7h              177h                IDECS0#       Command Register                  WO
3F6h              376h                IDECS1#       Alternate Status Register         RO
3F6h              376h                IDECS1#       Digital Output Register           WO
3F7h              377h                IDECS1#       Drive Address Register            RO
3F7h              377h                IDECS1#       Not Used

Figure 65 shows an example IDE interface without DMA capability. In this case all IDE accesses for setting up the IDE registers and transferring data are programmed via I/O. The 82091AA generates the chip selects (IDECS0# and IDECS1#). The 82091AA also generates the DEN# and HEN# signals to enable the data buffers.

9.2 IDE Interface Operation

The 82091AA implements the chip select signals for the IDE interface and decodes the standard PC/AT primary and secondary I/O locations. The 82091AA provides a data buffer enable signal (DEN#) to control the lower data byte path for buffered designs.
Buffering the lower data byte path is an application option that requires an external transceiver/buffer. For buffered applications, DEN# controls an external transceiver and enables data bits IDED[7:0] onto the system data bus SD[7:0]. For non-buffered applications (typically the X-Bus configuration), IDED[7:0] are connected directly to the bus and DEN # is not used and becomes a no-connect. For 16-bit applications the upper data byte path (IDED[15:81) is controlled by the HEN# signal. ISABus AlP SA[9:0] SA[9:0] 10RC# 10RC# 10WC# 10WC# RSTORV 1016# Figure 66 shows an example DMA IDE interface for type "F" DMA cycles. To set up the IDE interface, the host accesses the IDE registers on the IDE device. For programmed I/O accesses, the 82091 AA generates the chip selects (IDECSO# and IDECS1 #) to access the IDE registers and the DEN # and HEN # signals to control the data buffers. During DMA transfers the DMA handshake is between the DMA controller and IDE device via the DREQ and DACK# signals. The DACK# signal is ORed with the DEN# and HEN# signals to control the upper and lower byte buffers during DMA transfers. 10E SO[7:0] OB[7:0] IOECS1# CS1# CSO# 10ECSO# RSTORV OEN# 1016# HEN# 10RC# flE ~ SO[7:0] '-- '--' o E ~~ OB[15:8] SO[15:8] ... - SAr2:01 10WC# 10RC# ,.. I> ,.. BALE - 1016# RSTORV --'- "v A[2:0] 10W# 10R# BALE 1016# RST# 290486-65 Figure 65. IDE Interface Example (without DMA) 4-161 82091AA SA[9:0] SA[9:0] 10RC# 10RC# 10WC# 10WC# RSTDRV 1016# OACK# OREQ IDE AlP ISABus SO[7:0] OB[7:0] ""'- IOECS1# CS1# 10ECSO# CSO# RSTDRV 1016# =1vr> 10RC# OACK# OREQ ~E ~ SO[7:0] '---'- - SA[2:0] 10WC# 10RC# - 1016# OB[15:8] '-- , I> BALE RSTDRV DE ~~ SO[15:8] v A[2:0] 10W# 10R# BALE 1016# RST# . 290486-66 Figure 66. IDE Interface Example (with DMA) 4-162 82091AA 10.0 POWER MANAGEMENT 10.2 Clock Power Management The 82091 AA provides power management capabilities for its primary functional modules (parallel port, floppy disk controller, serial port A, and serial port B). 
For each module, the 82091AA implements two types of power management-direct powerdown and auto powerdown. Direct powerdown, enabled via control bits in the 82091 AA configuration registers, immediately places the module in a powerdown mode by turning off the clock to the associated module. Direct powerdown removes the clock regardless of the activity or status of the module. By contrast, when auto powerdown is enabled (via control bits in the 82091AA configuration registers), the associated module only enters a powerdown mode if it is in an idle state. The internal clock circuitry of the 82091AA can be turned on or off as part of a power management scheme. The clock circuitry is controlled via the CLKOFF bit in the AIPCFG1 Register. If an external clock source exists, the user may want to turn off the internal oscillator to save power and provide minimum recovery time. NOTE: The entire 82091 AA can be placed in direct powerdown by writing to the CLKOFF bit in the AIPCFG1 Register. 10.1 Power Management Registers The floppy disk controller, parallel port, serial port A, and serial port B each have two 82091 AA configuration registers. For each module, three configuration register bits control power management-xDPDN, xIDLE, and xAPDN. • xAPDN: auto-powerdown, shuts off the oscillator to the module when the module is idle. • xl OLE: idle status, a read only pin that indicates idle status. • xDPDN: direct powerdown, shuts off module oscillator when active regardless of module status. Auto powerdown and direct powerdown (in each module) have no effect on the state of internal oscillator. 10.3 FOC Power Management This section describes the FDC direct and auto powerdown modes and recovery from the powerdown modes. Auto Powerdown Automatic powerdown (APDN) has an advantage over direct powerdown (PDN) since the register contents are not lost under APDN. 
Automatic powerdown is invoked by either the Auto Powerdown command, or by enabling the FAPDN bit in the FDC configuration register. There are four conditions required before the FDC will enter powerdown: 1. The motor enable pins ME[3:0] must be inactive. 2. The FDC must be in an idle state. FDC idle is indicated by MSR = 80h and the IRQ6 signal is negated (IRQ6 may be asserted even if MSR = 80h due to polling interrupt). 3. The head unload timer (HUT, explained in the SPECIFY Command) must have expired. . 4. The auto powerdown timer must have timed out. The 82091 AA exits any powerdown mode after a hardware reset (RSTDRV asserted) or reset via the xRESET bit in the 82091AA configuration registers. Direct powerdown can also be exited by writing the corresponding xPDN bit in the configuration register to O. Auto powerdown is exited by events at the module (e.g., CPU read/write or module interface activity). NOTE: The configuration registers also contain the xEN bit. This bit is used to completely disable an unused module. Enabling a disabled module takes much longer than restoring a module from powerdown. Therefore, this bit is not recommend for temporarily disabling a module as a powerdown scheme. An internal timer is initiated when the POWERDOWN MODE Command is executed. The amount of time can be set by the user via the MIN DLY bits in the POWERDOWN MODE Command. The mod. ule is then powered down, provided all the remaining conditions are met. A software reset reinitializes the timer. When using the FDC FAPDN bit to enable the automatic powerdown feature, the MIN DLY bit is set to the default condition. Recovery from Auto Powerdown When the FDC is in auto powerdown, the module is awakened by a reset or access to the DOR, MSR or FIFO registers. The module remains in auto powerdown mode after a software reset (i.e., it will power- 4-163 82091AA down again after being idle for the time specified by MIN DLV). 
However, the FDC does not remain in auto powerdown mode after a hardware reset or DSR reset.

Direct Powerdown

Direct powerdown is invoked via the Powerdown bit in the Data Rate Select Register (bit 6), or the FDPDN bit in the FCFG2 Register. Setting FDPDN to 1 powers down the FDC. All status is lost when this type of powerdown mode is used. The FDC exits powerdown mode after any hardware or software reset. Direct powerdown overrides automatic powerdown.

Recovery from Direct Powerdown

The FDC exits the direct powerdown state by setting the FDPDN bit to 0 followed by a software or hardware reset. After reset, the FDC goes through a normal sequence. The drive status is initialized. The FIFO mode is set to default mode on a hardware or software reset if the LOCK Command has not blocked it. Finally, after a delay, the polling interrupt is issued.

10.4 Serial Port Power Management

This section describes the serial port direct and auto powerdown modes and recovery from the powerdown modes.

Auto Powerdown

When auto powerdown is enabled in the SxCFG2 Register (SxAPDN bit is 1), the serial port enters auto powerdown based on monitoring line interface activity. During auto powerdown, the status of the serial port is maintained (the FIFO and registers are not reset). Access to any serial port register is allowed during auto powerdown. The transmitter and the receiver enter powerdown individually, depending on certain conditions. When there are no characters to transmit (TEMPTY = 1 in the LSR), the transmitter clock is shut off, placing the transmitter in auto powerdown. In the case of the receiver, when the serial input signal is inactive for approximately 5 character times, indicating that no character is being received, the receiver goes into auto powerdown.

Recovery from Auto Powerdown

The serial port recovers from auto powerdown when either the transmitter or the receiver is active. If data is written to the transmitter or data is present at the receiver, the serial port exits from auto powerdown.

Direct Powerdown

Direct powerdown is invoked via the SxCFG2 Register (setting the SxDPDN bit to 1). When in direct powerdown, the clock to the module is shut off. All registers are accessible while in direct powerdown. A host read of the Receiver Buffer Register or a write to the Transmitter Holding Register should not be performed during powerdown. The SINx input should remain static. When direct powerdown is invoked, the transmit and receive sections of the serial port are reset, including the transmit and receive FIFOs. Thus, to prevent possible data loss when the FIFOs are reset, software should not invoke direct powerdown until the serial port is in the idle state as indicated by the SxIDLE bit in the SxCFG2 Register.

Recovery from Direct Powerdown

Recovery from direct powerdown is accomplished by writing the SxDPDN bit in the configuration register to 0 or by a module reset.

10.5 Parallel Port Power Management

Auto Powerdown

Auto powerdown is enabled via the PAPDN bit in the PCFG2 Register. When enabled, the parallel port enters auto powerdown when the module is in an idle state. If the parallel port FIFO is being used to transfer data, the parallel port is in an idle state when the FIFO is empty.

Recovery from Auto Powerdown

Recovery from auto powerdown occurs when the FIFO is written or as a result of parallel port interface activity.

Direct Powerdown

Direct powerdown is invoked via the PCFG2 Register (setting the PDPDN bit to 1). When PDPDN = 1, the clock to the printer state machine is disabled and the state machine goes into an idle state.

Recovery from Direct Powerdown

Recovery from direct powerdown is accomplished by setting the PDPDN bit to 0 or the PRESET bit to 1 in the PCFG2 Register. An 82091AA hard reset (RSTDRV asserted) also brings the part out of direct powerdown.

11.0 ELECTRICAL CHARACTERISTICS

11.1 Absolute Maximum Ratings

Storage Temperature ..........
-65°C to +150°C
Supply Voltage .................. -0.5V to +8.0V
Voltage on Any Input ............ GND - 2V to 6.5V
Voltage on Any Output ........... GND - 0.5V to VCC + 0.5V
Power Dissipation ............... 1W

NOTICE: This data sheet contains information on products in the sampling and initial production phases of development. The specifications are subject to change without notice. Verify with your local Intel Sales office that you have the latest data sheet before finalizing a design.

WARNING: Stressing the device beyond the "Absolute Maximum Ratings" may cause permanent damage. These are stress ratings only. Operation beyond the "Operating Conditions" is not recommended and extended exposure beyond the "Operating Conditions" may affect device reliability.

11.2 DC Characteristics

Table 38. DC Specifications (VCC = 5V ± 10%, Tamb = 0°C to 70°C)

                                                    VCC = +5V ±10%                      VCC = 3.3V ±0.3V
Symbol  Parameter                                   Min(V)  Max(V)          Notes       Min(V)  Max(V)          Notes
VILC    Input Low Voltage, X1                       -0.5    0.8                         -0.3    0.8
VIHC    Input High Voltage, X1                      3.9     VCC + 0.5                   2.4     VCC + 0.3
VIL     Input Low Voltage (all pins except X1)      -0.5    0.8                         -0.3    0.8
VIH     Input High Voltage (all pins except X1)     2.0     VCC + 0.5                   2.0     VCC + 0.3
ICC     VCC Supply Current - 1 Mbps FDC Data Rate,          50 mA           1, 2                40 mA           1, 2
        VIL = 0.45V, VIH = 2.4V
ICCSB   ICC in Powerdown                                    100 µA          3, 4, 5             100 µA          3, 4, 5
IIL     Input Load Current (all input pins)                 +10 µA, -10 µA  6                   +10 µA, -10 µA  6
IOFL    Data Bus Output Float Leakage                       +10 µA, -10 µA  7                   +10 µA, -10 µA  8
ISPL    Parallel Port Back-Power Leakage                    +10 µA          9                   +10 µA          9
        (All Parallel Port Signals)

NOTES:
1. Test Conditions: Only the data bus inputs may float. All outputs are open.
2. Test Conditions: Tested while reading a sync field of "00". Outputs not connected to DC loads. This specification reflects the supply current when all modules within the 82091AA are active.
3. Test Conditions: VIL = VSS, VIH = VCC; Outputs not connected to DC loads.
4. Test Conditions: Typical value with the oscillator off.
5. Test Conditions: All 82091AA modules are in their powerdown state.
6. Test Conditions: 10 µA (VIN = VCC), -10 µA (VIN = 0V)
7. Test Conditions: 0V ...

[Timing diagram: DIR, STEP, INDEX, HDSEL and WE waveforms]

NOTE: For overlapped seeks, only one step pulse per drive selection is issued. Non-overlapped seeks will issue all programmed step pulses. Invert high.

13.3 Internal PLL

[Timing diagram: internal PLL]

NOTE: Invert high.

APPENDIX A
FDC FOUR-DRIVE SUPPORT

Section 8.0 of this document completely describes the FDC when the module is configured for two drive support. In addition, the FDC commands in Section 8.0 provide four drive support information. This appendix provides additional information concerning four drive support. The signal pins that are affected by four drive support are described in Section A.1. Note that the FDC signals not discussed in this appendix operate the same for both two and four drive systems.

The following registers are described in this appendix: Digital Output Register (DOR), Enhanced Tape Drive Register (TDR), and the Main Status Register (MSR). Some bits in these registers operate differently in a four drive configuration than in a two drive configuration.

NOTES:
• The descriptions in this appendix assume that four floppy drive support has been selected by setting FDDQTY to 1 in the AIPCFG1 Register.
• Only drive 0 or drive 1 can be selected as the boot drive.

A.1 Floppy Disk Controller Interface Signals

These signal descriptions are for a four drive system (FDDQTY = 1 in the AIPCFG1 Register). See Section 2.0 for two drive system signal descriptions.

Signal Name        Type  Description
FDME1#/DSEN#(1)    O     FLOPPY DRIVE MOTOR ENABLE 1, or DRIVE SELECT ENABLE: In a four drive system, this signal functions as a drive select enable (DSEN#). When DSEN# is asserted, MDS1 and MDS0 reflect the selection of the drive.
FDS1#/MDS1(1)      O     FLOPPY DRIVE SELECT 1, or MOTOR DRIVE SELECT 1: In a four drive system, this signal functions as a motor drive select (MDS1). MDS1, together with MDS0, indicates which of the four drives is selected, as shown in note 1.
FDME0#/MEEN#(1)    O     FLOPPY DRIVE MOTOR ENABLE 0, or MOTOR ENABLE ENABLE: In a four drive system, this signal functions as a motor enable enable (MEEN#). MEEN# is asserted to enable the external decoding of MDS1 and MDS0 for the appropriate motor enable (see note 1).
FDS0#/MDS0(1)      O     FLOPPY DRIVE SELECT 0, or MOTOR DRIVE SELECT 0: In a four drive system, this signal functions as a motor drive select (MDS0). MDS0, together with MDS1, indicates which of the four drives is selected, as shown in note 1.

NOTE:
1. These signal pins are used to control an external decoder for four floppy disk drives as shown below. Refer to the DOR Register Description in Section A.2 for details.

MDS1  MDS0  DSEN# = 0  MEEN# = 0
0     0     Drive 0    ME0
0     1     Drive 1    ME1
1     0     Drive 2    ME2
1     1     Drive 3    ME3

A.2 DOR-Digital Output Register

I/O Address:    Base + 2h
Default Value:  00h
Attribute:      Read/Write
Size:           8 bits

The Digital Output Register enables/disables the floppy disk drive motors, selects the disk drives, enables/disables DMA, and provides an FDC module reset. The DOR reset bit and the Motor Enable bits have to be inactive when the 82091AA's FDC is in powerdown. The DMAGATE# and Drive Select bits are unchanged. During powerdown, writing to the DOR does not wake up the 82091AA's FDC, except for activating any of the motor enable bits. Setting the motor enable bits to 1 will wake up the module.

The four internal drive select and four internal motor enable signals are encoded to a total of four output pins as described in Table 47. Figure 99 shows an example of how these four output pins can be decoded to provide four drive select and four motor enable signals. Note that only drive 0 or drive 1 can be used as the boot drive when four disk drives are enabled.
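The external decoding described in note 1 of Section A.1 can be modeled in a few lines. The Python sketch below is illustrative only: the function name and the use of None for a negated enable are assumptions, not part of the data sheet. It treats MDS1 and MDS0 as a two-bit drive number that is gated separately by DSEN# (drive select) and MEEN# (motor enable), both active low.

```python
def decode_drive_pins(mds1, mds0, dsen_n, meen_n):
    """Model the two external 2-to-4 decoders of a four drive system.

    mds1, mds0 : levels on the MDS1 and MDS0 pins (0 or 1)
    dsen_n     : level on DSEN# (0 = asserted)
    meen_n     : level on MEEN# (0 = asserted)

    Returns (selected_drive, motor_drive). Each entry is a drive
    number 0..3, or None when the corresponding enable is negated.
    """
    drive = (mds1 << 1) | mds0
    selected_drive = drive if dsen_n == 0 else None
    motor_drive = drive if meen_n == 0 else None
    return selected_drive, motor_drive


if __name__ == "__main__":
    # Drive 2 selected with its motor running.
    print(decode_drive_pins(1, 0, 0, 0))
    # Motor of drive 1 running while no drive is selected.
    print(decode_drive_pins(0, 1, 1, 0))
```

Because both decoders share MDS1 and MDS0, only one drive number can be presented at a time; the two enables decide whether that number drives a select line, a motor line, or both.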
[Figure 98. Digital Output Register: bit layout]

Bit  Description
7    Motor Enable 3 (ME3): This bit controls a motor drive enable output signal and provides the signal output for the floppy drive 3 motor (via external decoding) as shown in Table 46.
6    Motor Enable 2 (ME2): This bit controls a motor drive enable output signal and provides the signal output for the floppy drive 2 motor (via external decoding) as shown in Table 46.
5    Motor Enable 1 (ME1): This bit controls a motor drive enable signal and provides the signal output for the floppy drive 1 motor (via external decoding) as shown in Table 46.
4    Motor Enable 0 (ME0): This bit controls a motor drive enable signal and provides the signal output for the floppy drive 0 motor (via external decoding) as shown in Table 46.
3    DMA Gate (DMAGATE): This bit enables/disables DMA for the FDC. When DMAGATE = 1, DMA for the FDC is enabled. In this mode FDDREQ, TC, IRQ6, and FDDACK# are enabled. When DMAGATE = 0, DMA for the FDC is disabled. In this mode, the IRQ6 and DRQ outputs are tri-stated and the DACK# and TC inputs are disabled to the FDC. Note that the TC input is only disabled to the FDC module. Other functional units in the 82091AA (e.g., parallel port or IDE interface) can still use the TC input signal for DMA activities.
2    FDC Reset (DORRST): DORRST is a software reset for the FDC module. When DORRST is set to 0, the basic core of the 82091AA's FDC and the FIFO circuits are cleared, conditioned by the LOCK bit in the Configure Command. This bit is set to 0 by software or a hard reset (RSTDRV asserted). The FDC remains in a reset state until software sets this bit to 1. This bit does not affect the DSR, CCR and other bits of the DOR. DORRST must be held active for at least 0.5 µs at 250 Kbps. This is less than a typical ISA I/O cycle time. Thus, in most systems consecutive writes to this register to toggle this bit allow sufficient time to reset the FDC.
1:0  Drive Select (DS[1:0]): This field provides the output signals to select a particular floppy drive (via external decoding) as shown in Table 47. Note that the drive motor can be enabled separately without selecting the drive. This permits the motor to come up to speed before selecting the drive. Note also that only one drive can be selected at a time. However, the drive should not be selected without enabling the appropriate drive motor via bits[7:4] of this register.

Table 46. Output Pin Status for Four Disk Drives

                       FDC DOR Register Bits               Signal Pins
Description            ME3  ME2  ME1  ME0  DS1  DS0        MDS1#  MDS0#  DSEN#  MEEN#
ME0 and DS0 enable     X    X    X    1    0    0          0      0      0      0
ME1 and DS1 enable     X    X    1    X    0    1          0      1      0      0
ME2 and DS2 enable     X    1    X    X    1    0          1      0      0      0
ME3 and DS3 enable     1    X    X    X    1    1          1      1      0      0
ME0 enable only        X    X    X    1    DS[1:0] ≠ 00    0      0      1      0
ME1 enable only        X    X    1    0    DS[1:0] ≠ 01    0      1      1      0
ME2 enable only        X    1    0    0    DS[1:0] ≠ 10    1      0      1      0
ME3 enable only        1    0    0    0    DS[1:0] ≠ 11    1      1      1      0
No ME or DS enable     0    0    0    0    X    X          1      1      1      1

NOTE: To enable a particular drive motor and select the drive, the value for DS[1:0] must match the appropriate motor enable bit selected as indicated in the first four rows of the table. For example, to enable the drive 0 motor and select the drive, ME0 is set to 1 and DS[1:0] must be set to 00. To enable the drive motor and keep the drive de-selected, the value for DS[1:0] must not match the particular motor enable as shown in the first four rows. For example, to enable the motor for drive 0 while the drive remains de-selected, ME0 is set to 1 and DS[1:0] is set to 01, 10, or 11.
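The mapping in Table 46 from DOR bits to the four output pins can be sketched as a small function. This is a minimal Python model under stated assumptions, not the device's actual logic: the function name is hypothetical, and the priority rules (a motor enable that matches DS[1:0] wins; otherwise the lowest-numbered enabled motor is presented) are an interpretation of the X and 0 entries in the table.

```python
def dor_to_pins(dor):
    """Return (MDS1, MDS0, DSEN#, MEEN#) for an 8-bit DOR value.

    Bit layout taken from the DOR description:
      bits 7:4  ME3..ME0 (motor enables)
      bit  3    DMAGATE  (ignored here)
      bit  2    DORRST   (ignored here)
      bits 1:0  DS[1:0]  (drive select)
    """
    ds = dor & 0x03
    me = (dor >> 4) & 0x0F

    # Rows 1-4 of Table 46: the motor of the selected drive is enabled,
    # so both the drive select and the motor enable are asserted.
    if me & (1 << ds):
        return (ds >> 1, ds & 1, 0, 0)

    # Rows 5-8: a motor is enabled but the drive is not selected.
    # Assumption: the lowest-numbered enabled motor is presented.
    for drive in range(4):
        if me & (1 << drive):
            return (drive >> 1, drive & 1, 1, 0)

    # Last row: no motor enabled, all four pins high.
    return (1, 1, 1, 1)


if __name__ == "__main__":
    # Spin up drive 0 first (DS = 01, so drive 0 stays de-selected),
    # then select it once the motor is up to speed.
    for value in (0x1D, 0x1C):
        print(hex(value), dor_to_pins(value))
```

The two-step example mirrors the note on DS[1:0]: the motor can be enabled with a non-matching drive select so that it reaches speed before the drive is selected.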
[Figure 99. Example External Decoder (Four Drive System): two 2-to-4 decoders take MDS1# and MDS0# from the AIP as inputs; the decoder enabled by DSEN# drives DS0#-DS3#, and the decoder enabled by MEEN# drives ME0#-ME3#.]

A.3 TDR-Enhanced Tape Drive Register

I/O Address:    Base + 3h
Default Value:  00h
Attribute:      Read/Write
Size:           8 bits

This register allows the user to assign tape support to a particular drive during initialization. Any future reference to that drive number automatically invokes tape support. A hardware reset sets all bits in this register to 0, making drive 0 not available for tape support. A software reset via bit 2 of the DOR does not affect this register. Drive 0 is reserved for the floppy boot drive. Bits[7:2] are only available when EREG EN = 1; otherwise the bits are tri-stated.

[Figure 100. Enhanced Tape Drive Register: bit layout]

Bit  Description
7:3  Reserved
2    Boot Drive Select (BOOTSEL): The BOOTSEL bit is used to remap the drive selects and motor enables. The functionality is shown below:

     BOOTSEL  Mapping
     0        DS0 → FDS0, ME0 → FDME0 (default)
              DS1 → FDS1, ME1 → FDME1
     1        DS0 → FDS1, ME0 → FDME1
              DS1 → FDS0, ME1 → FDME0

     Only drive 0 or drive 1 can be selected as the boot drive.
1:0  Tape Select (TAPESEL[1:0]): These two bits are used by software to assign a logical drive number to be a tape drive. Other than adjusting precompensation delays for tape support, these two bits do not affect the FDC hardware. They can be written and read by software as an indication of the tape drive assignment. Drive 0 is not available as a tape drive and is reserved as the floppy disk boot drive.
The tape drive assignments are as follows:

     Bits[1:0]  Drive Selected
     00         None (all are floppy disk drives)
     01         1
     10         2
     11         3

A.4 MSR-Main Status Register

I/O Address:    Base + 4h
Default Value:  00h
Attribute:      Read Only
Size:           8 bits

This read only register provides FDC status information. This information is used by software to control the flow of data to and from the FIFO (accessed via the FDCFIFO Register). The MSR indicates when the FDC is ready to send or receive data through the FIFO. During non-DMA transfers, this register should be read before each byte is transferred to or from the FIFO.

After a hard or soft reset or recovery from a powerdown state, the MSR is available to be read by the host. The register value is 00h until the oscillator circuit has stabilized and the internal registers have been initialized. When the FDC is ready to receive a new command, MSR[7:0] = 80h. The worst case time allowed for the MSR to report 80h (i.e., RQM is set to 1) is 2.5 µs after a hard or soft reset.

The Main Status Register is used for controlling command input and result output for all commands. Some example values of the MSR are:

• MSR = 80H; The controller is ready to receive a command.
• MSR = 90H; Executing a command or waiting for the host to read status bytes (assume DMA mode).
• MSR = D0H; Waiting for the host to write status bytes.

[Figure 101. Main Status Register: bit layout]

Bit  Description
7    Request For Master (RQM): When RQM = 1, the FDC is ready to send/receive data through the FIFO (FDCFIFO Register). The FDC sets this bit to 0 after a byte transfer and then sets the bit to 1 when it is ready for the next byte.
During the non-DMA execution phase, RQM indicates the status of IRQ6.
6    Direction I/O (DIO): When RQM = 1, DIO indicates the direction of a data transfer. When DIO = 1, the FDC is requesting a read of the FDCFIFO. When DIO = 0, the FDC is requesting a write to the FDCFIFO.
5    NON-DMA (NONDMA): Non-DMA mode is selected via the SPECIFY Command. In this mode, the FDC sets this bit to 1 during the execution phase of a command. This bit is for polled data transfers and helps differentiate between the data transfer phase and the reading of result bytes.
4    Command Busy (CMDBUSY): CMDBUSY indicates when a command is in progress. When the first byte of the command phase is written, the FDC sets this bit to 1. CMDBUSY is set to 0 after the last byte of the result phase is read. If there is no result phase (e.g., SEEK or RECALIBRATE Commands), CMDBUSY is set to 0 after the last command byte is written.
3    Drive 3 Busy (DRV3BUSY): The FDC module sets this bit to 1 after the last byte of the command phase of a SEEK or RECALIBRATE Command is issued for drive 3. This bit is set to 0 after the host reads the first byte in the result phase of the SENSE INTERRUPT Command for this drive.
2    Drive 2 Busy (DRV2BUSY): The FDC module sets this bit to 1 after the last byte of the command phase of a SEEK or RECALIBRATE Command is issued for drive 2. This bit is set to 0 after the host reads the first byte in the result phase of the SENSE INTERRUPT Command for this drive.
1    Drive 1 Busy (DRV1BUSY): The FDC module sets this bit to 1 after the last byte of the command phase of a SEEK or RECALIBRATE Command is issued for drive 1. This bit is set to 0 after the host reads the first byte in the result phase of the SENSE INTERRUPT Command for this drive.
0    Drive 0 Busy (DRV0BUSY): The FDC module sets this bit to 1 after the last byte of the command phase of a SEEK or RECALIBRATE Command is issued for drive 0.
This bit is set to 0 after the host reads the first byte in the result phase of the SENSE INTERRUPT Command for this drive.

82078
CHMOS SINGLE-CHIP FLOPPY DISK CONTROLLER

• Small Footprint and Low Height Packages
• Supports Standard 5.0V as Well as Low Voltage 3.3V Platforms
  - Selectable 3.3V and 5.0V Configuration
  - 5.0V Tolerant Drive Interface
• Enhanced Power Management
  - Application Software Transparency
  - Programmable Powerdown Command
  - Save and Restore Commands for 0V Powerdown
  - Auto Powerdown and Wakeup Modes
  - Two External Power Management Pins
  - Consumes No Power While in Powerdown
• Programmable Internal Oscillator
• Floppy Drive Support Features
  - Drive Specification Command
  - Media ID Capability Provides Media Recognition
  - Drive ID Capability Allows the User to Recognize the Type of Drive
  - Selectable Boot Drive
  - Standard IBM and ISO Format Features
  - Format with Write Command for High Performance in Mass Floppy Duplication
• Integrated Host/Disk Interface Drivers
• Integrated Analog Data Separator
  - 250 Kbits/sec
  - 300 Kbits/sec
  - 500 Kbits/sec
  - 1 Mbits/sec
  - 2 Mbits/sec
• Integrated Tape Drive Support
  - Standard 1 Mbps/500 Kbps/250 Kbps Tape Drives
  - New 2 Mbps Tape Drive Mode
• Perpendicular Recording Support for 4 MB Drives
• Fully Decoded Drive Select and Motor Signals
• Programmable Write Precompensation Delays
• Addresses 256 Tracks Directly, Supports Unlimited Tracks
• 16 Byte FIFO
• Single-Chip Floppy Disk Controller Solution for Portables and Desktops
  - 100% PC-AT* Compatible
  - 100% PS/2* Compatible
  - 100% PS/2 Model 30 Compatible
  - Fully Compatible with Intel's 386SL Microprocessor SuperSet
  - Integrated Drive and Data Bus Buffers
• Available in 64 Pin QFP and 44 Pin QFP Package (See Package Specification Order Number 240800, Package Type S)

The 82078 Product Family brings a set of enhanced floppy disk controllers. These include several features that allow for easy implementation in both the portable and desktop market.
The current family includes a 64 pin and a 44 pin part in the smaller form factor QFP package. The 3.3V version of the 64 pin part provides an ideal solution for the rapidly emerging 3.3V platforms. It also allows for a 5.0V tolerant floppy drive interface that lets users retain their normal 5.0V drives. Another version of the 64 pin part provides support for 2 Mbps data rate tape drives.

*Other brands and names are the property of their respective owners.

February 1994    Order Number: 290468-003

Table 1-0. 64 Pin Part Versions
[Columns: 3.3V, 5.0V, 2 Mbps Data Rate. Rows: 82078SL, 82078-1. The X marks in the cells cannot be assigned to columns from the extracted text.]

The 44 pin part is targeted for platforms that are operated at 3.3V or 5.0V and do not require more than two drive support. The 82078-5 is designed for price sensitive 5.0V designs which do not include 4 MB drive support.

Table 2-0. 44 Pin Part Versions
[Columns: 3.3V, 5.0V, 1 Mbps Data Rate. Rows: 82078, 82078-5, 82078-3. The X marks in the cells cannot be assigned to columns from the extracted text.]

Both parts can be operated at 1 Mbps/500 Kbps/300 Kbps/250 Kbps. Additionally, one version of the 64 pin part provides 2 Mbps data rate operation specific for the new tape drives.

The 82078 is fabricated with Intel's advanced CHMOS III technology.

[Figures: 82078 pinout diagrams]

82078 44 PIN
CHMOS SINGLE-CHIP FLOPPY DISK CONTROLLER

• Small Footprint and Low Height Package
• Enhanced Power Management
  - Application Software Transparency
  - Programmable Powerdown Command
  - Save and Restore Commands for Zero-Volt Powerdown
  - Auto Powerdown and Wakeup Modes
  - Two External Power Management Pins
  - Consumes No Power While in Powerdown
• Integrated Analog Data Separator
  - 250 Kbps
  - 300 Kbps
  - 500 Kbps
  - 1 Mbps
• Programmable Internal Oscillator
• Floppy Drive Support Features
  - Drive Specification Command
  - Selectable Boot Drive
  - Standard IBM and ISO Format Features
  - Format with Write Command for High Performance in Mass Floppy Duplication
• Integrated Tape Drive Support
  - Standard 1 Mbps/500 Kbps/250 Kbps Tape Drives
• Perpendicular Recording Support for 4 MB Drives
• Integrated Host/Disk Interface Drivers
• Fully Decoded Drive Select and Motor Signals
• Programmable Write Precompensation Delays
• Addresses 256 Tracks Directly, Supports Unlimited Tracks
• 16 Byte FIFO
• Single-Chip Floppy Disk Controller Solution for Portables and Desktops
  - 100% PC/AT* Compatible
  - Fully Compatible with Intel386™ SL
  - Integrated Drive and Data Bus Buffers
• Separate 5.0V and 3.3V Versions of the 44 Pin Part are Available
• Available in a 44 Pin QFP Package

The 82078, a 24 MHz crystal, a resistor package, and a device chip select implement a complete solution. All programmable options default to 82078 compatible values. The dual PLL data separator has better performance than most board level/discrete PLL implementations. The FIFO allows better system performance in multi-master systems (e.g., Microchannel, EISA).

The 82078 maintains complete software compatibility with the 82077SL/82077AA/8272A floppy disk controllers. It contains programmable power management features while integrating all of the logic required for floppy disk control.
The power management features are transparent to any application software.

The 82078 is fabricated with Intel's advanced CHMOS III technology and is also available in a 64-lead QFP package.

*Other brands and names are the property of their respective owners.

Refer to the 1995 Peripheral Components Handbook for the complete data sheet on this device. The complete document for this product is available on Intel's "Data-on-Demand" CD-ROM product. Contact your local Intel field sales office, Intel technical distributor, or call 1-800-548-4725.

December 1994    Order Number: 290474-003

82078 64 PIN
CHMOS SINGLE-CHIP FLOPPY DISK CONTROLLER

• Small Footprint and Low Height Packages
• Supports Standard 5.0V as well as Low Voltage 3.3V Platforms
  - Selectable 3.3V and 5.0V Configuration
  - 5.0V Tolerant Drive Interface
• Enhanced Power Management
  - Application Software Transparency
  - Programmable Powerdown Command
  - Save and Restore Commands for Zero-Volt Powerdown
  - Auto Powerdown and Wakeup Modes
  - Two External Power Management Pins
  - Consumes no Power when in Powerdown
• Integrated Analog Data Separator
  - 250 Kbps
  - 300 Kbps
  - 500 Kbps
  - 1 Mbps
  - 2 Mbps
• Programmable Internal Oscillator
• Floppy Drive Support Features
  - Drive Specification Command
  - Media ID Capability Provides Media Recognition
  - Drive ID Capability Allows the User to Recognize the Type of Drive
  - Selectable Boot Drive
  - Standard IBM and ISO Format Features
  - Format with Write Command for High Performance in Mass Floppy Duplication
• Integrated Tape Drive Support
  - Standard 1 Mbps/500 Kbps/250 Kbps Tape Drives
  - New 2 Mbps Tape Drive Mode
• Perpendicular Recording Support for 4 MB Drives
• Integrated Host/Disk Interface Drivers
• Fully Decoded Drive Select and Motor Signals
• Programmable Write Precompensation Delays
• Addresses 256 Tracks Directly, Supports Unlimited Tracks
• 16 Byte FIFO
• Single-Chip Floppy Disk Controller Solution for Portables and Desktops
  - 100% PC AT* Compatible
  - 100% PS/2* Compatible
  - 100% PS/2 Model 30 Compatible
  - Fully Compatible with Intel386™ SL Microprocessor SuperSet
• Integrated Drive and Data Bus Buffers
• Available in 64 Pin QFP Package

The 82078, a 24 MHz crystal, a resistor package, and a device chip select implement a complete solution. All programmable options default to 82078 compatible values. The dual PLL data separator has better performance than most board level/discrete PLL implementations. The FIFO allows better system performance in multi-master systems (e.g., Microchannel, EISA).

The 82078 maintains complete software compatibility with the 82077SL/82077AA/8272A floppy disk controllers. It contains programmable power management features while integrating all of the logic required for floppy disk control. The power management features are transparent to any application software. There are two versions of 82078 floppy disk controllers, the 82078SL and 82078-1.

The 82078 is fabricated with Intel's advanced CHMOS III technology and is also available in a 44-lead QFP package.

*Other brands and names are the property of their respective owners.

Refer to the 1995 Peripheral Components Handbook for the complete data sheet on this device. The complete document for this product is available on Intel's "Data-on-Demand" CD-ROM product. Contact your local Intel field sales office, Intel technical distributor, or call 1-800-548-4725.

November 1994    Order Number: 290475-004

82077SL
CHMOS SINGLE-CHIP FLOPPY DISK CONTROLLER

• Completely Compatible with Industry Standard 82077AA
• Single-Chip Laptop Desktop Floppy Disk Controller Solution
  - 100% PC AT* Compatible
  - 100% PS/2* Compatible
  - 100% PS/2 Model 30 Compatible
  - Fully Compatible with Intel's 386SL Microprocessor SuperSet
  - Integrated Drive and Data Bus Buffers
• Power Management Features
  - Application Software Transparency
  - Programmable Powerdown Command
  - Auto Powerdown and Wakeup Modes
  - Two External Power Management Pins
  - Typical Power Consumption in Power Down: 10 µA
• High Speed Processor Interface
• Integrated Analog Data Separator
  - 250 Kbits/sec
  - 300 Kbits/sec
  - 500 Kbits/sec
  - 1 Mbits/sec
• Programmable Crystal Oscillator for On or Off
• Integrated Tape Drive Support
• Perpendicular Recording Support
• 12 mA Host Interface Drivers, 40 mA Disk Drivers
• Four Fully Decoded Drive Select and Motor Signals
• Programmable Write Precompensation Delays
• Addresses 256 Tracks Directly, Supports Unlimited Tracks
• 16 Byte FIFO
• 68-Pin PLCC (See Packaging Handbook Order Number #240800, Package Type N)

The 82077SL, a 24 MHz crystal, a resistor package, and a device chip select implement a complete laptop solution. All programmable options default to 82077AA compatible values. The dual PLL data separator has better performance than most board level/discrete PLL implementations. The FIFO allows better system performance in multi-master systems (e.g., Microchannel, EISA).

The 82077SL is a superset of the 82077AA. The 82077SL incorporates power management features while maintaining complete compatibility with the 82077AA/8272A floppy disk controllers. It contains programmable power management features while integrating all of the logic required for floppy disk control. The power management features are transparent to any application software.

The 82077SL is available in three versions: 82077SL-5, 82077SL and 82077SL-1.
82077SL-1 has all features listed in this data sheet. It supports both tape drives and 4 MB floppy drives. The 82077SL supports 4 MB floppy drives and is capable of operation at all data rates through 1 Mbps. The 82077SL-5 supports 500/300/250 Kbps data rates for high and low density floppy drives.

The 82077SL is fabricated with Intel's advanced CHMOS III technology and is available in a 68-lead PLCC (plastic) package.

[Figure 1. 82077SL Pinout]

*PS/2 and PC AT are trademarks of IBM.

Refer to the 1995 Peripheral Components Handbook for the complete data sheet on this device. The complete document for this product is available on Intel's "Data-on-Demand" CD-ROM product. Contact your local Intel field sales office, Intel technical distributor, or call 1-800-548-4725.

December 1994    Order Number: 290410-004

82595
ISA/PCMCIA HIGH INTEGRATION ETHERNET CONTROLLER

• Optimal Integration for Lowest Cost Solution
  - Glueless 8-Bit/16-Bit ISA/PCMCIA 2.0 Bus Interface
  - Provides Fully 802.3 Compliant AUI and TPE Serial Interface
  - Local DRAM Support up to 64 Kbytes
  - FLASH/EPROM Boot Support
  - Hardware and Software Portable between Motherboard, Adapter, and PCMCIA IO Card Solution
• High Performance Networking Functions
  - 16-Bit IO Accesses to Local DRAM with Zero Added Wait-States
  - Ring Buffer Structure for Continuous Frame Reception and Transmit Chaining
  - Automatic Retransmission on Collision
  - Automatically Corrects TPE Polarity Switching Problems
• Low Power CHMOS IV Technology
• Ease of Use
  - Design Time Reduced by High Integration
  - EEPROM Interface to Support Jumperless Design
  - Software Structures Optimized to Reduce Processing Steps
  - Automatically Maps into Unused PC IO Location to Help Eliminate LAN Setup Problems
  - All Software Structures Contained in One 16-Byte IO Space
  - Automatic or Manual Switching between TPE and AUI Ports
  - JTAG Port for Reduced Board Testing Times
• Power Management
  - SL Compatible SMOUT Power Down Input
  - Software Power Down Command for non-SL Systems
• 144-Lead tQFP Package Provides Smallest Available Form Factor (See Packaging Spec., Order No. 240800)

[Figure 1. 82595 Block Diagram: ISA/PCMCIA bus interface, CSMA/CD unit, local memory interface (DMA) with local DRAM I/F, TPE and AUI serial interfaces, LED control, SMOUT input, 20 MHz XTAL]

For the complete data sheet on this product, refer to the 1995 Networking handbook.

February 1994    Order Number: 290458-003

82593
CSMA/CD CORE LAN CONTROLLER

• Supports Industry Standard LANs
  - IEEE 10BASE5 (Ethernet*)
  - IEEE 10BASE2 (Cheapernet)
  - IEEE 10BASE-T (TPE)
• Simple, High-Performance Control and Data Interface
  - Control and Status via RD, WR, and CS Lines
  - Data Transfers via DMA Interface
  - Two Clocks per DMA Transfer
  - Programmable Bus Throttle Timer
• High-Performance Networking Functions
  - Automatic Retransmission from Internal FIFO
  - Back-to-Back Frame Reception with No CPU Intervention
  - Receive Ring Buffer Memory Structure
  - Transmit Frame Chaining
  - 96-Byte Transmit FIFO and 96-Byte Receive FIFO
• High Speed, 5V CHMOS IV (P648.8) Technology
• Serial Bit Rates up to 20 Mb/s (82593SX)
  - Direct Interface to Intel 82C501AD ESI or AMD 7992 SIA
  - Conforms to 802.3 CSMA/CD Standard
• On-Board Diagnostics
  - Internal and External Loopback Operation
  - Internal Register Dump
  - TDR Functionality
• 44-Lead PLCC Package Type N (82593SX), 44-Lead QFP Package Type S (82593SX) or 28-Pin PDIP
  - 82593SX (8/16-Bit) System Clock up to 20 MHz
  - 82593SX Package Pin Compatible with Intel 82592 PLCC
  (See Packaging Spec., Order No. 240800-001 Package Type N, S and P)

[Figure 1. 82593 Block Diagram: parallel subsystem (bus interface unit with bus throttle timer, control interface, two channel DMA interface, port 0 and port 1 command and status registers), FIFO subsystem (transmit and receive FIFOs, back-to-back receive and ring buffer logic, internal retransmit control), serial subsystem (CSMA/CD logic and configuration registers; RXD, RXC, TXD, TXC, CRS, CDT, RTS, CTS)]

*Ethernet is a registered trademark of Xerox Corporation.

For the complete data sheet on this product, refer to the 1995 Networking handbook.

October 1992    Order Number: 290411-004

82503
DUAL SERIAL TRANSCEIVER (DST)

82503 PRODUCT FEATURE SET OVERVIEW

• Single Component Ethernet* Interface to Both 802.3 10BASE-T and AUI
• Automatic or Manual Port Selection
• Manchester Encoder/Decoder and Clock Recovery
• No Glue Interface to Industry-Standard LAN Controllers
  - Intel 82586, 82590, 82593 and 82596
  - AMD 7990 (LANCE*)
  - National Semiconductor 8390 and 83932 (SONIC*)
  - Western Digital 83C690
  - Fujitsu 86950 (Etherstar*)
• Diagnostic Loopback
• Reset, Low Power Modes
• Network Status Indicators
• Defeatable Jabber Timer
• User Test Modes
• 10 MHz Transmit Clock Generator
• One Micron CHMOS** IV (Px48) Technology
• Single 5-V Supply

INTERFACE FEATURES

TPE
• Complies with 10BASE-T, IEEE Std. 802.3i-1990 for Twisted Pair Ethernet
• Selectable Polarity Switching
• Direct Interface to TPE Analog Filters
• On-Chip TPE Squelch
• Defeatable Link Integrity (LI)
• Support of Cable Lengths > 100m

AUI
• Complies with IEEE 802.3 AUI Standard
• Direct Interface to AUI Transformers
• On-Chip AUI Squelch

A block diagram of a typical application is shown in Figure 1. The 82503 Dual Serial Transceiver is a high-integration CMOS device designed to simplify interfacing industry standard Ethernet LAN Controllers to IEEE 802.3 local area network applications (10BASE5, 10BASE2, and 10BASE-T). The component supports both an attachment unit interface (AUI) and a Twisted Pair Ethernet interface (TPE).
It allows OEMs to design a state-of-the-art media interface that is jumperless and fully automatic. The 82503 includes on-chip AUI and TPE drivers and receivers; it offers designers a cost-effective, integrated solution for interfacing LAN controllers to the wire medium.

**CHMOS is a patented process of Intel Corporation.
*Ethernet is a registered trademark of Xerox Corporation. LANCE is a registered trademark of Advanced Micro Devices. Etherstar is a registered trademark of Fujitsu Electronics. Sonic is a registered trademark of National Semiconductor Corporation.

For the complete data sheet on this product, refer to the 1995 Networking handbook.

November 1992. Order Number: 290421-003

Ethernet* LAN Card Product Brief

Product Highlights

• Complete plug and play PCMCIA LAN solution. Comes with all drivers, installation, card and socket services and card management software necessary for reliable operation in a PCMCIA slot.
• Drivers for all major network operating systems.
• Industry-recognized Intel SoftSet installation software. Easy to operate and manage.
• Based on highly integrated Intel 82595 Ethernet Controller.
• Complies with PCMCIA 2.0/JEIDA 4.1 68-pin standard.
• 5 mm-thick PCMCIA Type II card.
• Detachable line adapter module (LAM) for multiple media attachment.
• Ethernet IEEE 802.3 compatibility (10BASE-T/TPE, 10BASE-2/BNC).
• Activity and link integrity LEDs.

The Intel PCMCIA Ethernet LAN Card brings high performance and ease of use to PCMCIA networking. It lets you put networking capabilities into laptop computers without a lot of hassle, headache or expense. It's a plug and play solution, providing PCMCIA network-readiness right out of the box.
All the software you need comes with the card: drivers for the most popular network operating systems (Novell NetWare* 2.2, 3.11, 4.0 and Lite 1.0; Microsoft* LAN Manager* 2.x; IBM* LAN Server* 2.x; Banyan Vines* 4.x; and Microsoft Windows for Workgroups* 3.1), PCMCIA-compliant card and socket services from SystemSoft*, and Intel's own card manager and SoftSet auto-configuration, auto-installation software that gets users on the network fast.

The Intel PCMCIA Ethernet LAN Card is based on the highly integrated Intel 82595 ISA/PCMCIA Ethernet Controller, giving you 16-bit desktop performance in a PCMCIA form factor. As we develop additional software functionality for our 82595 line, you'll be able to leverage it across your entire 82595-based product line, from chips to NICs to PCMCIA form factor products. You decrease your time to market by taking advantage of Intel's product development efforts, while increasing the value of your products.

The card is standard PCMCIA 2.0 68-pin form factor and only 5 mm thick. It has passed Intel's extensive quality and reliability testing to ensure that it stands up to the rigors of mobile users on the go. These include PCMCIA mechanical qualification testing such as torque, bend, shock, vibration and environmental testing across extreme temperatures and voltages.

Our CMOS technology and highly integrated 82595 are very power efficient, letting mobile users stay connected to the network for a long time. Maximum power draw is 85 mA; in idle mode, power consumption drops to 20 mA.

Finally, we've made it easy to customize our card and its accessories to your own OEM marketing needs. Manuals, software diskettes and the cards themselves can all be tailored to reflect your company's look.

Product Description

The Intel PCMCIA Ethernet LAN Card is the fastest, easiest way to deliver network-ready laptops to your customers.
It snaps in and installs in minutes, not hours, thanks to a disk full of software to make your job easier. Our industry-standard SoftSet installation utility automatically configures the card and sets up the software with a single command. Built-in card and socket services software provides card recognition and compatibility. Built-in card management software performs IRQ management and allows the card to be installed even when the system is running, so you don't have to reboot. There's even built-in driver support for popular network operating systems from Novell, Microsoft, IBM and Banyan.

There are no jumpers or switches to set, no IRQ addresses to labor over. The card is installed and working in five minutes. Once installed, it's easy to operate too, improving customer satisfaction and decreasing the number of support calls you receive.

The card is fully PCMCIA 2.0/JEIDA 4.1 compliant with a standard 68-pin form factor. It's also fully compliant with the Ethernet IEEE 802.3 standard for 10BASE-T and 10BASE-2 wiring. There's a detachable line adapter module (LAM) for attachment to multiple media, and LEDs to indicate activity and link integrity.

Features | Benefits
Installation, card manager, and card and socket services software | A complete solution; no need for any other pieces. Easy to install, easy to configure, easy to use.
Broad client driver support | Meets broad target market. Usable on industry-standard networks like Novell, LAN Manager and Banyan.
Complies with PCMCIA 2.0/JEIDA 4.1 standards | No jumpers. Easy to transport. Small form factor.
Intel-based 82595 | Great performance; 8 & 16-bit data path. Glueless interface to PCMCIA Bus.
Activity and link integrity LEDs | Indicates card status, improves diagnostics.
Supports both twisted pair Ethernet (10BASE-T) and ThinCoax (BNC 10BASE-2) | Flexibility to connect to multiple network media.

Product Codes

MBLA8110 | TPE, all geographies
MBLA8120US | BNC, US
MBLA8120EU | BNC, Europe

Additional Literature

Local Area Networking Family Product Brief | 297085-002
EtherExpress Family Brochure | 0413.01
TokenExpress Family Brochure | 0414.02
Intel 82595 Data Sheet | 290458-003
Intel 82595 User's Manual | TBD

*Other brands and names are the property of their respective owners.

Order Number: 297120-003. © Intel Corporation, 1993

DataFax 14.4 Card Product Brief

Product Highlights

• LAM-less design (Integrated DAA)
• Complies with PCMCIA 2.0 and JEIDA 4.1 standards
• PCMCIA Type II card, 5.0 mm thick
• Automatic power down mode
• DTMF and pulse dialing
• Fully compliant with CCITT T.30 and T.4 (Group 3 fax)
• Compliant with CCITT V.17 (14.4 Kbps fax)
• Provides both Send and Receive fax capability
• Includes Ring Detect notification to host computer in the power down mode
• 8-bit I/O bus interface
• FCC Class B, UL and DOC/UL (Canada)
• Compatible with EIA/TIA 602 (AT command set)
• Data modem complies with CCITT V.32bis, V.32, V.22bis, V.22, V.21, Bell 212A, 103
• Supports CCITT V.42 error detection and V.42bis data compression
• Provides MNP/5 data compression for backward compatibility
• Requires no external power
• MNP/10
• Hayes AutoSync

With the design of the DataFax 14.4, Intel introduces a high speed fax card with no external line adapter module. Lightweight. Easy to carry. And less chance of damage or loss. Not only does the DataFax 14.4 feature integrated DAA, but its high speed transmission helps your customers reduce the telephone expenses associated with faxes and modems.

The DataFax 14.4 fax card is also Group 3 compliant, assuring worldwide compatibility with most fax machines operating today. It sends and receives faxes and transfers files to or from notebook or hand-held computers over the public telephone network.

The Intel DataFax 14.4 also conforms to PCMCIA 2.0 and JEIDA 4.1 physical and electrical standards for portable computers. The size of a credit card, the DataFax 14.4 slides into an external slot in the notebook computer and connects easily into the public switched telephone network. The card features four power management modes (on-line, active, power save, and power down) to ensure the lowest possible power consumption. No external power is required.

Product Description

The DataFax 14.4 combines high speed with an integrated DAA design to provide optimal connectivity for notebooks and sub-notebooks. PCMCIA cards enhance the multiple functionality of the notebook computer. Intel's integrated DAA design incorporates the telephone interface circuitry directly onto the card. This includes a ring detector and telephone line coupling transformer. A six-foot cable connects the DataFax 14.4 card to a standard RJ-11 modular telephone jack for connectivity to the telephone network.

The DataFax 14.4 supports V.42 error correction, which ensures that errors caused by the phone system are automatically corrected. The DataFax 14.4 increases data throughput with V.42bis data compression. This detects redundant characters and character sequences, and uses fewer bits to send more frequently occurring sequences.
DataFax 14.4 users enjoy an effective throughput of up to 57,600 bits per second.

Unlike other card manufacturers, Intel provides a full-service program for card labeling, custom kitting with third party applications, and fulfillment. The Intel DataFax 14.4 is the only fax card designed with recessed covers in order to accept adhesive labels, front and back, ensuring quick turnaround for custom orders.

Features | Benefits
Integrated DAA | No external circuitry. Compact and lightweight. Easy to carry. Fits in briefcase with notebook computer.
Exchangeable with other cards | Single slot serves multiple functions.
CCITT V.32bis (14.4 Kbps) | Ensures connectivity worldwide. Faster transmission means less costly phone bills.
Supports V.42/V.42bis | Provides high throughput (57.6 Kbps).
MNP/10 | Enhanced data throughput with cellular connections (required phone adapter).
Hayes AutoSync | Provides synchronous communications capability (software required).

*Other brands and names are the property of their respective owners.

Order Number: 297311-002. © Intel Corporation, 1993

Faxmodem 24/96 Card Product Brief

Product Highlights

• Integrated DAA on the card for U.S./Canada
• Complies with PCMCIA 2.0 and JEIDA 4.1 standards
• PCMCIA Type II card, 5 mm thick
• Fully compliant with CCITT T.30 and T.4 (Group 3 fax)
• Provides both Send and Receive fax capability
• Fully compliant with EIA/TIA 578 (fax Class 1 command set)
• Data modem complies with CCITT V.22bis, V.22, V.21, Bell 212A and 103
• Supports CCITT V.42 error detection and V.42bis data compression
• Provides MNP5 data compression for backward compatibility
• Includes Ring Detect notification to host computer in the power down mode
• Multiple power conservation modes
• Requires no external power

The Faxmodem 24/96 enables you to send or receive faxes and transfer files to or from notebook computers over the public telephone network. It is the latest member of Intel's I/O card "plug and play" family.
The Intel Faxmodem 24/96 also conforms to PCMCIA 2.0 and JEIDA 4.1 physical and electrical standards for portable computers. The card's light weight and low power consumption combine to provide high-performance, low-cost connectivity for notebooks and sub-notebooks.

Approximately the size of a credit card, the Faxmodem 24/96 slides into an external slot in the notebook computer, connecting easily into the public switched telephone network. The card features four power management modes (on-line, active, power save, and power down) to ensure the lowest possible power consumption. No external power is required.

Product Description

The Faxmodem 24/96 card provides convenient, high-performance mobile fax and data communications capability for the notebook computer user. Adherence to accepted national and international standards provides the user with worldwide connectivity for both fax (approximately 30 million machines in use worldwide) and data transfer (V.22bis is the most widely used communications standard in the world). Its exchangeability with other PCMCIA cards enhances the multiple functionality of the notebook computer.

The Faxmodem 24/96 card contains the Intel 89C124FX integrated data-fax modem chipset, a UART, a microcontroller, an analog front-end and other supporting devices. The 16450-type UART provides an 8-bit data bus interface. This, together with the Class 1 AT command set support, provides compatibility with all the major fax application software.

For U.S. and Canada, the DAA is integrated on the card. For the rest of the world, the detachable line adapter module incorporates country-specific telephone interface circuitry. This consists of the ring detector and telephone line coupling transformer. A standard RJ-11 modular telephone jack provides connectivity to the telephone network. A short cable connects the line adapter module to the Faxmodem 24/96 card.
The Faxmodem 24/96 supports V.42 error correction, which ensures that errors caused by the phone system are automatically corrected. The Faxmodem 24/96 also supports V.42bis data compression. This increases data throughput by detecting redundant characters and character sequences, and using fewer bits to send more frequently occurring sequences. Throughput for file transfer operations is increased by as much as 400 percent, providing the user with an effective 9600 bits per second.

Features | Benefits
Integrated DAA | No external circuitry. Compact and lightweight. Easy to carry. Fits in briefcase with notebook computer.
Exchangeable with other cards | Single slot serves multiple functions.
Group 3 fax | Compatible with worldwide installed base of fax machines.
Class 1 fax command set | Compatible with standard communications packages.
CCITT V.22bis, V.22, V.21, Bell 212 and 103 | Ensures connectivity worldwide.
Supports V.42/V.42bis | Faster; up to 4 to 1 data compression providing an effective throughput of 9600 bits per second for file transfers.
Modem supports AT command set | Compatible with standard communications packages.
Multiple power conservation modes | Prolong system battery life.
Factory Configuration Option for Cellular Network | Ease of use.

*Other brands and names are the property of their respective owners.

Order Number: 297)90·()()t. © Intel Corporation, 1993

82489DX ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER

82489DX FEATURES OVERVIEW

• Advanced Interrupt Controller for 32-Bit Operating Systems
• Solution for Multiprocessor Interrupt Management
• Dynamic Interrupt Distribution for Load Balancing in MP Systems
• Separate Nibble Bus (Interrupt Controller Communications (ICC) Bus) for Interrupt Messages
• Inter-Processor Interrupts
• Various Addressing Schemes: Broadcast, Fixed, Lowest Priority, etc.
• Compatibility Mode with 8259A
• 32-Bit Internal Registers
• Integrated Timer Support
• 33 MHz Operation
• 132-Lead PQFP Package, Package Type KU (See Packaging Specification, Order Number: 240800)

[82489DX Block Diagram (290446-1): local unit, I/O unit, system decode and control unit, 32-bit timer, ICC bus, JTAG compatible boundary scan TAP controller]

Refer to Application Note AP-388: 82489DX User's Manual (Order Number 292116) when evaluating your design needs.

October 1993. Order Number: 290446-002

82489DX Advanced Programmable Interrupt Controller

CONTENTS

1.0 INTRODUCTION
2.0 FUNCTIONAL OVERVIEW
  ICC Bus
  Local Unit
  I/O Unit
  Timer
3.0 PIN DESCRIPTION
4.0 FUNCTIONAL DESCRIPTION
  I/O Unit
  Local Unit
5.0 INTERRUPT CONTROL MECHANISM
  5.1 Interrupts
    Total Allowed Interrupt Vectors
    Interrupt Sources
    Interrupt Destinations
    Interrupt Delivery
  5.2 Interrupt Redirection
    Inter-82489DX Communication
6.0 82489DX LOCAL UNIT REGISTERS DESCRIPTION
  6.1 Local Unit ID Register
    82489DX Local Unit ID Register
  6.2 Destination Format Register
  6.3 Local Interrupt Vector Table Registers
    Local Interrupts 0,1 Interrupt Vectors
  6.4 Inter-Processor Interrupt Registers
    Interrupt Command Register [31:0]
    Interrupt Command Register [63:32]
  6.5 IRR, ISR, TMR Registers
    Interrupt Acceptance
    Acceptance Mechanism
  6.6 Tracking Processor Priority
    Task Priority Register
  6.7 Dispensing Interrupts
    Dispensing Interrupts to the Local Processor
  6.8 Spurious Interrupt Vector Register
    Spurious Interrupt
    Unit Enable
  6.9 End-Of-Interrupt (EOI) Register
  6.10 Remote Read Register
  6.11 82489DX Local Configuration
    Local Version Register
  6.12 82489DX Timer Registers
    Overview
    Time Base
    Timer
    Timer Vector Table
7.0 82489DX I/O UNIT REGISTERS
  Registers Addressing Scheme
  82489DX I/O Unit Configuration
    I/O Unit ID Register
    I/O Unit Version Register
  I/O Unit Interrupt Source Registers
    Redirection Tables
  Descriptions
  Destination
8.0 ICC BUS DEFINITION
  Physical Characteristics
  Bus Arbitration
  Lowest-Priority Arbitration
  ICC Bus Message Formats
  Long Message Format
9.0 HARDWARE TIMINGS
  Interfacing to the ICC Bus
  First Order Buffer Models
  MBO Pull-Up Register
  Driving Lumped Capacitance
  Driving Transmission Lines
  External Drivers/Buffered ICC Bus
  Transmission Line Termination
  ICC Bus Operating Frequency
  9.1 82489DX Register Access Timing
    Timing Diagram Notation
    Register WRITE Timing
    Register READ Timing
    Interrupt Acknowledge Timing
    Reset and Miscellaneous Timing
10.0 BOUNDARY SCAN DESCRIPTION
  10.1 Boundary Scan Architecture
    Test Access Ports
    TAP Controller
    Test-Logic-Reset
    Run-Test/Idle
    Select-DR-Scan
    Select-IR-Scan
    Capture-DR
    Shift-DR
    Exit 1-DR
    Pause-DR
    Exit 2-DR
    Update-DR
    Capture-IR
    Shift-IR
    Exit 1-IR
    Pause-IR
    Exit 2-IR
    Update-IR
    Instruction Register
    Bypass Instruction
    Extest Instruction
    Sample/Preload Instruction
    Idcode Instruction
  Device Identification Register (DID)
  Boundary Scan Register
  Boundary Scan Cell Names in Order from tdi to tdo
  Bypass Register
  JTAG TAP Controller Initialization
11.0 ELECTRICAL CHARACTERISTICS
  11.1 D.C. Specifications
  11.2 A.C. Specifications
12.0 REGISTER SUMMARY
  I/O Unit Registers
  Local Unit Registers
13.0 TIMING DIAGRAMS
14.0 PACKAGE PIN-OUT
15.0 PACKAGE THERMAL SPECIFICATION
16.0 GUIDELINES FOR 82489DX USERS
  16.1 Initialization
  16.2 Compatibility
    Compatibility Levels
    82489DX/8259A Interaction
    82489DX/8259A Dual Mode Connection
  16.3 Hardware Guidelines
    82489DX Hardware State on Reset
    Pull Up and Pull Down Resistors
    Pint and ExtINTA Timings
    ExtINTA Timings
    82489DX and Memory Mapping
    JTAG Circuit Considerations
  16.4 Programming Guidelines
    Unique ID Requirement
    Atomic Write Read to Task Priority Register
    Critical Regions and Mutual Exclusion
    Interrupt Command Register Programming Sequence
    Interrupt Vector
    Local and I/O Unit
    ICR (Interrupt Command Register)
    ISR/IRR/TMR
    Focus Processor
    ExtINT Interrupt Posting
    Synchronizing Arb IDs
    Lowest Priority
    Disabling Local Unit
    Issuing EOI
    External Interrupts and EOI
    Spurious Interrupts and EOI
    Task Priority Register
    ExtINT Interrupt and Task Priority
    NMI and EOI
    Removing Masks
    Delivery Mode and Trigger Mode
    Assigning Interrupt Vectors
    Sending Inter-Processor Interrupts
    Delay with Level Triggered Interrupts
    Reset Deassert
    Changing Redirection Tables
    Interrupt Masking
    Device Drivers with 82489DX
SYSTEM HARDWARE AND SOFTWARE DESIGN CONSIDERATIONS
DIRECTIONS FOR EASY MIGRATION TO FUTURE INTEGRATED APIC

1.0 INTRODUCTION

The 82489DX Advanced Programmable Interrupt Controller provides multiprocessor interrupt management, providing both static and dynamic symmetrical interrupt distribution across all processors.

The main function of the 82489DX is to provide interrupt management across all processors. This dynamic interrupt distribution includes routing of the interrupt to the lowest-priority processor. The 82489DX works in systems with multiple I/O subsystems, where each subsystem can have its own set of interrupts. This chip also provides inter-processor interrupts, allowing any processor to interrupt any processor or set of processors. Each 82489DX I/O Unit Interrupt Input pin is individually programmable by software as either edge or level triggered. The interrupt vector and interrupt steering information can be specified per pin. A 32-bit wide timer is provided that can be programmed to interrupt the local processor. The timer can be used as a counter to provide a time base to software running on the processor, or to generate time slice interrupts locally to that processor. The 82489DX provides 32-bit software access to its internal registers. Since no 82489DX register reads have any side effects, the 82489DX registers can be aliased to a user read-only page for fast user access (e.g., performance monitoring timers).
The 82489DX supports a generalized naming/addressing scheme that can be tailored by software to fit a variety of system architectures and usage models. It also supports 8259A compatibility by becoming virtually transparent with regard to an externally connected 8259A style controller, making the 8259A visible to software.

[Figure 1. 82489DX Architecture (290446-2): processor data/address bus interface, interrupt management logic, Local Unit (task priority register, IRR/ISR/TMR bits, acceptance logic, interrupt command register, local unit ID register, timer, LINT 0 and LINT 1 inputs), I/O Unit (16-entry redirection table, I/O select register, I/O unit ID register), ICC bus message interface]

2.0 FUNCTIONAL OVERVIEW

82489DX Functional Blocks

The 82489DX contains one Local Unit, one I/O Unit and a timer. The ICC bus is used to pass interrupt messages.

ICC BUS

The ICC bus is a 5-wire synchronous bus connecting all 82489DXs (all I/O Units and all Local Units). The Local Units and I/O Units communicate over this ICC bus. Four of these five wires are used for data transmissions and arbitration, and one wire is a clock.

LOCAL UNIT

The Local Unit contains the necessary intelligence to determine whether or not its processor should accept interrupt messages sent on the ICC bus by other Local Units and I/O Units. The Local Unit also provides local pending of interrupts, nesting and masking of interrupts, and handles all interactions with its local processor such as the INT/INTA/EOI protocol. The Local Unit further provides inter-processor interrupt functionality and a timer to its local processor.
The interface of a processor to its 82489DX Local Unit is identical for every processor.

I/O UNIT

The I/O Unit provides the interrupt input pins on which I/O devices inject interrupts into the system in the form of an edge or a level. The I/O Unit also contains a Redirection Table for the interrupt input pins. Each entry in the Redirection Table can be individually programmed to indicate whether an interrupt on the pin is recognized as either an edge or a level; what vector and also what priority the interrupt has; and which of all possible processors should service the interrupt and how to select that processor (statically or dynamically). The information in the table is used to send interrupt messages to all 82489DX units via the ICC bus.

TIMER

The 82489DX provides a 32-bit wide timer that can be programmed to interrupt the local processor. The timer can be used as a counter to provide a timebase to software running on the processor, or to generate time-slice interrupts local to that processor.

3.0 PIN DESCRIPTION

The 82489DX pin description is organized in a small number of functional groups. Pin definitions and protocols have been designed to minimize interface issues. In particular, they support the notion of independently controlled address and data phases. The primary host interface is synchronous in nature.

In the following pin definition table, if the signal name has an overbar over it, the signal is in its active state when it has a low level. The signal direction column identifies output-only signals as continuous drive (O), tri-state (T/S), or open drain (O/D). All bi-directional (BI-D) signals have tri-stating outputs.

82489DX Pin Definition Table

Symbol | Pin No. | Type | Function

SYSTEM PINS
RESET | 65 | I | The RESET INPUT forces the 82489DX to enter its initial state. The 82489DX Local Unit in turn asserts its PRST (Processor Reset) output. All tri-state outputs remain in high impedance until explicitly enabled.
ExtINTA | 41 | O | The EXTERNAL INTERRUPT ACKNOWLEDGE output is asserted (high) when an external interrupt controller (e.g., 8259) is expected to respond to the current INTA cycle. If deasserted (low), the 82489DX will respond, and the INTA cycle must not be delivered to the external controller.
CLKIN | 57 | I | CLOCK INPUT provides reference timing for most of the bus signals.
TRST | 56 | I | TEST RESET is the JTAG compatible boundary scan TAP controller reset pin. A weak pull-up keeps the pin high if not driven.
TCK | 55 | I | TEST CLOCK is the clock input for the JTAG compatible boundary scan controller and latches.
TDI | 53 | I | TEST DATA INPUT is the test data input pin for the JTAG compatible boundary scan chain and TAP controller. A weak pull-up keeps this pin high if not driven.
TDO | 52 | O | TEST DATA OUTPUT is the test data output for the JTAG compatible boundary scan chain.
TMS | 54 | I | TEST MODE SELECT is the test mode select pin for the JTAG boundary scan TAP controller. A weak pull-up keeps this pin high if not driven.

TIMER PIN
TMBASE | 59 | I | The TIME BASE input provides a standard frequency that is used only by the 82489DX timer and that is independent of the system clock.

INTERRUPT PINS
INTIN[15:0] | 82-97 | I | These 16 INTERRUPT INPUT pins accept edge or level sensitive interrupt requests from I/O or other devices. Pin numbers are assigned in order: INTIN15 corresponds to pin 82, INTIN14 to pin 83, and so on, down to INTIN0 at pin 97. These pins are active high.
LINTIN[1], LINTIN[0] | 80, 81 | I | Two LOCAL INTERRUPT INPUT pins accept edge or level sensitive interrupt requests that can only be delivered to the connected processor. These pins are active high.

REGISTER ACCESS PINS
ADS | 64 | I | ADDRESS STROBE signal indicating the start of a bus cycle. The 82489DX does not commit to start the cycle internally until BUS GRANT is detected active.
M/IO, D/C, W/R | 63, 61, 62 | I | Bus cycle definition signals. Note that since the 82489DX registers can be mapped in either memory or I/O space, the M/IO pin is not used for register access cycles; it is only used to decode interrupt acknowledge cycles. The 82489DX does not respond to code read cycles.
BGT | 66 | I | The BUS GRANT input is optional and is used to indicate the address phase of a bus cycle in configurations where address timing cannot be inferred from ADS. This signal is really used as an address latch enable, but is named as it is to indicate that it can normally be connected to the Intel Cache Controller generated signal of the same name. Must be tied low if not used.
CS | 74 | I | The CHIP SELECT input indicates that the 82489DX registers are being addressed.
A3, A4, A5, A6, A7, A8, A9, A10 | 31, 29, 28, 27, 26, 24, 22, 21 | BI-O | The address pins are used as inputs in addressing internal register space. Output function is reserved. They are also used to latch the Local Unit ID on reset.
DLE | 73 | I | DATA LATCH ENABLE is optional and is used to indicate committing the data phase of a bus cycle in configurations where data timing cannot be inferred from other cycle timings. Must be tied low if not used.
D31 to D11 | 105, 107, 109, 110, 111, 112, 114, 115, 116, 118, 119, 121, 122, 123, 124, 125, 128, 129, 130, 131, 2 | BI-O | The DATA BUS is for all register accesses and interrupt vectoring.
D10 to D0 | 3, 4, 7, 8, 9, 11, 12, 13, 14, 16, 18 | BI-O | (Data bus, continued.)
DP3, DP2, DP1, DP0 | 101, 102, 103, 104 | BI-O | One Data Parity pin for each byte on the data bus. EVEN parity is generated any time the data bus is driven by the 82489DX.
RDY | 43 | O | READY output indicates that the current bus cycle is complete. In the case of a read cycle, valid data and the return to inactive state after going active low may be delayed until DLE goes active.

PROCESSOR PINS
PINT | 35 | T/S | The PROCESSOR INTERRUPT OUTPUT indicates to the processor that one or more maskable interrupts are pending. This pin is tri-stated at reset, and has an internal pull-down resistor to prevent false signaling to the processor until the 82489DX Local Unit is enabled and this pin is actively driven.
PRST | 38 | O | The PROCESSOR RESET OUTPUT is asserted/de-asserted upon 82489DX reset, and also in response to ICC bus messages with "RESET" delivery mode. This pin should be used with care.
PNMI | 37 | T/S | The NON-MASKABLE INTERRUPT output is signaled in response to ICC bus messages with "NMI" delivery mode. This pin is tri-stated at reset, and has an internal pull-down resistor to prevent false signaling to the processor until the Local Unit is enabled and this pin is actively driven.

ICC BUS PINS
ICLK | 60 | I | The ICC BUS CLOCK input provides synchronous operation of the ICC bus.
MBI[3:0] | 76-79 | I | The four ICC BUS IN inputs are used for incoming ICC bus messages. In smaller configurations the ICC bus inputs and outputs may be tied directly together at the pins. MBI3 is pin 76, MBI2 is pin 77, MBI1 is pin 78 and MBI0 is pin 79.
MBO3, MBO2, MBO1, MBO0 | 45, 48, 49, 51 | O/D | The four ICC BUS OUT outputs are used for outgoing ICC bus messages. The current capacity is only 4 mA, so external buffers will be needed.

RESERVED PINS
Reserved | 34, 42 | NC | These pins MUST BE LEFT OPEN.
Reserved | 70, 72, 75 | | Reserved by Intel. These pins should be strapped to VCC.
Reserved | 71, 19, 20 | | Reserved by Intel. These pins should be strapped to GND.

POWER AND GROUND PINS
VCC | 1, 32, 69, 98 | POWER | Nominally +5V.
These pins along with VSS and VSSI should be separately bypassed.
VCCP | 6, 15, 25, 100, 108, 117, 126 | POWER | Nominally +5V. These pins along with VSSP should be separately bypassed.
VCCPO | 39, 46 | POWER | Nominally +5V. These pins along with VSSPO should be separately bypassed.
VSS | 5, 33, 67, 68, 99 | GND | Nominally 0V. These pins along with VCC should be separately bypassed.
VSSP | 10, 17, 23, 30, 106, 113, 120, 127, 132 | GND | Nominally 0V. These pins along with VCCP should be separately bypassed.
VSSPO | 36, 40, 44, 47, 50 | GND | Nominally 0V. These pins along with VCCPO should be separately bypassed.
VSSI | 58 | GND | Nominally 0V. These pins along with VCC should be separately bypassed.

NOTE: VCC, VCCP and VCCPO should be of the same voltage. VSS, VSSP, VSSPO and VSSI should be 0V.

4.0 FUNCTIONAL DESCRIPTION

As far as interrupt management is concerned, the 82489DX's interrupt control function spans two functional units: the I/O Unit, of which there is one per I/O subsystem, and the Local Unit, of which there is one per processor. The 82489DX has one I/O Unit and one Local Unit in a single package. This section takes a detailed look at both the Local and I/O Units.

I/O Unit

The I/O Unit consists of a set of Interrupt Input pins, an Interrupt Redirection Table, and a message unit for sending and receiving messages on the ICC bus. The I/O Unit is where I/O devices inject their interrupts. The I/O Unit selects the corresponding entry in the Redirection Table and uses the information in that entry to format an interrupt request message. The message unit then broadcasts this message over the ICC bus. The content of the Redirection Table is under software control and is assigned benign defaults upon reset. The masks in the Redirection Table entries are set to 1 at hardware reset to disable the interrupts.

Figure 2. 82489DX I/O Unit Block Diagram (blocks shown: Redirection Table, I/O Unit ID register, I/O Unit Version register)

Local Unit

Interrupt management of the Local Unit is responsible for local interrupt sources, interrupt acceptance, dispensing interrupts to the processor, and sending inter-processor interrupts. Depending on the delivery mode of the interrupt, zero, one or more units can accept an interrupt. A Local Unit accepts an interrupt only if it will deliver the interrupt to its processor. Accepting an interrupt is purely an inter-82489DX matter; dispensing an interrupt to the local processor only involves an 82489DX and its local processor.

Figure 3. 82489DX Local Unit Block Diagram (blocks shown: Local Vector Table with Timer, LINTIN1 and LINTIN0 entries; prioritizer; ISR and IRR 256-bit vector arrays; Remote Register; ExtINTA output)

5.0 INTERRUPT CONTROL MECHANISM

This section briefly describes the interrupt control mechanism in the 82489DX.

5.1 Interrupts

The interrupt control functions of all 82489DXs are collectively responsible for delivering interrupts from interrupt sources to interrupt destinations in the multiprocessor system. When a processor accepts an interrupt, it uses the vector to locate the entry point of the handler in its interrupt table.

The 82489DX architecture allows for 16 possible interrupt priorities, zero being the lowest priority and 15 being the highest. The priority of interrupt A "is higher than" the priority of interrupt B if servicing A is more urgent than servicing B. An interrupt's priority is implied by its vector; namely, priority = vector/16. With 256 vectors and 16 different priorities, this implies that 16 different interrupt vectors can share a single interrupt priority.

TOTAL ALLOWED INTERRUPT VECTORS

Out of 256 vectors, interrupt vectors 0 to 15 should not be used in the 82489DX. Only 240 interrupt vectors (vectors 16 to 255) are supported in the 82489DX.

Figure 4. I/O Units and Local Units (several 82489DX Local Units, each with register access from its processor, and an 82489DX I/O Unit with register access from the I/O subsystem, all connected by the ICC bus)

INTERRUPT SOURCES

Interrupts are generated by a number of different interrupt sources in the system. Possible interrupt sources are:
• Externally connected (I/O) devices. Interrupts from these external sources manifest themselves as edges or levels on interrupt input pins and can be redirected to any processor.
• Locally connected devices. These originate as edges or levels on interrupt pins, but they are always directed to the local processor only.
• 82489DX timer generated interrupts. Like locally connected devices, the 82489DX timer can only interrupt its local processor.
• Processors. A processor can interrupt any individual processor or sets of processors. This supports software self-interrupts, preemptive scheduling, TLB flushing, and interrupt forwarding. A processor generates interrupts by writing to the Interrupt Command Register in its Local Unit.

INTERRUPT DESTINATIONS

I/O Units can only source interrupts, whereas Local Units can both source and accept interrupts, so whenever "interrupt destination" is discussed, it is implied that the Local Unit is the destination of the interrupt. In physical mode the destination processor is specified by a unique 8-bit 82489DX local ID. Only a single destination or a broadcast to all (local ID of all ones) can be specified in physical destination mode.
In logical mode, destinations are specified using a 32-bit destination field. All Local Units contain a 32-bit Logical Destination register against which the destination field of the interrupt is matched to determine whether the receiver is being targeted by the interrupt. An additional 32-bit Destination Format register in each Local Unit enables logical mode addressing.

INTERRUPT DELIVERY

The description of interrupt delivery makes frequent use of the following terms:
• Each processor has a processor priority that reflects the relative importance of the code the processor is currently executing. This code can be part of a process or thread, or can be an interrupt handler. A processor's priority fluctuates as the processor switches threads, as a thread or handler raises and lowers its priority level to mask out interrupts, and as the processor enters an interrupt handler and returns from an interrupt handler to previously interrupted activity.
• A processor is lowest priority within a given group of processors if its processor priority is the lowest of all processors in the group. Note that more than one processor can be the lowest priority in a given group.
• A processor is the focus of an interrupt if it is currently servicing that interrupt, or if it currently has a request pending for the interrupt.

Interrupt delivery begins with an interrupt source injecting its interrupt into the interrupt system at one of the 82489DXs. Delivery is complete only when the servicing processor tells its 82489DX Local Unit that it is complete by issuing an end-of-interrupt (EOI) command to that Local Unit. Only then has all (relevant) internal state regarding that occurrence of the interrupt been erased. The interrupt system guarantees exactly-once delivery semantics of interrupts to the specified destinations.
Exactly-once guaranteed delivery implies a number of things:
• The interrupt system never rejects interrupts; it never NAKs interrupt injection, interrupts are never lost, and the same interrupt (occurrence) is never delivered more than once.

Clearly a single edge interrupt or level interrupt counts as a single occurrence of an interrupt. In uniprocessor systems, an occurrence of an interrupt that is already pending (IRR) cannot be distinguished from the previous occurrence. All occurrences are recorded in the same IRR bit. They are therefore treated as "the same" interrupt occurrence. For lowest-priority delivery mode, the identical behavior can be achieved in an MP (multiprocessor) system by delivering an interrupt first to its focus processor (if it currently has one). If an interrupt has a focus processor, then the interrupt will be delivered to the interrupt's focus processor independent of priority information. This means that even if there is a lower priority processor than the focus processor, the interrupt still gets delivered to the focus processor.

Each edge occurring on an edge triggered interrupt input pin is clearly a one-shot event; each occurrence of an edge is delivered. An active level on a level triggered interrupt input pin represents more of a "continuous event". Repeatedly broadcasting an interrupt message while the level is active would flood the ICC bus and would in effect transmit very little useful information, since the same processor (the focus) would have to be the target. Instead, for level triggered interrupts the 82489DXs merely recreate the state of the interrupt input pin at the destination. The source 82489DX accomplishes this by tracking the state of the appropriate destination 82489DX's Interrupt Request Register (or pending bit) and only sending inter-82489DX messages when the state of the interrupt input pin and the destination's interrupt request disagree.
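The level tracking just described can be sketched as a comparison between the input pin and the source unit's copy of the destination's request bit. This is a minimal illustration only; the function name `level_message`, the message strings, and the rule that the tracked bit is updated as soon as a message is sent are assumptions made for the example, not part of the 82489DX definition.

```python
def level_message(intin_asserted: bool, remote_irr: bool):
    """Return the message the source unit would send, or None when the
    input pin and the tracked destination request bit already agree."""
    if intin_asserted == remote_irr:
        return None  # no disagreement, so nothing is put on the ICC bus
    return "level assert" if intin_asserted else "level deassert"


# Walk a level-triggered pin through assert, hold, and deassert.
remote_irr = False
sent = []
for pin in [False, True, True, True, False, False]:
    msg = level_message(pin, remote_irr)
    if msg is not None:
        sent.append(msg)
        remote_irr = pin  # assumption: tracked bit follows the message just sent

print(sent)  # ['level assert', 'level deassert']
```

Only two messages are produced for the whole time the level stays active, which is the point of tracking state rather than rebroadcasting.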
Unlike edge triggered interrupts, when a level interrupt goes into service, the interrupt request at the servicing 82489DX is not automatically removed. If the handler of a level sensitive interrupt executes an EOI, then that interrupt will immediately be raised to the processor again, unless the processor has explicitly raised its task priority or the source of that interrupt has been removed.

5.2 Interrupt Redirection

This section describes how a processor is picked during interrupt delivery. The 82489DX supports two modes for selecting the destination processor: Fixed and Lowest Priority.

• Fixed Delivery Mode
In fixed delivery mode, the interrupt is unconditionally delivered to all Local 82489DXs that match the destination information supplied with the interrupt. Note that for I/O device interrupts typically only a single 82489DX would be listed in the destination. Priority and focus information are ignored. If the priority of a destination processor is equal to or higher than the priority of the interrupt, then the interrupt is held pending locally in the destination processor's Local Unit until the processor priority becomes low enough, at which time the interrupt is dispensed to the processor. More than one processor can be the destination in fixed delivery mode.

• Lowest Priority Delivery Mode
Under the lowest priority delivery mode, the processor that handles the interrupt is the one in the specified destination with the lowest processor priority value. If more than one processor is at the lowest priority, then a unique arbitration ID is used to break ties. For lowest priority dynamic delivery, the interrupt will always be taken by its focus processor if it has one. The lowest priority delivery method assures minimum interruption of high priority tasks. Since each Local Unit only knows its own processor priority, determining the lowest priority processor is done by arbitration on the ICC bus.
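The two selection modes above can be compared side by side in a small model. This is a sketch under stated assumptions: the `LocalUnit` class and its field names are hypothetical, the processor priority is modelled as a plain integer where smaller means lower, and the tie-break direction (highest arbitration ID wins) is an assumption, since the text only says that a unique arbitration ID breaks ties. The real selection is performed by arbitration on the ICC bus, not by a central function.

```python
from dataclasses import dataclass


@dataclass
class LocalUnit:
    unit_id: int             # 8-bit Local Unit ID
    processor_priority: int  # smaller value means lower priority (assumed scale)
    arb_id: int              # arbitration ID used to break ties
    is_focus: bool = False   # already servicing or holding this interrupt


def select_fixed(group):
    """Fixed mode: every unit in the destination accepts the interrupt."""
    return list(group)


def select_lowest_priority(group):
    """Lowest priority mode: exactly one unit in the destination accepts."""
    # A focus processor takes the interrupt regardless of priority.
    for unit in group:
        if unit.is_focus:
            return unit
    lowest = min(u.processor_priority for u in group)
    candidates = [u for u in group if u.processor_priority == lowest]
    # Assumption: among tied candidates the highest arbitration ID wins.
    return max(candidates, key=lambda u: u.arb_id)


group = [LocalUnit(0, 5, arb_id=0), LocalUnit(1, 2, arb_id=1), LocalUnit(2, 2, arb_id=2)]
print(select_lowest_priority(group).unit_id)  # 2: tied at priority 2, higher arb_id
group[0].is_focus = True
print(select_lowest_priority(group).unit_id)  # 0: the focus processor wins
```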
Only one processor can be the destination in lowest priority delivery mode.

INTER-82489DX COMMUNICATION

All I/O and Local Units communicate during interrupt delivery. Interrupt information is exchanged between different units on a dedicated five-wire ICC bus in the form of broadcast messages. An 82489DX unit's 8-bit ID is used as its name for the purpose of using the ICC bus, and all 82489DX units using one ICC bus should be assigned different IDs. The Arbitration ID of the Local Units, used to resolve ties during lowest priority arbitration, is also derived from the Local Unit's ID.

6.0 82489DX LOCAL UNIT REGISTERS DESCRIPTION

6.1 Local Unit ID Register

Each 82489DX Local Unit has a register that contains the Local Unit's 8-bit ID. The Local Unit ID serves as the physical name of the 82489DX Local Unit. It can be used in specifying destination information and is also used for accessing the ICC bus. Eight address lines A[10]-A[3] are sampled on every clock edge while RESET is asserted. The last sample remains in the Local Unit ID register after reset. Alternatively, the ID can be loaded with a register write as part of software initialization. The Local Unit ID is read-write by software.

82489DX Local Unit ID Register: Bits [31..24] | Bits [23..0]

Bits [31..24] Local Unit ID: The Local Unit ID serves as the physical "name" of the Local Unit, used for addressing the 82489DX in physical destination mode and for ICC bus usage. In a system with, say, four 82489DXs, there are 4 Local Units and 4 I/O Units. All 8 units should be assigned different IDs. For future compatibility use only IDs from 0 to 14.

Bits [23..0]: Reserved. They should be written with 0.

6.2 Destination Format Register

An interrupt destination can be addressed either physically or logically. When the interrupt message addresses the destination physically, each 82489DX on the ICC bus compares the address with its own unit ID.
If the message is a broadcast type then every Local Unit accepts the interrupt. When the message addresses the destination using the logical addressing scheme, each Local Unit on the ICC bus compares the logical address in the interrupt message with its own Logical Destination Register. If there is a bit match (i.e., if at least one of the corresponding pairs of bits matches), this Local Unit is selected for delivery. All 32 bits of the Destination Format Register of every 82489DX connected to the ICC bus should be written with "1" to enable the logical addressing scheme.

Destination Format Register: Bits [31:0]
Logical Destination Register: Bits [31:0]

For future compatibility, use only bits 31-24 of the Logical Destination Register. For binary compatibility, it is strongly recommended that 82489DX software use only the 8 MSB of the Logical Destination Register.

6.3 Local Interrupt Vector Table Registers

The Redirection Table serves to steer interrupts that originate in the I/O subsystems to the processors. The Local Vector Table is its equivalent for interrupts that are restricted to the local processor only. The Local Vector Table contains three 32-bit registers. Register 0 corresponds to the timer; registers 1 and 2 correspond to the local interrupt input pins LINTIN0 and LINTIN1. The formats of the Local 0 and Local 1 interrupt vector table entries are identical. The following register description covers both.

Figure 5. Local Vector Table (Local INT0 Vector Table, Local INT1 Vector Table)

Local Interrupts 0, 1 Interrupt Vectors

Vector: [Bits 7-0]
This is the vector to use when generating an interrupt for this entry.

Delivery Mode: [Bits 10-8]
000: (Fixed) means deliver the signal on the INT pin of the local processor. Trigger Mode for "Fixed" Delivery Mode can be edge or level.
100: (NMI) means deliver the signal on the NMI pin of the local processor. Vector information is ignored. A Delivery Mode equal to "NMI" requires a "level" Trigger Mode.
111: (ExtINT) means deliver the signal to the INT pin of the local processor as an interrupt that originated in an externally connected (8259A compatible) interrupt controller. The ExtINTA pin is also asserted. The INTA cycle that corresponds to this ExtINT delivery should be routed to the external controller that is expected to supply the vector. A Delivery Mode of "ExtINT" requires an "edge" Trigger Mode. (See the section on compatibility for more details.)
001, 010, 011, 101, 110: Reserved.

Bit 11: Reserved. It should be written 0.

Delivery Status: [Bit 12]
Delivery Status is a 1-bit field that contains the current status of the delivery of this interrupt. This field is software read-only; software writes to this field (as part of a 32-bit word) have no effect on this bit. Two states are defined.
0: (Idle) means that there is currently no activity for this interrupt.
1: (Send Pending) indicates that the interrupt has been injected, but its delivery is temporarily held up by recently injected interrupts that are in the process of being delivered.

Bit 13: Reserved. It should be written 0.

Remote IRR: [Bit 14]
This bit is used for level triggered local interrupts; its meaning is undefined for edge triggered interrupts. Remote IRR mirrors the interrupt's IRR bit of this Local Unit. Remote IRR is software read-only; software writes to this bit do not affect it.

Trigger Mode: [Bit 15]
The Trigger Mode field indicates the type of signal on the local interrupt pin that triggers an interrupt.
0: indicates edge sensitive.
1: indicates level sensitive.
Only the local interrupt pins can be programmed as edge or level triggered. Timer interrupts are always treated as edges.

Mask: [Bit 16]
0: enables injection of the interrupt.
1: masks injection of the interrupt.

Bits [31:17]: Reserved. Should be written 0.

6.4 Inter-Processor Interrupt Registers

A processor generates inter-processor interrupts by writing to the Interrupt Command Register in its 82489DX Local Unit. Conceptually, this can be thought of as the processor providing the interrupt's Redirection Table entry on the fly. Not surprisingly, the layout of the Interrupt Command Register resembles that of an entry in the Redirection Table. Note that the format of this register allows a processor to generate any interrupt. A processor may use this to forward device interrupts originally accepted by it to other processors. All fields of the Interrupt Command Register are read-write by software with the exception of the Delivery Status field, which is read-only. Writing to the 32-bit word that contains the interrupt vector causes the interrupt message to be sent.

Interrupt Command Register [31:0]: Bits [31:20] | Bits [19:18] | Bits [17:16] | Bit 15 | Bit 14 | Bit 13 | Bit 12 | Bit 11 | Bits [10:8] | Bits [7:0]

Vector: [Bits 7-0]
The vector identifies the interrupt being sent. If the Delivery Mode is "Remote Read", then the Vector field contains the address of the register to be read in the remote 82489DX's Local Unit. The addresses are listed in the section discussing the 82489DX Local Unit register summary. For example, for the ID register, a remote read address of 02 should be specified in the Vector field.

Delivery Mode: [Bits 10-8]
The Delivery Mode is a 3-bit field that specifies how the 82489DXs listed in the destination field (bits 63:32) should act upon reception of this signal. Note that certain Delivery Modes will only operate as intended when used in conjunction with a specific Trigger Mode. These restrictions are indicated for each Delivery Mode.
000: (Fixed) means deliver the signal on the INT pin of all processors listed in the destination. Trigger Mode for "Fixed" Delivery Mode can be edge or level.
001: (Lowest Priority) means deliver the signal on the INT pin of the processor that is executing at the lowest priority among all the processors listed in the specified destination. Trigger Mode for "Lowest Priority" Delivery Mode can be edge or level.
010: Intel Reserved. Should not be used.
011: (Remote Read) is a request to a remote 82489DX Local Unit to send the value of one of its registers over the ICC bus. The register is selected by providing its address in the Vector field. The register value is latched by the requesting 82489DX and stored in the Remote Register, where it can be read by the local processor. A Delivery Mode of "Remote Read" requires an "edge" Trigger Mode.
100: (NMI) means deliver the signal on the NMI pin of all processors listed in the destination; vector information is ignored. A Delivery Mode equal to "NMI" requires a "level" Trigger Mode.
101: (Reset) means deliver the signal to all Local Units listed in the destination. The destination Local Unit will assert/deassert its PRST output pin. All addressed 82489DX Local Units will assume their reset state but preserve their ID. One side effect of an ICC bus message with Delivery Mode equal to "Reset" that results in a deassert of reset is that all Local Units (whether listed in the destination or not) will reset their lowest-priority tie breaker arbitration ID to their Local Unit ID (see the section on the ICC bus for details). A Delivery Mode of "Reset" requires a "level" Trigger Mode. "Reset" should not be used with the "self" or "all incl self" Shorthand modes, since it will leave the system in a non-recoverable reset state. If "Reset" is used with "all excl self" mode, software should make sure that only one CPU executes this instruction in an MP system.
110: Intel Reserved. Should not be used.
111: Intel Reserved. Should not be used.

Destination Mode: [Bit 11]
This field determines the interpretation of the Destination field.
0: (Physical Mode): in Physical Mode, a destination 82489DX is identified by its Local Unit ID. Bits 56 through 63 (8 MSB of the destination field) specify the 8-bit 82489DX Local Unit ID.
1: (Logical Mode): in Logical Mode, destinations are identified by matching on Logical Destination under the control of the Destination Format Register in each Local 82489DX. The 32-bit Destination field is the logical destination.

Delivery Status: [Bit 12]
Delivery Status is a 1-bit field that contains the current status of the delivery of this interrupt. Two states are defined:
0: (Idle) means that there is currently no activity for this interrupt.
1: (Send Pending) indicates that the interrupt has been injected, but its delivery is temporarily held up by other recently injected interrupts that are in the process of being delivered.
Delivery Status is software read-only; software writes to this field (as part of a 32-bit word) do not affect this bit. Software can read this field to find out whether the current interrupt has been sent and the Interrupt Command Register is available to send the next interrupt. If the Interrupt Command Register is overwritten before the Delivery Status is "Idle", then the fate of that interrupt is undefined; i.e., the interrupt may have been lost.

Bit 13: Reserved. Should be written 0.

Level: [Bit 14]
Software can use this bit in conjunction with the Trigger Mode bit when issuing an inter-processor interrupt to simulate assertion/deassertion of level sensitive interrupts. To assert: Trigger Mode = 1 and Level = 1. To deassert: Trigger Mode = 1 and Level = 0. For example, a message with a Delivery Mode of "Reset", a Trigger Mode of "Level", and a Level bit of 0 deasserts Reset to the processor of the addressed 82489DX Local Unit(s). As a side effect, this will also cause all 82489DXs to reset their Arbitration ID to their unit ID. (The Arbitration ID is used for tie breaking in lowest priority arbitration.)

Trigger Mode: [Bit 15]
Software can use this in conjunction with Level Assert/Deassert to generate interrupts that behave as edges or levels.
0: Edge
1: Level

Remote Read Status: [Bits 17,16]
This field indicates the status of the data contained in the Remote Read Register. This field is read-only to software. Whenever software writes to the Interrupt Command Register using Delivery Mode "Remote Read", the Remote Read Status becomes "in progress" (waiting for the remote data to arrive). The remote 82489DX Local Unit is expected to respond within a fixed number of ICC bus cycles. If the remote 82489DX Local Unit is unable to do so, then the Remote Read Status becomes "invalid". If successful, the Remote Read Status resolves to "valid". Software should poll this field to determine completion and success of the Remote Read command.
00: (invalid): the content of the Remote Read Register is invalid. This is the case after a Remote Read command was issued and the remote 82489DX Local Unit was unable to deliver the register content in time.
01: (in progress): a Remote Read command has been issued and this 82489DX is waiting for the data to arrive from the remote 82489DX Local Unit.
10: (valid): the most recent Remote Read command has completed and the remote register content in the Remote Read Register is valid.
11: reserved.

Destination Shorthand: [Bits 19,18]
This field indicates whether a shorthand notation is used to specify the destination of the interrupt and, if so, which shorthand is used. Destination Shorthands do not use the 32-bit Destination field, and can be sent by software with a single 32-bit write to the 82489DX's Interrupt Command Register. Shorthands are defined for the following common cases: software self interrupt, interrupt to all processors in the system including the sender, and interrupt to all processors in the system excluding the sender.
00: (dest field): means that no shorthand is used. The destination is specified in the 32-bit Destination field in the second word (bits 32 to 63) of the Interrupt Command Register.
01: (self): means that the current 82489DX Local Unit is the single destination of the interrupt. This is useful for software interrupts. The Destination field in the Interrupt Command Register is ignored. RESET Delivery Mode should not be used with the self destination. Only FIXED Delivery Mode should be used with SELF.
10: (all incl self): means that the interrupt is to be sent to "all" processors in the system including the processor sending the interrupt. The 82489DX will broadcast a message with the destination unit ID field set to all ones. RESET assert Delivery Mode should not be used with the "all incl self" destination.
11: (all excl self): means that the interrupt is to be sent to "all" processors in the system with the exception of the processor sending the interrupt. The 82489DX will broadcast a message with the destination unit ID field set to all ones. All-excl-self is useful during selection of a boot processor (init) and also for TLB flush, where "self" is flushed using the processor flush instruction. Only one CPU in an MP system should execute the "all excl self" destination if it is used with RESET Delivery Mode.

Bits [31:20]: Reserved. They should be written 0.

Interrupt Command Register [63:32]: Bits [63:32]

Destination: [Bits 63-32]
This field is only used when the Destination Shorthand is set to "dest field". If the Destination Mode is Physical Mode, then the 8 MSB contain a destination unit ID. If it is Logical Mode, the full 32-bit Destination field contains the logical address. Logical addressing is enabled by the Destination Format Register.

6.5 IRR, ISR, TMR Registers

INTERRUPT ACCEPTANCE

All 82489DX Local Units listen to all messages sent over the ICC bus. For each message, the Local Unit first checks if it belongs to the destination in the message.
It does this by matching the 32-bit Destination field in the message against its Logical Destination Register, if the message uses logical mode addressing, and against its physical ID, if the message uses physical mode addressing. All 82489DX Local Units that match are said to "belong to the group".

Each 82489DX Local Unit contains three 256-bit registers that play a role in the acceptance of interrupts and in dispensing accepted interrupts to the local processor. Each of these registers is a bit array where bit position i tracks information about the interrupt with vector i. These bits track information about the maskable (PINT) interrupts only. They are not relevant for NMI, RESET or ExtINT types of interrupts.

The Interrupt Request Register (IRR) contains the interrupts accepted by this 82489DX Local Unit but not yet dispensed to the processor. The In Service Register (ISR) contains the interrupts that are currently in service by the processor, i.e., the interrupts that have been dispensed to the processor but for which the processor has not yet signaled the End-Of-Interrupt. Note that the 82489DX's IRR and ISR registers have the same meaning and operation as in the 8259A in fully nested/non-specific EOI mode. Note also that these registers play no role in providing 8259A compatibility. Compatibility is handled by making an external 8259A-style controller directly visible to the processor and having the 82489DX become transparent. Each interrupt has a vector associated with it, which determines the bit position, and hence the priority, of the interrupt. When an interrupt is being serviced, all equal or lower priority interrupts are automatically masked by the 82489DX Local Unit.

The Trigger Mode Register (TMR) indicates for each interrupt whether the interrupt is edge or level. This information is transmitted with each 82489DX interrupt request message and reflects the Trigger Mode bit in the interrupt's Redirection Table entry.
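The interplay of the three bit arrays can be illustrated with a small software model. This is only a sketch: the class and method names are hypothetical, the 256-bit arrays are held as Python integers, and the EOI rule (retire the highest-numbered in-service vector) is an assumption modelled on the non-specific EOI behavior mentioned above. The edge/level difference follows the text: for an edge interrupt the IRR bit is cleared when the interrupt goes in service, while for a level interrupt it is not.

```python
class LocalUnitState:
    """Toy model of the IRR, ISR and TMR bit arrays of one Local Unit."""

    def __init__(self):
        self.irr = 0  # accepted, not yet dispensed
        self.isr = 0  # dispensed, EOI not yet received
        self.tmr = 0  # 1 = level triggered, 0 = edge triggered

    def accept(self, vector: int, level: bool) -> None:
        """Record an accepted interrupt message for this vector."""
        bit = 1 << vector
        self.irr |= bit
        if level:
            self.tmr |= bit
        else:
            self.tmr &= ~bit

    def dispense(self, vector: int) -> None:
        """The interrupt goes in service at the local processor."""
        bit = 1 << vector
        self.isr |= bit
        if not self.tmr & bit:
            self.irr &= ~bit  # edge: request removed as service starts

    def eoi(self) -> None:
        """Assumed non-specific EOI: retire the highest in-service vector."""
        if self.isr:
            highest = self.isr.bit_length() - 1
            self.isr &= ~(1 << highest)


unit = LocalUnitState()
unit.accept(0x31, level=False)
unit.dispense(0x31)
print(hex(unit.irr), unit.isr == 1 << 0x31)  # 0x0 True
```

For a level triggered vector the same sequence leaves the IRR bit set after `dispense`, which is why an EOI is followed by the interrupt being raised again while the input is still asserted.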
If an interrupt goes in service and the TMR bit is 0 (edge), then the interrupt's IRR bit is cleared at the same time the ISR bit is set. If the TMR bit is 1 (level), then the IRR bit is not cleared when the interrupt goes in service. In the latter case, the IRR bit mirrors the state of the interrupt's input pin. The following diagram shows 82489DX operation with devices A and B sharing a level triggered interrupt input. The diagram illustrates how Remote IRR, and the IRR bit at the destination 82489DX track the state of INTIN. It also illustrates how an EOI isfollowed immediately be re-raising the interrupt as long as the INTIN is still asserted by some device. --.-l_________\1..-______________ l _______________\..._______ Device B _ _ _ _..... INTIN \1..______ ~ ____________________ ~ XOR of INTIN and Remote IRR 0'------- ~nd "level assert" message ~nd "level deassert" message --J.I__ ____________________\>--_____ --1.1______________________\1..-_____ Remote IRR _ _ IRR a\ des\ Unit _ _ INT/EOI _ _oJ 290446-6 Figure 6. Interrupt Sharing 4-239 82489DX ISR, IRR, and TMR are read-only by software. Each of these 256-bit registers is accessed as four separate 32-bit registers. Note that there is no general Interrupt Mask Register (IMR) as in the 8259A. The processor masks interrupts temporarily by writing to the Icoal unit's Task Priority Register (described shortly). ISR [Interrupt Status Register] Bits [255:0] IRR [Interrupt Request Register] Bits [255:0] TMR [Trigger Mode Register] IRR (Interrupt Request Register): It contains the active interrupt requests that have been accepted, but not yet dispensed by this 82489DX Local Unit. A bit in IRR is set when the 82489DX Local Unit accepts the interrupt. When TMR is 0, it is cleared when the interrupt is serviced; when TMR is 1, it is cleared when the 82489DX Local Unit receives a message to clear it. 
ISR (In Service Register): It marks the interrupts that have been delivered to the processor but have not been fully serviced, in that an End-Of-Interrupt has not yet been received. The ISR register reflects the current state of the processor's interrupt stack.

TMR (Trigger Mode Register): If 0 [edge triggered], the corresponding IRR bit is automatically cleared when interrupt service starts. If 1 [level triggered], this is not the case; instead, the source 82489DX must explicitly request that the IRR bit be cleared (upon deassert of the interrupt input pin or upon sending an appropriate interprocessor interrupt). Upon acceptance of an interrupt, the TMR bit is cleared for edge triggered interrupts and set for level triggered interrupts. This information is carried in the accepted interrupt message. The source 82489DX I/O unit also tracks the state of the destination unit's IRR bit (Remote IRR bit in the Redirection Table). When a level triggered interrupt input is deasserted, the source 82489DX I/O unit detects the discrepancy between the input pin state and the Remote IRR, and automatically sends a message telling the destination 82489DX to clear IRR for the interrupt.

Figure 7. ISR, IRR, and TMR

ACCEPTANCE MECHANISM

Interrupt acceptance proceeds as follows. If the delivery mode is Fixed, then each unit in the destination group unconditionally accepts the interrupt message and sets the interrupt's IRR bit. If the delivery mode is Lowest Priority, then each processor in the group first checks whether it is currently the focus of the interrupt by checking its ISR and IRR. If an 82489DX finds one of these bits set for the incoming interrupt, then that 82489DX Local Unit accepts the interrupt independent of priority, and "signals" the other 82489DX Local Units to abort the priority arbitration.
This avoids multiple delivery of the same interrupt occurrence to different processors, consistent with interrupt delivery semantics in uniprocessor systems as described above.

If a message is to be delivered for NMI or Reset, then all 82489DX Local Units listed in the destination unconditionally assert/deassert the corresponding output pin. ISR, IRR, etc. are bypassed for NMI or Reset, and vector information is undefined.

The acceptance decision process is illustrated in the flow chart below.

[Flow chart. Legible outcome boxes: "Discard Message", "Set Interrupt's IRR"]

Figure 8. Interrupt Acceptance Flow Chart

6.6 Tracking Processor Priority

Each 82489DX Local Unit should be programmed with the task priority so that it can temporarily mask interrupts whose priority is lower than that of the processor. Task switching and task priority changes are the result of explicit software action. The operating system may define a number of task scheduling classes. Examples are an idle class, a background class, a foreground class, and a time critical class. Alternatively, different classes can be assigned to user code versus system code. If tasks in different classes are executing when an interrupt comes in, then it may be advantageous to interrupt the processor currently running the task in the least important class. Clearly, if one processor is idle while others are doing work, the idle processor would be the obvious target for servicing the interrupt. This implies that it is useful to define priority levels, below all interrupt levels, that can participate in lowest priority delivery selection.

At times, the operating system may need to block interrupts from being serviced. For example, to synchronize access to a shared data structure between a device driver and its interrupt handler, the driver raises its priority to equal or higher than the interrupt's priority.
The local 82489DX supports this via its Task Priority Register (the 8259A supports this via the Interrupt Mask Register (IMR)). Software that wants to make use of this is required to inform its 82489DX Local Unit of the priority change by updating the Task Priority Register. The Task Priority field is 8 bits, providing up to 256 distinct priorities. The 4 MSB of this register correspond to the 16 interrupt priorities, while the 4 LSB provide more precision. Priorities are best noted as x:y, where x is the value of the 4 MSB and y is the value of the 4 LSB. For example, Task Priority Register values 0:y with 0 ≤ y ≤ 15 (and 0 in the 4 MSB) can be used to represent the priorities of the task scheduling classes described above (y = 0 for idle; y = 1 for background; etc.). Except for interrupts with vectors 0 through 15 (which are often predefined by the processor), which all have priority 0:0, the priority of every other interrupt and its handler is x:0 with 1 ≤ x ≤ 15 and is above the base task priorities 0:y. For example, interrupt vector 123 has priority 7:0 (the integer part of 123/16 is 7) and can be masked by any task that raises its priority to a value equal to or higher than 7:0.

The 82489DX uses the Task Priority Register to mask interrupts. The Task Priority Register should be programmed with a priority value that specifies the priority of the task the processor is executing. The 82489DX masks any interrupt whose priority is lower than or equal to the task priority. When the Task Priority Register is programmed with priority 15, all interrupts are masked. When the Task Priority Register is programmed with priority level X, by definition, all interrupts of priority X and below are masked. When the Task Priority Register is programmed with priority 0, all interrupts above priority 0 are allowed to interrupt the processor.
This means that even when the Task Priority Register is programmed with the lowest value, i.e., 0, interrupts of priority 0 are masked. So only 240 interrupt vectors should be used with the 82489DX. Interrupt vectors 0 through 15 should not be used.

Task Priority Register

Bits [31:8]    Bits [7:0]

Bits [31:8]: Reserved. They should be written 0.
Bits [7:0]: Task Priority. Bits [7:0] specify the task priority.

From the information in the Task Priority Register and the priority information derived from the ISR and IRR registers, the 82489DX Local Unit computes two additional priority values.

The first priority value computed is the maximum of:
• Task Priority (4 MSB : 4 LSB), and
• the priority of the highest order ISR bit set ((vector/16):0).
This value is used to determine whether or not a pending interrupt can be dispensed to the processor.

The second priority value computed is the maximum of:
• Task Priority (4 MSB : 4 LSB), and
• the priority of the highest order ISR bit set ((vector/16):0), and
• the priority of the highest order IRR bit set ((vector/16):0).
This value is used during arbitration as part of lowest-priority delivery.

6.7 Dispensing Interrupts

DISPENSING INTERRUPTS TO THE LOCAL PROCESSOR

Once an 82489DX Local Unit accepts an interrupt, it guarantees delivery of the interrupt to its local processor. (This part of the 82489DX functions similarly to an 8259A.) Dispensing a maskable interrupt to the local processor begins when the Local Unit asserts the INT pin of its processor. If the processor has interrupts enabled, it responds by issuing an INTA cycle. This causes the Local Unit to freeze its internal priority state and release the 8-bit vector of the highest priority interrupt onto the data bus, where it is read by the processor and used to find the handler's entry point. The INT/INTA protocol also causes the interrupt's ISR bit to be set.
The corresponding bit in the IRR register is cleared only if the TMR register indicates that it should be (edge triggered interrupts); otherwise (level triggered interrupts), IRR is cleared only when the Interrupt Input Pin is deasserted.

6.8 Spurious Interrupt Vector Register

SPURIOUS INTERRUPT

Note that a level-triggered interrupt can be deasserted right before its INTA cycle. In that case, all IRR bits may be clear and the prioritizer may not find a vector to give to the processor. To satisfy the processor's demand for a vector, the 82489DX returns a spurious interrupt vector instead. A similar situation may occur when the processor raises its Task Priority to or above the level of the interrupt for which the processor's INT pin is currently being asserted. When the INTA cycle is issued, the interrupt that was to be dispensed has become masked (masked but remembered).

Dispensing the spurious interrupt vector does not affect the ISR register, so the handler for this vector should just return without EOI. If the vector is shared with a valid interrupt, then the handler can read the vector's bit in the ISR register to check whether it was invoked for the valid interrupt (ISR bit set) or not (ISR bit clear). Given the range of 240 vectors, overloading the spurious interrupt with a valid interrupt is not expected to be common practice. The spurious interrupt vector to be used by a Local Unit is programmable via the Spurious Interrupt Vector Register.

UNIT ENABLE

It is possible that Local Units exist in the system that do not have a processor to which to dispense interrupts. The only danger this represents is that, if an interrupt is broadcast to all processors using lowest priority delivery mode when all processors are at the lowest priority, a Local Unit without a processor may accept the interrupt if it happens to have the lowest Arb ID at the time.
To prevent this from happening, all Local Units initialize in the disabled state and must be explicitly enabled before they can start accepting messages from, or transmitting messages on, the ICC bus. A disabled 82489DX Local Unit only responds to messages with Delivery Mode set to "Reset". Reset deassert messages should be sent in Physical Destination mode using the target's Local Unit ID, since the logical destination information in the local units is undefined (all zeroes) when the 82489DX comes out of Reset.

Bits [31:9]    Bit 8    Bits [7:0]

Figure 9. Spurious Interrupt Vector Register

Bits [31:9]: Reserved. Should be written 0.

Unit Enable: [Bit 8]
0: When a 0 is written to this bit, this Local Unit is disabled with regard to responding to messages as well as transmitting on the ICC bus. It only responds to messages with Delivery Mode set to "Reset". Reading a 0 at this bit indicates that the unit is disabled.
1: When a 1 is written to this bit, this Local Unit is enabled for both transmitting and receiving unit messages. Reading a 1 at this bit indicates that the unit is enabled.

Spurious Vector: [Bits 7-0] For future compatibility, bits [3-0] should be 1111.

6.9 End-of-Interrupt (EOI) Register

Before returning from the interrupt handler, software must issue an End-Of-Interrupt (EOI) command to the 82489DX Local Unit. The data written to the EOI register is don't care. The write tells the 82489DX to clear the highest priority bit in the ISR register, since the interrupt is no longer in service. Upon EOI, the 82489DX goes through prioritization, returning to the next highest priority activity. This can be a previously interrupted handler (from ISR), a pending interrupt request (from IRR), or an interrupted task (from Task Priority).

Bits [31:0]

Figure 10. EOI Register

Bits [31:0]: Don't care.
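The interaction of IRR, ISR, TMR, the Task Priority Register, the spurious vector, and EOI described in sections 6.5 through 6.9 can be summarized in a small software model. The Python sketch below is illustrative only and is not part of the 82489DX specification: the class and method names are hypothetical, the default spurious vector 0xFF is merely one value with bits [3:0] equal to 1111, and treating the highest-numbered ISR bit as "the highest priority bit" on EOI is an assumption.

```python
class LocalUnitModel:
    """Toy model of one Local Unit's interrupt state (hypothetical names)."""

    def __init__(self, spurious_vector=0xFF):
        self.irr = 0            # 256-bit Interrupt Request Register
        self.isr = 0            # 256-bit In Service Register
        self.tmr = 0            # 256-bit Trigger Mode Register
        self.task_priority = 0  # Task Priority Register bits [7:0], x:y
        self.spurious_vector = spurious_vector

    @staticmethod
    def _highest_vector(bits):
        """Highest-numbered set bit, or None if no bit is set."""
        return bits.bit_length() - 1 if bits else None

    @staticmethod
    def _priority(vector):
        """Priority x:0 of a vector as an 8-bit value (x = vector // 16)."""
        return 0 if vector is None else vector & 0xF0

    def accept(self, vector, level=False):
        """Accept an interrupt message: set IRR and record the trigger mode."""
        bit = 1 << vector
        self.irr |= bit
        self.tmr = (self.tmr | bit) if level else (self.tmr & ~bit)

    def level_deassert(self, vector):
        """Message from the source unit asking to clear a level IRR bit."""
        self.irr &= ~(1 << vector)

    def dispense_threshold(self):
        """First priority value: max(Task Priority, highest ISR priority)."""
        isr_priority = self._priority(self._highest_vector(self.isr))
        return max(self.task_priority & 0xFF, isr_priority)

    def arbitration_priority(self):
        """Second priority value: also includes the highest IRR priority."""
        irr_priority = self._priority(self._highest_vector(self.irr))
        return max(self.dispense_threshold(), irr_priority)

    def inta(self):
        """Model an INTA cycle: return a vector and update ISR/IRR."""
        pending = self._highest_vector(self.irr)
        # Interrupts of lower or equal priority are masked; with nothing
        # dispensable, the spurious vector is returned and ISR is untouched.
        if pending is None or self._priority(pending) <= self.dispense_threshold():
            return self.spurious_vector
        bit = 1 << pending
        self.isr |= bit
        if not self.tmr & bit:      # edge: IRR cleared when ISR is set
            self.irr &= ~bit
        return pending

    def eoi(self):
        """EOI write: clear the highest priority bit set in ISR."""
        in_service = self._highest_vector(self.isr)
        if in_service is not None:
            self.isr &= ~(1 << in_service)
```

In this model a level triggered request keeps its IRR bit set until level_deassert() is called, which corresponds to the "level deassert" message sent by the source I/O unit.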
6.10 Remote Read Register

Since all 82489DX Local Units would typically occupy the same address range, an 82489DX local unit's registers can only be accessed by the local processor. From a system debugging point of view, this would mean that a large amount of state would become inaccessible if the corresponding processor hangs for whatever reason. To assist in the debugging of MP systems, the 82489DX supports a mechanism that provides read-only access to any register in any other 82489DX local unit in the system.

To read any register in a "remote" 82489DX Local Unit, the processor writes to the Interrupt Command Register specifying a Delivery Mode equal to "Remote Read". The remote 82489DX is specified in the Destination field of the Interrupt Command Register in the usual fashion. Debug software should make sure that this selects a single 82489DX only, for example by using the target's 82489DX Local Unit ID in physical destination mode. Since no vector is associated with remote register access, the Vector field in the Interrupt Command Register is used to select the individual remote 32-bit register to be read. The selector value corresponds to the address (offset) of the register in the local 82489DX's address space.

Sending a "Remote Read" command results in sending a message on the ICC bus. The destination 82489DX responds by placing the 32-bit content of the selected register on the ICC bus. This value is read by the sending 82489DX and placed in its Remote Register, where software can get at it using regular register access to its 82489DX Local Unit. The Remote Register is software read-only. The contents of the Remote Register are valid when the Delivery Status in the Interrupt Command Register has become "Idle" again.

Remote Read Register

Bits [31:0]

Figure 11. Remote Register

Bits [31:0]: Contain the contents of the Remote Read Register.
6.11 82489DX Local Configuration

LOCAL VERSION REGISTER

Each 82489DX Local Unit contains a hardware Version Register that identifies this 82489DX Local Unit version. This register is read-only.

Local Version Register

Bits [31:8]    Bits [7:0]

Figure 12. Local Version Register

Version: [Bits 7-0] This is a version number that identifies this version. This field is hardwired and is read-only. It reads as "1" for the 82489DX.

Bits [31:8]: Reserved.

6.12 82489DX Timer Registers

Overview

The 82489DX Local Unit contains one 32-bit wide programmable binary timer for use by the local processor. The timer can select its clock base from one of three possible clock inputs. The timer can be programmed to operate in either one-shot mode or periodic mode. The timer can be configured to interrupt the local processor with a vector.

Time Base

The 82489DX has two independent clock input pins:
1. The CLK pin provides the clock signal that drives the 82489DX's internal operation.
2. The TMBASE pin allows an independent clock signal to be connected to the 82489DX for use by the timer functions.

Signals from both CLK and TMBASE can be used as clock inputs that feed the timer. In addition, the 82489DX contains a divider that can be configured to divide either input clock signal. The divider can be programmed to divide the selected input clock by 2, 4, 8, or 16. CLK, TMBASE, and the output of the divider together provide three time bases: Base 0, Base 1, and Base 2. Base 0 is always equal to CLK; Base 1 is always equal to TMBASE; and Base 2 is one of: CLK/2, CLK/4, CLK/8, CLK/16, TMBASE/2, TMBASE/4, TMBASE/8, or TMBASE/16. The timer can independently select one of these three time bases as its clock input, as depicted in the following diagram.

[Block diagram: CLK feeds Base 0; TMBASE feeds Base 1; CLK or TMBASE, through the divide by 2, 4, 8, 16 divider, feeds Base 2]

Figure 13. Time Bases

Bits [31:3]    Bit 2    Bits [1:0]

Figure 14. Divider Configuration Register

Bits [31:3]: Reserved. They should be written 0.

Divider Input: [Bit 2] Selects whether the divider's input connects to the 82489DX Local Unit's CLK pin or TMBASE pin.
0: the divider takes its input signal from CLK.
1: the divider takes its input signal from TMBASE.

Divide By: [Bits 1,0] This field selects by how much the divider divides.
00: divide by 2
01: divide by 4
10: divide by 8
11: divide by 16

Initial Count Register
Bits [31:0] Initial Count

Current Count Register
Bits [31:0] Current Count

Figure 15. Initial Count and Current Count Registers

Initial Count: Software writes to this register to set the initial count for the timer. This register can be written at any time. When written, its value is copied to the Current Count Register and countdown starts or continues from there. The Initial Count Register is read-write by software.

A timer set up with its interrupt masked is useful as a time base that can be sampled by the local processor by reading the Current Count Register, for the purpose of measuring intervals. By mapping the 82489DX's register space into a read-only user page, safe and efficient performance monitoring of user programs can be supported.

If necessary, software may want to ensure that periodic timer interrupts on the different 82489DX Local Units are staggered such that the 82489DXs don't all deliver their interrupt (e.g., a time slice interrupt) to their local processor at the same time. This staggering avoids bursts of contention for shared resources (bus, cache lines, dispatch queue, locks). Randomness occurring "naturally" may be sufficient to ensure staggering.

Timer

Software starts a timer going by programming its Initial Count Register. The timer copies this value into the Current Count Register and starts counting down at the rate of one count for each time base pulse. The time base is one of Base 0, Base 1, or Base 2.
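The relationship between the Divider Configuration Register and the three time bases can be checked with a short calculation. The Python sketch below is illustrative only; the function names and the clock frequencies used in the usage note are hypothetical and not taken from the datasheet.

```python
def decode_divider_config(value):
    """Decode bits [2:0] of the Divider Configuration Register.

    Returns (input_pin, divisor). Bits [31:3] are reserved and ignored here.
    """
    source = "TMBASE" if value & 0b100 else "CLK"  # bit 2: Divider Input
    divisor = 2 << (value & 0b11)                  # bits [1:0]: 2, 4, 8, 16
    return source, divisor


def time_base_hz(base, divider_config, clk_hz, tmbase_hz):
    """Frequency of Base 0, Base 1, or Base 2 for the given pin clocks."""
    if base == 0:
        return clk_hz                 # Base 0 is always CLK
    if base == 1:
        return tmbase_hz              # Base 1 is always TMBASE
    if base == 2:
        source, divisor = decode_divider_config(divider_config)
        input_hz = tmbase_hz if source == "TMBASE" else clk_hz
        return input_hz / divisor     # Base 2 is the divider output
    raise ValueError("base must be 0, 1, or 2")
```

For example, with an assumed 32 MHz CLK and a divider configuration of 001 (CLK divided by 4), time_base_hz(2, 0b001, 32_000_000, 1_000_000) evaluates to 8 MHz, so the timer would decrement once every 125 ns when it selects Base 2.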
The timer has a programmable mode, which can be One-Shot or Periodic. After the timer reaches zero in One-Shot mode, the timer simply stays at zero until it is reprogrammed. In Periodic mode, the timer automatically reloads its Current Count from the Initial Count and starts counting down again.

Current Count: This is the current count of the timer. It is read-only by software and can be read at any time.

For the timer, interrupt generation can be disabled or enabled, and an arbitrary interrupt vector can be specified. When enabled and the timer reaches zero, an interrupt is generated at the 82489DX Local Unit. Timer generated interrupts are always treated as edges. They can only generate maskable interrupts to the local processor. The timer is configured via its Local Vector Table entry shown below (see also Interrupt Control in this section).

TIMER VECTOR TABLE

Bits [31:20] | Bits [19:18] | Bit 17 | Bit 16 | Bits [15:13] | Bit 12 | Bits [11:8] | Bits [7:0]

Figure 16. Local Vector Table: Timer Entry

Vector: [Bits 7-0] This is the 8-bit interrupt vector to be used when the timer generates an interrupt.

Bits 11-8: Reserved. Should be written 0.

Delivery Status: [Bit 12] Delivery Status is a 1-bit field that contains the current status of the delivery of this interrupt. Two states are defined:
0: (Idle) means that there is currently no activity for this interrupt.
1: (Send Pending) indicates that the interrupt has been injected, but its delivery is temporarily held up by other recently injected interrupts that are in the process of being delivered.
Delivery Status is software read-only; software writes to this field (as part of a 32-bit word) do not affect this bit.

Bits 15-13: Reserved. Should be written 0.

MASK: [Bit 16] This bit serves to mask timer interrupt generation.
0: not masked; when the timer reaches 0, it generates an interrupt with the programmed vector at the 82489DX Local Unit.
1: masked; no interrupt is generated.
Timer Mode: [Bit 17] This field indicates the operation mode of the timer.
0: (One-Shot) the Current Count Register remains at zero after the timer reaches zero, and software needs to rewrite the timer's Initial Count Register to rearm the timer.
1: (Periodic) when the timer reaches zero, the Current Count Register is automatically reloaded with the value in the Initial Count Register, and the timer counts down again.

Timer Base: [Bits 19,18] This field selects the time base input to be used by the timer.
00: (Base 0) uses "CLKIN" as input.
01: (Base 1) uses "TMBASE".
10: (Base 2) uses the output of the divider.

Bits [31:20]: Reserved. Should be written 0.

7.0 82489DX I/O UNIT REGISTERS

REGISTERS ADDRESSING SCHEME

The I/O Unit indirect addressing scheme uses two registers directly mapped into the processor's address space: the I/O Register Select register and the I/O Window register. The I/O Register Select register selects which I/O unit register appears in the I/O Window register, where it can be manipulated by software.

I/O Register Select Register

Bits [31:8]    Bits [7:0]

Figure 17. I/O Register Select Register

Bits [31:8]: Reserved. Should be written 0.

Bits [7:0]: I/O REGISTER SELECT: This field selects an 82489DX I/O unit register. The contents of the selected 32-bit register can be manipulated via the I/O Window Register. The I/O Register Select register is read-write by software.

I/O Window Register

Bits [31:0]

Figure 18. I/O Window Register

Bits [31:0]: I/O WINDOW REGISTER: This register is mapped onto the I/O Unit's register selected by the I/O Register Select register. Readability/writability by software is determined by the I/O unit register that is currently selected.

The addresses (offsets to a platform-defined base address) of all registers are listed in the register summary section. Note that register offsets are aligned on 128-bit boundaries; in other words, registers are located only at every fourth 32-bit address.
This eliminates the need for lane-steering glue logic when connecting the 82489DX's 32-bit data bus to a wider (64-bit or 128-bit) bus.

82489DX I/O UNIT CONFIGURATION

I/O Unit ID Register

Each 82489DX I/O Unit has a register that contains the I/O Unit's 8-bit ID. The I/O unit ID serves as the physical name of the 82489DX I/O Unit. It is used in arbitrating for ICC bus ownership when the I/O unit wants to access the ICC bus to send an interrupt message. Unlike the local unit ID, the I/O unit ID is not latched in from the address bus during hardware reset. The I/O unit ID is set to 0 during reset. Software must write a different ID into each I/O Unit before interrupt messages are started on the ICC bus.

I/O Unit ID

Bits [31:24]    Bits [23:0]

Bits [31:24]: I/O Unit ID: The I/O unit ID serves as the physical "name" of the 82489DX unit, used for arbitration for ICC bus usage. In a system with, say, four 82489DXs, there are 4 Local Units and 4 I/O Units. All 8 units should be assigned different IDs. The IDs should start with 0.

Bits [23:0]: Reserved. Should be written 0.

I/O Unit Version Register

Each 82489DX I/O Unit contains a hardware Version Register that identifies this 82489DX I/O unit version. This register is read-only.

I/O Unit Version Register

Version: [Bits 7-0] This is a version number that identifies this version. This field is hardwired and is read-only. It reads as "1" for the 82489DX.

Bits [15:8]: Reserved.

Max Redir Entry: [Bits 23-16] This is the entry number (0 being the lowest entry) of the highest entry in the Redirection Table. It is equal to the number of Interrupt Input Pins of this I/O Unit minus one. This field is hardwired and is read-only. In the 82489DX I/O unit it reads as 15.

Bits [31:24]: Reserved.

I/O UNIT INTERRUPT SOURCE REGISTERS

Redirection Tables

The Redirection Table has a dedicated entry for each interrupt input pin.
Unlike IRQ pins of the 8259A, the notion of interrupt priority is completely unrelated to the position of the physical interrupt input pin on the 82489DX. Instead, software can decide for each pin individually what it wants the vector (and therefore the priority) of the corresponding interrupt to be. For each individual pin, the operating system can also specify whether the interrupt is signaled as edges or levels, as well as the destination and delivery mode of the interrupt. The information in the Redirection Table is used to translate the interrupt manifestation on the corresponding interrupt pin into an inter-82489DX message.

In order for a signal on an edge-sensitive Interrupt Input pin to be recognized as a valid edge (and not a glitch), the input level on the pin must remain asserted until the 82489DX I/O Unit sends the corresponding message over the ICC bus. Only then will the source 82489DX be able to recognize a new edge on that Interrupt Input pin. That new edge will only result in a new invocation of the handler if its acceptance by the destination 82489DX causes the Interrupt Request Register bit to go from 0 to 1. (In other words, if the interrupt wasn't already pending at the destination.)

The 82489DX I/O unit has 16 Redirection Table entries. The layout of an entry in the Redirection Table is as follows:

Redirection Table Entry

Bits [31:17] | Bit 16 | Bit 15 | Bit 14 | Bit 13 | Bit 12 | Bit 11 | Bits [10:8] | Bits [7:0]

Vector (Bits [7:0])            Interrupt vector for this interrupt
Delivery Mode (Bits [10:8])    000: Fixed
                               001: Lowest Priority
                               010: <reserved>
                               011:
                               100: NMI
                               101: Reset
                               110: <reserved>
                               111: ExtINT
Destination Mode (Bit 11)      0: Physical
                               1: Logical
Delivery Status (Bit 12)       0: Idle
                               1: Send Pending
Bit 13                         Reserved. Should be written 0.
Remote IRR (Bit 14)            Reflects the Remote IRR bit
                               0: Remote IRR bit is clear.
                               1: Remote IRR bit is set.
Trigger Mode (Bit 15)          0: Edge
                               1: Level
Mask (Bit 16)                  0: Not Masked
                               1: Masked
Bits [31:17]                   Reserved. Should be written 0.

DESCRIPTIONS

Vector: [Bits 7-0] The vector field is an 8-bit field containing the interrupt vector for this interrupt.

Delivery Mode: [Bits 10-8] The Delivery Mode is a 3-bit field that specifies how the 82489DXs listed in the destination field should act upon reception of this signal. Note that remote read is not supported for I/O device interrupts. Note that certain Delivery Modes will only operate as intended when used in conjunction with a specific Trigger Mode. These restrictions are indicated for each Delivery Mode.

000: (Fixed) means deliver the signal on the INT pin of all processors listed in the destination. Trigger Mode for "fixed" Delivery Mode can be edge or level.

001: (Lowest Priority) means deliver the signal on the INT pin of the processor that is executing at the lowest priority among all the processors listed in the specified destination. Trigger Mode for "lowest priority" Delivery Mode can be edge or level.

100: (NMI) means deliver the signal on the NMI pin of all processors listed in the destination; vector information is ignored. A Delivery Mode equal to "NMI" requires a "level" Trigger Mode.

101: (Reset) means deliver the signal to all processors listed in the destination by asserting/deasserting the 82489DX's Reset output pin. All addressed 82489DXs' Local Units will assume their reset state but preserve their unit ID. One side effect of a unit message with Delivery Mode equal to "Reset" that results in a deassert of reset is that all 82489DXs' Local Units (whether listed in the destination or not) will reset their lowest-priority tie breaker arbitration ID to their unit ID (see the section on the ICC bus for details). A Delivery Mode of "Reset" requires a "level" Trigger Mode.
111: (ExtINT) means deliver the signal to the INT pin of all processors listed in the destination as an interrupt that originated in an externally connected (8259A-compatible) interrupt controller. The Local Unit receiving this interrupt will activate ExtINTA in response to this interrupt message. A Delivery Mode of "ExtINT" requires an "edge" Trigger Mode. (See the section on Compatibility for details.)

Destination Mode: [Bit 11] This field determines the interpretation of the Destination field.
0: (Physical Mode) In Physical Mode, a destination 82489DX Local Unit is identified by its unit ID. Bits 56 through 63 (8 MSB of the destination field) specify the 8-bit unit ID.
1: (Logical Mode) In Logical Mode, destinations are identified by matching on Logical Destination under the control of the Destination Format Register in each 82489DX Local Unit. The 32-bit Destination field is the logical destination.

Delivery Status: [Bit 12] Delivery Status is a 1-bit field that contains the current status of the delivery of this interrupt. Two states are defined:
0: (Idle) means that there is currently no activity for this interrupt.
1: (Send Pending) indicates that the interrupt has been injected, but its delivery is temporarily held up by other recently injected interrupts that are in the process of being delivered.
Delivery Status is software read-only; software writes to this field (as part of a 32-bit word) do not affect this bit.

Bit 13: Reserved. Should be written 0.

Remote IRR: [Bit 14] This bit is used for level triggered interrupts; its meaning is undefined for edge-triggered interrupts. Remote IRR mirrors the interrupt's IRR bit of the destination 82489DX Local Unit. When the value of the bit disagrees with the state of the Interrupt Input line, a unit message is automatically sent to make the destination's IRR bit reflect the new state of the Interrupt Input line, and the Remote IRR bit is then updated to track its associated IRR bit.
Remote IRR is software read-only; software writes to this bit do not affect it.

Trigger Mode: [Bit 15] The Trigger Mode field indicates the type of signal on the interrupt pin that triggers an interrupt.
0: indicates edge sensitive.
1: indicates level sensitive.

Mask: [Bit 16] Use this bit to mask injection of this interrupt.
0: indicates that injection of this interrupt is not masked. An edge or level on an interrupt pin that is not masked results in the delivery of the interrupt to the destination.
1: indicates that injection of this interrupt is masked. Edge-sensitive interrupts signaled on a masked Interrupt Input pin are simply ignored (i.e., they are not delivered and are not held pending). Level asserts or deasserts occurring on a masked level-sensitive pin are also ignored and have no side effects. As expected, changing the mask bit from unmasked to masked while the level remains asserted has the side effect of deasserting the level. It is software's responsibility to deal with the case where the Mask bit is set after the interrupt message has been sent but before the interrupt is dispensed to the processor.

Bits [31:17]: Reserved. Should be written 0.

Destination: [Bits 63:32] If the Destination Mode of this entry is "Physical Mode", then the 8 MSB [bits 56 through 63] contain an 82489DX Local Unit ID. If Logical Mode, then the Destination field potentially defines a set of processors. The interpretation of the 32-bit destination field is further controlled by the Destination Format Register in the 82489DX Local Units.

Note that it is likely in MP systems that additional processors will be located on plug-in boards. Since the ICC bus would be part of the connector, the 82489DX to ICC bus connection is defined so that it can be electrically isolated using external drivers. The 82489DX has separate ICC bus input and output pins that can be connected externally to the 82489DX to either provide or not provide isolation.
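Returning to the Redirection Table entry layout described in section 7.0, the field positions and the Trigger Mode restrictions can be expressed as a small encoder. The Python sketch below is illustrative only: the function name, argument names, and the DELIVERY_MODES dictionary keys are hypothetical, and the read-only Delivery Status and Remote IRR bits are simply left at 0. How the resulting 64-bit value is written through the I/O Register Select and I/O Window registers depends on register offsets from the register summary section and is not shown.

```python
# Delivery Mode encodings from the Redirection Table entry description.
DELIVERY_MODES = {
    "fixed": 0b000,
    "lowest_priority": 0b001,
    "nmi": 0b100,
    "reset": 0b101,
    "extint": 0b111,
}


def encode_redirection_entry(vector, delivery_mode, logical, level,
                             masked, destination):
    """Build the 64-bit value of one Redirection Table entry.

    destination is an 8-bit unit ID in physical mode, or the 32-bit
    logical destination in logical mode.
    """
    if not 0 <= vector <= 0xFF:
        raise ValueError("vector must fit in 8 bits")
    mode = DELIVERY_MODES[delivery_mode]
    # Trigger Mode restrictions stated for each Delivery Mode.
    if delivery_mode in ("nmi", "reset") and not level:
        raise ValueError("NMI and Reset require a level Trigger Mode")
    if delivery_mode == "extint" and level:
        raise ValueError("ExtINT requires an edge Trigger Mode")

    low = vector                 # bits [7:0]   Vector
    low |= mode << 8             # bits [10:8]  Delivery Mode
    low |= int(logical) << 11    # bit 11       Destination Mode
    low |= int(level) << 15      # bit 15       Trigger Mode
    low |= int(masked) << 16     # bit 16       Mask

    if logical:
        high = destination & 0xFFFFFFFF   # bits [63:32] logical destination
    else:
        high = (destination & 0xFF) << 24  # bits [63:56] unit ID
    return (high << 32) | low
```

For example, a level triggered, lowest priority interrupt with vector 123 (priority 7:0) directed in physical mode at unit ID 3 encodes as 0x030000000000817B.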
8.0 ICC BUS DEFINITION

Physical Characteristics

The ICC bus is a 5-wire synchronous bus connecting all 82489DXs (all I/O Units and all Local Units). Four of these five wires are used for data transmissions and arbitration, and one wire is a clock.

The description refers to the logical state of the ICC bus. Electrical levels are the inverse of the logical state described. For example, this section describes the ICC bus as 0000 when not transmitting any message. This refers to the logical state. Electrically, the ICC bus is 1111 when not transmitting any message.

The isolation can also be used to provide a hierarchical connection of ICC buses electrically supporting large numbers of processors. The number of 82489DXs supported using the hierarchical connection is limited only by ICC bus bandwidth. Note that the ICC bus output low current is just 4 mA.

The bus is electrically an open-drain connection providing for both bus use arbitration and arbitration for lowest priority. Being open-drain, the bus is run at a "comfortable" speed such that design-specific termination tuning is not required. Furthermore, each 82489DX receiving a message or participating in an arbitration must be given enough time in a single bus cycle to latch the bus and perform some simple logic operations on the latched information in order to determine whether the next drive cycle must be inhibited.

Figure 20. ICC Bus: Simple Direct Connection
Figure 21. ICC Bus: Hierarchical Connection

Bus Arbitration

Arbitration (both for use of the bus and for determining the lowest priority 82489DX) depends on all 82489DX message units operating synchronously. To deal with the event where multiple agents start transmitting simultaneously, a distributed arbitration approach is used.

Bus arbitration uses a small number of arbitration cycles on the ICC bus. During these cycles, arbitration losers progressively drop off the bus until only the winner remains transmitting. The winner then transmits its actual inter-unit message. Once the sending of a message (including bus arbitration) has started, any possible contender must suppress transmission until enough cycles have elapsed for the message to be fully sent. The number of message cycles depends on the type of message being sent.

A bus arbitration cycle starts with the agent driving its unit ID on the ICC bus. High-order ID bits are driven first, with successive cycles proceeding to the low bits of the ID. All losers in a given cycle drop off the bus, using every subsequent cycle as a tie breaker for the previous cycle. By the time all arbitration cycles are completed, there will be only a single agent left driving the bus.

The 8-bit unit ID (I7 I6 I5 I4 I3 I2 I1 I0) is chopped up in successive groups of 2 bits (I7 I6)(I5 I4)(I3 I2)(I1 I0). Each of these tuples is first decoded before being driven on the bus. The 0s and 1s indicate logical levels and not signal levels. The ICC bus is 0000 when not transmitting any message. The decoding used is:

    ID Tuple          ICC Bus
    I(i+1)  I(i)      B3  B2  B1  B0
      0      0         0   0   0   1
      0      1         0   0   1   0
      1      0         0   1   0   0
      1      1         1   0   0   0

Note that the pattern generated on the ICC bus by tuple (I3 I2) will be represented as i32 i32 i32 i32. The lower case signifies this encoding.

Each tuple of the ID only contributes to a single wire, making it possible for an agent to determine with certainty whether to "drop off" or to continue arbitrating in the next cycle for the following two bits of the unit ID, simply by checking whether the bus line the agent is driving is also the highest order 1 on the bus. Each ICC bus cycle therefore arbitrates 2 bits.
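The arbitration procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a model of the actual silicon: the function names are hypothetical, the bus is treated as a logical wired-OR of all drivers, and the unit IDs are assumed to be unique. Under those assumptions, the agent with the numerically highest ID is the one left driving the bus after four cycles.

```python
def tuple_to_wires(two_bits):
    """Decode a 2-bit ID tuple into its one-hot pattern on wires B3..B0."""
    return 1 << two_bits  # 00 -> 0001, 01 -> 0010, 10 -> 0100, 11 -> 1000


def id_tuples(unit_id):
    """Split an 8-bit unit ID into the tuples (I7 I6), (I5 I4), (I3 I2), (I1 I0)."""
    return [(unit_id >> shift) & 0b11 for shift in (6, 4, 2, 0)]


def arbitrate(unit_ids):
    """Return the unit ID still driving the bus after four arbitration cycles.

    Assumes unit_ids is non-empty and contains no duplicates.
    """
    contenders = list(unit_ids)
    for cycle in range(4):
        # Every remaining contender drives the decoded tuple for this cycle.
        driven = {uid: tuple_to_wires(id_tuples(uid)[cycle]) for uid in contenders}
        bus = 0
        for pattern in driven.values():
            bus |= pattern  # logical wired-OR of all drivers
        highest = 1 << (bus.bit_length() - 1)  # highest order 1 on the bus
        # An agent stays in only if the line it drives is that highest 1.
        contenders = [uid for uid in contenders if driven[uid] == highest]
    return contenders[0]


winner = arbitrate([0x12, 0x47, 0x46])
```

In this example 0x12 drops off in the first cycle, 0x46 and 0x47 tie for two more cycles, and the last tuple breaks the tie.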
    1:  i76  i76  i76  i76    ICC bus arbitration
    2:  i54  i54  i54  i54
    3:  i32  i32  i32  i32
    4:  i10  i10  i10  i10

Lowest-Priority Arbitration

Arbitration is also used to find the 82489DX Local Unit with the lowest processor priority. Lowest-priority arbitration uses the 82489DX's Processor Priority value appended with an 8-bit Arbitration ID (Arb ID) to break ties in case there are multiple units executing at the lowest priority.

Using the constant 8-bit unit ID as the Arb ID has a tendency to skew symmetry, since it would favor 82489DXs with low ID values. An 82489DX Local Unit's Arb ID is therefore not the unit ID itself but is derived from it. At reset, an 82489DX Local Unit's Arb ID is equal to its unit ID. Each time a message is broadcast over the ICC bus in lowest priority mode, all 82489DX Local Units increment their Arb ID by one, which gives them a different Arb ID value for the next arbitration. The Arb ID is then endian-reversed (LSB becomes MSB, etc.) to ensure better rotation of which 82489DX gets to have the lowest Arb ID next time around. The reversed Arb ID is then decoded to generate arbitration signals on the ICC bus as described above.

To support hot insertion of processor boards in a running MP system, a mechanism is provided to allow the 82489DX of the added processor to synchronize its Arb ID with the existing 82489DXs. This is accomplished by broadcasting a message with Delivery Mode equal to "Reset", Trigger Mode equal to "Level", and Level equal to 0. This message must be broadcast before the newly added 82489DX is allowed to participate in a lowest-priority arbitration. Depending on the exact sequence under which the newly inserted board is powered up and initialized, this Arb ID synchronization may occur naturally if a Reset-deassert to the new 82489DX is part of that sequence.
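The Arb ID rotation described in this section can be illustrated with a short Python sketch. The class and method names are hypothetical, and wrapping the incremented Arb ID at 8 bits is an assumption; the text only states that the Arb ID is incremented by one and then bit-reversed before being decoded for arbitration.

```python
def reverse_bits8(value):
    """Reverse the bit order of an 8-bit value (LSB becomes MSB, etc.)."""
    result = 0
    for _ in range(8):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


class ArbId:
    """Hypothetical model of one Local Unit's Arbitration ID."""

    def __init__(self, unit_id):
        # At reset, the Arb ID is equal to the unit ID.
        self.value = unit_id & 0xFF

    def on_lowest_priority_message(self):
        # Every Local Unit increments its Arb ID on each lowest-priority
        # broadcast. Wrapping at 8 bits is an assumption of this sketch.
        self.value = (self.value + 1) & 0xFF

    def arbitration_value(self):
        # The value actually decoded onto the bus is the bit-reversed Arb ID.
        return reverse_bits8(self.value)


unit = ArbId(0x03)
unit.on_lowest_priority_message()
rotated = unit.arbitration_value()
```

Because the increment changes the low-order bits fastest and the reversal moves them to the high-order positions, consecutive arbitrations see widely separated values.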
If not, the local processor can always send this as an interprocessor interrupt (with a null destination), causing only the side effect of resetting all 82489DX Arb IDs.

ICC BUS MESSAGE FORMATS

The short message format is described first. Note that the first 19 cycles of both short and long message formats have the same interpretation.

     1:  i76  i76  i76  i76    ICC bus arbitration
     2:  i54  i54  i54  i54
     3:  i32  i32  i32  i32
     4:  i10  i10  i10  i10
     5:  DM   M2   M1   M0     destination mode and delivery mode
     6:  "0"  "0"  L    TM     control bits
     7:  V7   V6   V5   V4     vector
     8:  V3   V2   V1   V0
     9:  D31  D30  D29  D28    destination
    10:  D27  D26  D25  D24
    11:  D23  D22  D21  D20
    12:  D19  D18  D17  D16
    13:  D15  D14  D13  D12
    14:  D11  D10  D09  D08
    15:  D07  D06  D05  D04
    16:  D03  D02  D01  D00
    17:  C    C    C    C      checksum for cycles 5 through 16
    18:  "1"  "1"  "1"  "1"    postamble
    19:  A    A    A    A      accept (1000 if OK, 1110 if preempt, else error)
    20:  "0"  "0"  "0"  "0"    idle 1
    21:  "0"  "0"  "0"  "0"    idle 2

Cycles 1 through 4 are bus arbitration as described earlier.

Cycle 5 (DM M2 M1 M0) carries the Destination Mode, which is 0 for Physical Mode and 1 for Logical Mode, and the Delivery Mode of the message. The encoding used for the Delivery Mode in the message is identical to the encoding used for the Delivery Mode in the Redirection Table, Local Vector Table, and Interrupt Command Register.

    M2  M1  M0    Delivery Mode
    0   0   0     Fixed
    0   0   1     Lowest Priority
    0   1   0
    0   1   1     Remote Read
    1   0   0     NMI
    1   0   1     Reset
    1   1   1     ExtINT

Cycle 6 contains the Control Bits of the message. The control bits are:
• TM (Trigger Mode): indicates whether this message corresponds to an edge or level;
• L (Level): indicates whether this is an Assert or a Deassert of a "level" signal. L is undefined when TM is edge.

    6:  "0"  "0"  L  TM    Control Bits
        TM = Trigger Mode (0 = edge, 1 = level)
        L  = Level (0 = deassert, 1 = assert)

The length of the message is derived from the Delivery Mode, the Control Bits, and the Accept cycle of the message.
    TM/L (AAAA)        Edge            Level = Assert    Level = Deassert
    Fixed              Short           Short             Short
    Lowest Priority    Short (1110)    Short (1110)      Short (1110)
                       Long (1000)     Long (1000)       Short
    Remote Read        Long            Long              Short
    NMI                Short           Short             Short
    Reset              Short           Short             Short
    ExtINTA            Short           Short             Short

Cycles 7 and 8 are the 8-bit interrupt vector. The vector is only defined for Delivery Modes Fixed and Lowest Priority. For Delivery Mode "Remote Read", the vector field contains the address of the register to be read remotely.

If DM is 0 (physical mode), then cycles 9 and 10 are the unit ID and cycles 11 through 16 are zero. If DM is 1 (logical mode), then cycles 9 through 16 are the 32-bit Destination field. The interpretation of the logical mode 32-bit Destination field is performed by the Local Units using the Destination Format Register. The sending 82489DX knows whether it should (incl) or should not (excl) respond to its own message.

Cycle 17 is a checksum over the data in cycles 5 through 16. The checksum is computed by adding all 4-bit quantities of cycles 5 through 16, feeding the carry out of the MSB back into the LSB. This protects the data in these cycles against transmission errors. The (single) 82489DX driving the message provides this checksum in cycle 17.

Cycle 18 is a postamble cycle driven as 1111 by the sending 82489DX, allowing all 82489DXs to perform various internal computations based on the information contained in the received message. One of the computations takes the computed checksum of the data received in cycles 5 through 16 and compares it against the value in cycle 17. If any 82489DX computes a different checksum than the one passed in cycle 17, then that 82489DX will signal an error on the ICC bus in cycle 19 by driving it as 1111. If this happens, all 82489DXs will assume the message was never sent, and the sender must try sending the message again, which includes re-arbitrating for the ICC bus.
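The checksum rule for cycle 17 can be sketched as follows. This is an illustrative Python sketch with hypothetical function names; it assumes the four wires of each cycle are read as one 4-bit value with B3 as the MSB, and that the end-around carry is folded back after every addition (folding once at the end gives the same result for this kind of sum).

```python
def icc_checksum(nibbles):
    """Add 4-bit values, feeding the carry out of the MSB back into the LSB."""
    total = 0
    for nibble in nibbles:
        total += nibble & 0xF
        if total > 0xF:
            total = (total & 0xF) + 1  # end-around carry
    return total


def message_nibbles(dest_mode, delivery_mode, level, trigger_mode, vector, destination):
    """Build the twelve 4-bit values carried in cycles 5 through 16."""
    cycle5 = ((dest_mode & 1) << 3) | (delivery_mode & 0b111)  # DM M2 M1 M0
    cycle6 = ((level & 1) << 1) | (trigger_mode & 1)           # "0" "0" L TM
    vector_cycles = [(vector >> 4) & 0xF, vector & 0xF]        # V7..V4, V3..V0
    dest_cycles = [(destination >> shift) & 0xF for shift in range(28, -4, -4)]
    return [cycle5, cycle6] + vector_cycles + dest_cycles


# Fixed delivery, physical mode, edge triggered, vector 0x41, unit ID 0x05.
# In physical mode the unit ID occupies cycles 9 and 10 (D31..D24).
nibbles = message_nibbles(0, 0b000, 0, 0, 0x41, 0x05 << 24)
checksum = icc_checksum(nibbles)
```

A receiver would run the same `icc_checksum` over the values it latched in cycles 5 through 16 and compare the result with the value seen in cycle 17.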
In lowest priority delivery when the interrupt has a focus processor, the focus 82489DX will signal this by driving 1110 during cycle 19. This tells all the other 82489DXs that the interrupt has been accepted, that priority arbitration is preempted, and that short message format is used. All (non-focus) 82489DXs will drive 1000 in cycle 19. Under lowest priority mode, 1000 implies that the interrupt currently has no focus processor and that priority arbitration is required to complete the delivery. In that case, long message format is used. If cycle 19 is 1000 for non Lowest Priority mode, then the message has been accepted and is considered sent.

    19:  E E E E    1000        OK
                    1110        preempt
                    <others>    error (drive error as 1111)

When an 82489DX detects and reports an error during the error cycle, that 82489DX will simply listen to the bus until it encounters two consecutive idle (0000) cycles. These two idle cycles indicate that the message has passed and a new message may be started by anyone. This allows an 82489DX that got itself out of cycle on the ICC bus to get back in sync with the other 82489DXs.

Long Message Format

Cycles 1 through 19 of the long message format are identical to cycles 1 through 19 of the short message format. As mentioned, long message format is used in two cases:

(1) Lowest Priority delivery when the interrupt does not have a focus. Cycles 20 through 27 are eight arbitration cycles where the destination 82489DXs determine the one 82489DX with the lowest processor priority/Arb ID value.

(2) Remote Read messages. Cycles 20 through 27 are the 32-bit content of the remotely read register. This information is driven on the bus by the remote 82489DX.

Cycle 28 is an Accept cycle. In lowest priority delivery, all 82489DXs that did not win the arbitration (including those that did not participate in the arbitration) drive cycle 28 with 1000 (no accept), while the winner 82489DX drives 1111.
If cycle 28 reads 1111, then all 82489DXs know that the interrupt has been accepted and the message is considered delivered. If cycle 28 reads 1100 (or anything but 1111 for that matter), then all 82489DXs assume the message was unaccepted or an error occurred during arbitration. The message is considered undelivered, and the sending 82489DX will try delivering the message again.

For Remote Read messages, cycle 28 is driven as 1100 by all 82489DXs except the responding remote 82489DX, which drives the bus as 1111 if it was able to successfully supply the requested data in cycles 20 through 27. If cycle 28 reads 1111, the data in cycles 20 through 27 is considered valid; otherwise, the data is considered invalid. The source 82489DX that issued the Remote Read uses cycle 28 to determine the state of the Remote Read Status field in the Interrupt Command Register (valid or invalid). In any case, a Remote Read request is always successful (although the data may be valid or invalid) in that a Remote Read is never retried. The reason for this is that Remote Read is a debug feature, and a "hung" remote 82489DX that is unable to respond should not cause the debugger to hang.

Cycles 29 and 30 are two idle cycles. The ICC bus is available for sending the next message at cycle 31. The two idle cycles at the end of both short and long messages, together with the non-zero (i.e., non-idle) encoding for certain other bus cycles, allow an ICC bus agent that happens to be out of phase by one cycle to sync back up in one message simply by waiting for two consecutive idle cycles after reporting its checksum error. This makes use of the fact that valid arbitration cycles are never 0000.
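The resynchronization rule can be sketched as a small scan over the bus values an agent observes after it has reported its error. This is an illustrative Python sketch only; the function name and the list-of-samples representation are assumptions, and each sample is the logical 4-bit bus state for one ICC clock.

```python
def cycles_until_resync(bus_samples):
    """Return the index of the first cycle at which a new message may start.

    bus_samples holds the logical bus state (0b0000..0b1111) for each cycle
    observed after the error was reported. Returns None if two consecutive
    idle (0000) cycles never appear in the samples.
    """
    idle_run = 0
    for index, sample in enumerate(bus_samples):
        idle_run = idle_run + 1 if sample == 0b0000 else 0
        if idle_run == 2:
            return index + 1  # the cycle right after the second idle cycle
    return None


# A single idle-looking cycle is not enough; two in a row are required.
observed = [0b1111, 0b0000, 0b1000, 0b0000, 0b0000, 0b0010]
restart_at = cycles_until_resync(observed)
```

Here the lone 0000 at index 1 is ignored, and the agent rejoins at index 5, immediately after the pair of idle cycles.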
     1:  i76  i76  i76  i76    ICC bus arbitration
     2:  i54  i54  i54  i54
     3:  i32  i32  i32  i32
     4:  i10  i10  i10  i10
     5:  DM   M2   M1   M0     delivery mode
     6:  "0"  "0"  L    TM     control bits
     7:  V7   V6   V5   V4     vector
     8:  V3   V2   V1   V0
     9:  D31  D30  D29  D28    destination
    10:  D27  D26  D25  D24
    11:  D23  D22  D21  D20
    12:  D19  D18  D17  D16
    13:  D15  D14  D13  D12
    14:  D11  D10  D09  D08
    15:  D07  D06  D05  D04
    16:  D03  D02  D01  D00
    17:  C    C    C    C      checksum for cycles 5 through 16
    18:  "1"  "1"  "1"  "1"    postamble
    19:  A    A    A    A      accept (1000 if OK, 1110 if preempt, else error)
    20:  p76  p76  p76  p76    lowest priority arbitration or 32 bits of remote register
    21:  p54  p54  p54  p54    processor priority
    22:  p32  p32  p32  p32
    23:  p10  p10  p10  p10
    24:  a76  a76  a76  a76
    25:  a54  a54  a54  a54    arbitration ID
    26:  a32  a32  a32  a32
    27:  a10  a10  a10  a10
    28:  A    A    A    A      accept
    29:  "0"  "0"  "0"  "0"    idle 1
    30:  "0"  "0"  "0"  "0"    idle 2

9.0 HARDWARE TIMINGS

This section covers the following:
- Timing Diagram Notation
- 82489DX Register Access Timing Diagrams with Descriptions

A block diagram of the configuration of the CPU module of an MP system is shown. It is in no way intended to be a complete representation of 486/Intel Cache/Intel Cache Controller connections. It is intended to show all the 82489DX connections, and how they connect to other components on and off the module.

This module has arbitrarily been drawn with a 64-bit data bus to show how the expanded address space architecture fits. The unit can be similarly attached to either a 32-bit or 128-bit data bus, with total transparency to shrink-wrap software.

In this configuration, the 82489DX uses the same clock source as the processor and cache. However, it is quite possible to treat the 82489DX as a memory bus device and hence supply it with the memory bus clock, which can be slower than the CPU module clock frequency.
In the configuration shown, the processor's INT and NMI pins could be supplied by another source, so that the 82489DX can be totally bypassed if desired by allowing those signals to be driven from off the module while the 82489DX is disabled.

The reset signal generated by the 82489DX goes to the MBC (memory bus controller), which is required to drive configuration lines at reset time. This would probably be configured as a "warm" reset by the MBC.

A future version of the cache controller may generate the chip select for the 82489DX at a fixed memory location of hex FEE00000. Having the cache controller provide the chip select signal would encourage a standard mapping for the 82489DX address space. In some MBC designs, this signal should be connected to the MBC, since 82489DX cycles limit bus pipelining by constraining how soon the next bus cycle can come. The 82489DX chip select can also be generated completely by the MBC.

The address, data and most of the bus control signals share the respective bus with the cache and cache controller. The block diagram shows attachment for only 6 address lines: A4-A9. A10 should be 0. This is all the 82489DX needs for operation; however, if the address lines are used to initialize the 82489DX local ID at reset time, 8 address lines are required, A3-A10.

82489DX INTERFACING TO THE ICC BUS

The 82489DX has separate ICC bus input and output pins to facilitate using external drivers. The ICC bus input pins (MBI0-3) are TTL-level compatible CMOS inputs. The output pins (MBO0-3) are open-drain pins which require external pull-ups. The open-drain output buffers are small buffers with a sink current of < 4 mA. Special consideration must be exercised when driving large capacitive loads or long transmission lines. The pull-up resistor and the capacitive load constitute an RC time constant that will affect the output transition times. This in turn will limit the operating frequency of the ICC bus.
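The effect of the RC time constant can be estimated with the standard first-order charging equation. The following Python sketch is only a rough lumped-RC approximation: it ignores package parasitics, buffer capacitance, and transmission line effects, and the numeric values used in the example (1 KΩ pull-up, 50 pF load, 5V supply, 0.4V starting level, 2V threshold) are illustrative assumptions rather than specified limits.

```python
import math


def rise_time_ns(r_pullup_ohms, c_load_pf, v_supply, v_start, v_target):
    """Time in ns for an RC node to charge from v_start to v_target.

    Uses v(t) = v_supply - (v_supply - v_start) * exp(-t / RC).
    """
    tau_ns = r_pullup_ohms * c_load_pf * 1e-3  # ohm * pF = 1e-12 s = 1e-3 ns
    return tau_ns * math.log((v_supply - v_start) / (v_supply - v_target))


# Illustrative values only: 1 K pull-up, 50 pF lumped load, 0.4 V to 2 V.
estimate = rise_time_ns(1000, 50, 5.0, 0.4, 2.0)

# Doubling the load capacitance doubles the estimate.
estimate_100pf = rise_time_ns(1000, 100, 5.0, 0.4, 2.0)
```

An estimate like this is useful for a first sanity check of a pull-up value; a circuit simulation that includes the real loads gives more trustworthy numbers.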
When designing in the ICC bus, one needs to consider the loads that each 82489DX will be driving and whether external drivers should be used. In most situations, the ICC bus driven high (MBO pins pulled high by the external pull-up resistors) poses the most challenge. Simulating the target design on an electrical simulator (such as SPICE) will help greatly, as shown in the following examples.

First Order Buffer Models

Figures 21a and 21b are first order input buffer and output buffer models of the MBI and MBO pins. The open drain of the MBO is modeled as a switch, as the primary interest here is the MBO pins going high. These models can be used in SPICE simulations to obtain first order behaviors. The parameters for these models are as follows:

    Cp (package capacitance) = 3 pF
    Lp (package inductance) = 15 nH
    Rb (bond wire resistance) = 0.05Ω
    Ci (input buffer capacitance) = 3 pF
    Co (output buffer capacitance) = 6 pF
    Ro (output buffer impedance) = 30Ω-80Ω

MBO Pull-up Resistor

To minimize the RC time constant, one would like to use the smallest pull-up resistor value possible. The MBO pins have a worst case IOL spec of 4 mA at VCC = 4.75V. This translates to a minimum pull-up of about 1 KΩ. Where stronger drive is needed (smaller pull-up resistors), external drivers must be used.

Driving Lumped Capacitance

In systems where external drivers are not used, the MBI pins will be tied to the MBO pins. Figure 21d is a SPICE simulation of the MBO output with a 1 KΩ pull-up driving lumped capacitive loads from 10 pF to 150 pF. At a load of 50 pF, it takes about 30 ns to charge up to 2V. At 100 pF, it takes an additional 25 ns. Figure 21d can be used to estimate the loading delay at different lumped capacitive loads.

Figure 21a. First Order Input Buffer for MBI Pins
Figure 21b. First Order Open-Drain Output Buffer for MBO Pins

In real systems, the loads are made up of lumped capacitance and transmission lines.
More accurate results can be obtained using transmission line models.

Driving Transmission Lines

Two Device Model

In this example the ICC bus is a signal line on an FR-4 printed circuit board. The line width is 6 mils. Line lengths of 12 inches and 18 inches are modeled. The FR-4 PC board has the following characteristics:

    resistivity = 0.6 mΩ/sq. (0.1Ω/inch for 6 mil width)
    inductance = 60 pH/sq. (10 nH/inch for 6 mil width)
    capacitance = 0.55 nF/sq. in. (3.3 pF/inch for 6 mil width)

The ICC bus is shared by two 82489DXs, one at each end. The ICC bus is modeled as a transmission line. For the simulation, only one of the 82489DXs is driving. A pull-up resistor of 2 KΩ is used at each end (1 KΩ equivalent value) as shown in Figure 21e.

Figure 21f shows the signals at each end of the 12 inch transmission line. Trace 1 is the waveform at the driven end and trace 2 is the signal at the receiving end of the line. The 2 ns delay between the two signals is the propagation delay (or flight time) through the 12 inch transmission line. It takes about 35 ns for the voltage to charge up to 2V.

Figure 21g shows the received signal with different line lengths and with additional lumped capacitance. Trace 1 is for 12 inch only. Trace 2 is for 12 inch with an additional 20 pF lumped capacitance to represent interconnect socket capacitance. Trace 3 is for 18 inch plus 20 pF. The presence of the 20 pF at each end of the 12-inch transmission line increases the delay time by 20 ns at 2V.

Figure 21c. ICC Bus Driving Lumped Capacitance
Figure 21d. 1 KΩ Pull-Up Driving Lumped Capacitance

Four Device Model

In this example (Figure 21h), the ICC bus is a 12-inch transmission line with four 82489DXs connected at 4 inch intervals. The loading at each junction consists of the MBI and MBO buffers and a 20 pF lumped capacitance. 2 KΩ pull-ups are at each end of the transmission line. As shown in Figure 21i, it takes more than 90 ns for the signal level at both ends to reach 2V.

One way to improve the low to high transition time is to use a stronger pull-up (smaller resistor value), which is possible using external line drivers with their larger current drive capabilities. Figure 21j shows the difference in output when the model is used with 300Ω pull-ups at each end of the transmission line.

Figure 21e. Unbuffered ICC Bus with Two 82489DX
Figure 21f. Waveform at Both Ends of 12" T-line

External Drivers/Buffered ICC Bus

The 82489DX has separate ICC Bus input (MBI) and output (MBO) pins that can be connected to external line drivers in systems that have appreciable loading on the ICC Bus or where modularity of the bus is needed. Figure 21k is a typical implementation using external drivers with tri-state outputs. Drivers such as the 74F125 or its equivalent can be used. The drivers should be placed as close to the MBO pins as possible. The input buffer on MBI is optional depending on the user's ICC Bus scheme.
The total delay through the drivers, buffers, transmission line, clock skews, etc. must be calculated to ensure that all the ICC bus timing requirements are met.

A hierarchical bus connection can also be used in applications that cannot afford a driver/buffer per unit and where bus loading is localized in cluster groups. Figure 21l shows such a connection, where each cluster group is connected directly and drivers are used to connect to other clusters. Each cluster group is assumed to be close together physically with small loading on the local ICC bus.

Figure 21g. Waveform at End of T-line with Load (traces: 12" with 2K, 12" with 2K + 20 pF, 18" with 2K + 20 pF)
Figure 21h. Four 82489DX Configuration
Figure 21i. Waveform for Four Devices on 12" T-line
Figure 21j. Waveform with Different Pull-Ups (traces: 300Ω each end, 2 KΩ each end)

Transmission Line Termination

As with all high speed designs, one has to consider transmission line effects on signals, especially clock signals. Even though the ICC bus clock, ICLK, is usually operated in the 10 MHz range, its short rise times mean that proper transmission line termination must still be considered. Figure 21m shows the ICLK waveform at the end of a 12 inch T-line when driven by a clock generator with and without series matched termination.

Series termination should not be used for the ICC bus data lines (MBO). The combination of the pull-up resistor and series resistor would degrade the output low voltage, Vol. For example, with a pull-up of 300Ω and a series termination of 50Ω at each end, the Vol voltage at the receiving end would be at 1.55V if the driving end is at 0.4V (see Figure 21n).

ICC Bus Operating Frequency

The 82489DX ICC bus has a design target of operating up to 16 MHz (62 ns period). As shown in the examples above, the MBO low-to-high transition times are strongly dictated by the loads and the pull-ups used. This will in turn affect the maximum operating frequency of ICLK. In general, the minimum period is the largest of 62 ns, the MBO-to-MBI low data time, and the MBO-to-MBI high data time.

    MBO-to-MBI low time  = ICLK skew + MBO valid low delay + T-line prop. delay
                           + ext. buffer delay + MBI setup time

    MBO-to-MBI high time = ICLK skew + MBO Hi-Z delay + pull-high time
                           + T-line prop. delay + ext. buffer delay + MBI setup time

    Maximum MBO valid low delay = 50 ns
    Maximum MBO Hi-Z delay      = 15 ns
    MBI minimum setup time      = 8 ns

In the example shown earlier where two 82489DXs are at each end of a 12-inch T-line with no other loads, the pull-high time to 2V is 35 ns (trace 1 in Figure 21g). If the ICLK skew is 2 ns, then this configuration can operate at a 62 ns period, or 16 MHz. If the same configuration has additional 20 pF loads at each end, then the pull-high time is 55 ns (trace 2 in Figure 21g).
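The period calculation above can be written out as a short Python sketch. The function and constant names are hypothetical; the delay constants are the maximum and minimum values quoted in this section, and the 2 ns T-line propagation delay used in the example is the flight time quoted earlier for the 12-inch line.

```python
MBO_VALID_LOW_DELAY_NS = 50  # maximum MBO valid low delay
MBO_HIZ_DELAY_NS = 15        # maximum MBO Hi-Z delay
MBI_SETUP_NS = 8             # MBI minimum setup time
DESIGN_MIN_PERIOD_NS = 62    # 16 MHz design target


def min_period_ns(iclk_skew, pull_high, prop_delay, ext_buffer=0):
    """Return the minimum ICLK period in ns for the given delays (all in ns)."""
    low_time = (iclk_skew + MBO_VALID_LOW_DELAY_NS + prop_delay
                + ext_buffer + MBI_SETUP_NS)
    high_time = (iclk_skew + MBO_HIZ_DELAY_NS + pull_high + prop_delay
                 + ext_buffer + MBI_SETUP_NS)
    return max(DESIGN_MIN_PERIOD_NS, low_time, high_time)


# Two devices on a 12-inch line, 2 ns skew, 2 ns flight time, no buffers.
unloaded = min_period_ns(iclk_skew=2, pull_high=35, prop_delay=2)
with_20pf = min_period_ns(iclk_skew=2, pull_high=55, prop_delay=2)
max_mhz = 1000.0 / with_20pf
```

With a 35 ns pull-high time both data times come out at 62 ns, so the design target is the limit; with 55 ns the high data time dominates and the period grows to 82 ns.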
The maximum frequency decreases to 12 MHz (82 ns period).

In the four device model discussed earlier, where the ICC Bus is unbuffered, the pull-high time is 90 ns (Figure 21j). The operating frequency will be less than 8 MHz (117 ns period). If external buffers are used (thereby allowing use of a 300Ω pull-up) and assuming the external buffers have delays of 10 ns, the operating frequency is limited by the MBO-to-MBI low time of 72 ns, or 14 MHz.

NOTE: Each application is unique in its configuration and loading on the ICC Bus. The above examples highlight some of the factors that need to be considered. It is important to do electrical simulation to ascertain whether the proposed implementation is viable before committing to the printed circuit board.

Figure 21k. External Driver/Buffer Implementation (Note: Input buffer is optional)
Figure 21l. ICC Bus: Hierarchical Implementation

Timing diagram: Register Access (Write). Signals shown: CLOCK, ADS, M/IO, D/C, W/R, A[10]-A[4], BGT, RDY (82489DX driven), DLE, DATA (write). Annotations: 82489DX write delay; stretch here for BGT delay.

Timing diagram: Register Access (Read). Signals shown: CLOCK, ADS, BGT, RDY (82489DX driven), DLE, DATA (read; data driven concurrently with RDY# when DLE is not used). Annotations: 82489DX read delay; stretch here for BGT delay; DLE delay.
Timing diagram: Interrupt Acknowledge. Figure A: Interrupt Acknowledge for External 8259. Figure B: Interrupt Acknowledge for 82489DX. Signals shown: CLKIN, PINT, ExtINTA, ADS, M/IO, D/C, W/R, CS, BGT, RDY (82489DX driven), DLE, Data (Read). Annotations: earliest next cycle; random data driven; INT vector (82489DX driven); interrupt acknowledge delay (other bus cycles could be injected here); stretch here for BGT delay; DLE delay; 82489DX read delay.

Timing diagram: Reset. Signals shown: CLOCK, A[10]-A[3] (MBC driven), RESET (minimum 2 clocks), ADS, PRST.
Timing diagram panel: ICC Bus Timing.

82489DX REGISTER READ TIMING

For discussion of this bus cycle, refer to Figure 24, 82489DX Timing Diagram 2. It shows the relationship of the three phases of the bus cycle. The dependent control and address signals are not shown, since they behave exactly as in the case of a register write. See the previous section for the description of the control and address phases of the bus cycle.

In the case of a read when DLE is used (Figure 24A), DLE works logically like an asynchronous output data enable. The 82489DX drives the data bus within time delay "A" after DLE is asserted, which must not occur before BGT. Note that even though the bus is being driven, the data only becomes valid during the clock cycle in which RDY is asserted. After that point, valid data is maintained on the bus as long as DLE remains asserted, after which the data bus returns to the high impedance state within time delay "B" of DLE deassertion. If DLE is asserted late, the 82489DX could complete its internal read cycle and return RDY early. In that case, the RDY active low state will be maintained until DLE is asserted.

In the case of a read when DLE is NOT used (Figure 24B), the data is driven for exactly one clock cycle, coincident with RDY being asserted.

DLE is sampled with the control signals to determine whether it is being used. If it is sampled in the asserted state when ADS is active, DLE will be considered not used, and its state during the remainder of the cycle does not matter. This is consistent with the notion of permanently tying this signal low when not used, as described in the previous section.

Indication of the end of the bus cycle is dependent on the use of DLE. When it is not used (i.e., permanently tied to ground), RDY indicates the end of the bus cycle, as it does in the case of a write access.
When it is used, DLE deassertion indicates the end of the cycle, since the 82489DX could be driving the bus well after RDY is deasserted. In either case, the 82489DX can accept the next ADS pulse any time after RDY has been asserted. However, note that if DLE is being used, the next ADS should be delayed until DLE can be safely sampled inactive. Note these two options for the earliest next cycle in the timing diagram.

INTERRUPT ACKNOWLEDGE TIMING

For discussion of this bus cycle, refer to Figure 25, 82489DX Timing Diagram 3. This cycle is the result of the 82489DX posting an interrupt to the processor by asserting PINT. After PINT is asserted, other bus cycles may occur before the interrupt acknowledge cycle.

PINT can be asserted for any external (8259) interrupt as shown in Figure 25A, or for an 82489DX generated interrupt as shown in Figure 25B. ExtINTA indicates whether PINT was asserted in response to an 8259 request or an 82489DX request. This signal is used by external control logic to either allow or preclude the 8259 from responding to the subsequent interrupt acknowledge cycle. Note that the ExtINTA pin is deactivated at the third clock after the ADS of the second INTA cycle.

When ExtINTA is high, the 82489DX will not respond to the acknowledge cycle other than to deassert the PINT signal and clear the pending external interrupt two full clock cycles after the ADS of the second INTA cycle, as shown in Figure 25A. Once PINT is asserted in response to an external interrupt, it can only be deasserted by an INTA cycle. An INTA cycle is recognized by the 82489DX as soon as the bus cycle definition is sampled with ADS in a low state. BGT is not needed in this case.

When ExtINTA is low, the 82489DX will respond to the acknowledge cycle, as shown in Figure 25B. In this case, external logic (e.g., the memory bus controller) is expected to prevent any attached 8259 from seeing the acknowledge cycle.
When PINT is asserted in response to an 82489DX internal interrupt, it can be deasserted by an INTA cycle. The PINT signal will be deasserted 5 full clocks after BGT of the second INTA cycle.

Note that ExtINTA is stable at all times while PINT is asserted. That means that even if new interrupts arrive between the time an interrupt is posted to the processor and the time the acknowledge occurs, the 82489DX will not change its commitment for an external (8259) or internal (82489DX) acknowledge cycle, regardless of priority. This also means that PINT may be raised for a high priority internal interrupt right after responding to the external interrupt. In any event, PINT will be kept low for a minimum of two clocks before reasserting itself.

The interrupt acknowledge cycle is indicated by the bus cycle definition signals all being low, and looks like two consecutive read cycles, except that there is no explicit address information. The actual content of the address pins during this cycle is processor dependent, and therefore there is no chip select either. Chip select is implied by a combination of the bus cycle definition signals (all low) and BGT.

Note that there is a "dummy" data phase in the first interrupt acknowledge cycle. This allows parity to be generated on the bus for processors like the i860XP. During this cycle, the 82489DX drives random data on the bus with the appropriate parity. The Interrupt Request Register is "frozen" and the highest priority 82489DX pending interrupt vector is returned to the processor in the second acknowledge cycle.

The second acknowledge cycle has a complete data phase with timings identical to those of an ordinary register read. The data returned is the vector of the highest priority internally pending 82489DX interrupt, or the spurious interrupt vector if there is no interrupt pending higher than the current processor priority. Note that the timing diagram shows DLE being used (sampled high during ADS).
However, just as in a normal read cycle, the option exists not to use DLE (i.e., to tie it permanently to ground).

RESET AND MISCELLANEOUS TIMING

For discussion of this bus cycle, refer to Figure 26, 82489DX Timing Diagram 4. It shows the 82489DX reset cycle, the timing of some related signals, and the ICC bus.

The RESET input has a setup and hold time to the system clock edge, CLKIN, as do other independently timed signals. The RESET signal resets the two mutually asynchronous subsystems on the chip, namely the ICC bus unit, which runs synchronously to ICLK, and all the other units, which run synchronously to the system clock, CLKIN. RESET must meet the minimum reset time with respect to both clocks, and there should be at least one ICLK rising edge during reset. The TAP controller should also be initialized.

During reset, an eight-bit 82489DX Local Unit ID can be optionally initialized. Eight address lines, A10-A3, are sampled on every clock edge while RESET is asserted. The last sample remains in the 82489DX Local Unit ID register after reset. Alternatively, the 82489DX Local Unit ID can be loaded with a register write as part of software initialization, before 82489DX operation is started. In any event, the register must be initialized before the 82489DX can communicate on the ICC bus, including sending and receiving RESET messages. All valid signals to the 82489DX should wait at least two full clocks after RESET is deasserted.

The PRST signal (reset output) is asserted either with the RESET input or under software control. Its on-off delay times are relative to the rising clock edge. The duration of PRST under software control is defined by the software itself. Also note that the PNMI pin has the same timing as PRST when the latter is software controlled.

The ICC bus signals are both input and output on each cycle. Setup, hold and delay times are all measured with respect to the ICC bus clock, ICLK, which has no relationship to the processor clock on which the remainder of the 82489DX runs.
This means that the ICC bus is independently sampled on each ICLK edge, as shown. It also implies that the largest possible hold time will not exceed the minimum delay time.

After reset, all 82489DX registers are reset to the "0" state. The mask bits in the local vector table and the redirection table are reset to the "1" state to mask out all interrupts. All reserved bits are permanently wired to the "0" state on chip.

10.0 BOUNDARY SCAN DESCRIPTION

The 82489DX implements the JTAG boundary scan standard. This feature allows the user to test the interconnections between the 82489DX and the external hardware once they have been assembled onto a printed circuit board or other substrate. In addition to the JTAG mandatory instruction set, the 82489DX also provides the INTEST instruction, which allows static testing of the on-chip logic. Detailed information on the JTAG standard can be obtained from the reference document IEEE Standard Test Access Port and Boundary Scan Architecture (IEEE Std 1149.1-1990).

10.1 Boundary Scan Architecture

The boundary scan logic contains the following elements:

- Five Test Access Port (TAP) pins: They are labeled trst, tck, tdi, tdo and tms. All are input pins except tdo, which is a tri-state output pin.
- A TAP Controller: This logic controls the boundary scan activity.
- 82489DX Device ID Register (DID): This is a 32-bit read-only register. The DID can be shifted out in ascending order to the tdo pin.
- JTAG Instruction Register (IR): This is a 4-bit register which accepts the instruction code shifted in from the tdi pin. The opcode stored in the IR is used to control operation.
- Boundary Scan Register: This is a 137-stage scan path which connects almost all 82489DX signal pins for boundary scan purposes.
- Bypass Register: This register simply allows the data which goes into the tdi pin to be shifted out directly from tdo.
The following block diagram illustrates the implementation of the JTAG architecture in the 82489DX design.

Figure 27. Block Diagram of the JTAG Architecture (blocks shown: boundary scan register, JTAG ID register, bypass register, instruction register, and TAP controller, connected to the tdi, tms and trst pins)

Test Access Ports

trst  TAP controller master reset pin. When trst is low, the TAP controller's state machine is reset to the "test-logic-reset" state asynchronously. This pin is tied to a weak internal pull-up to keep it at a logical 1 when not driven.

tck  This is the test logic clock. The test logic changes state on the rising edge of tck.

tdi  Test data input. Data is shifted into the tdi pin on the rising edge of tck. This pin is tied to a weak internal pull-up to keep it at a logical 1 when not driven.

tms  Test mode select. This pin is used to select the state of the TAP controller. This pin is synchronous to the rising edge of tck. This pin is tied to a weak internal pull-up to keep it at a logical 1 when not driven.

tdo  Test data output. This is a tri-state pin which allows the data to be shifted out.

TAP CONTROLLER

The TAP controller in the 82489DX is implemented to conform to the IEEE 1149.1 standard. The TAP controller is a single phase clock, synchronous finite state machine. It controls the sequence of operations of the test logic. The value of the test mode select (tms) pin at a rising edge of tck controls the sequence of the state changes. The state diagram for the TAP controller is shown in Figure 28. Test designers must consider the operation of the state machine in order to drive the correct sequence of values on tms.

The behavior of the TAP controller and other test logic in each of the controller states is briefly described as follows.

Figure 28. TAP Controller State Diagram

TEST-LOGIC-RESET

The test logic is disabled so that normal operation of the on-chip system logic (i.e., in response to stimuli received through the system pins only) can continue unhindered. This is achieved by initializing the instruction register to contain the IDCODE instruction. No matter what the original state of the controller, it will enter Test-Logic-Reset when tms is held high for at least five rising edges of tck. The controller remains in this state while tms is high.

If the controller should leave the Test-Logic-Reset controller state as a result of an erroneous low signal on the tms line at the time of a rising edge on tck (for example, a glitch due to external interference), it will return to the Test-Logic-Reset state following three rising edges of tck with the tms line at the intended high logic level. The operation of the test logic is such that no disturbance is caused to on-chip system logic operation as the result of such an error. On leaving the Test-Logic-Reset controller state, the controller moves into the Run-Test/Idle controller state, where no action will occur because the current instruction has been set to select operation of the device identification register. The test logic is also inactive in the Select-DR-Scan and Select-IR-Scan controller states.

Note that the TAP controller will also be forced to the Test-Logic-Reset controller state asynchronously by applying a low logic level at trst.

RUN-TEST/IDLE

A controller state between scan operations. Once entered, the controller will remain in the Run-Test/Idle state as long as tms is held low. When tms is high and a rising edge is applied at tck, the controller moves to the Select-DR-Scan state.

In the Run-Test/Idle controller state, activity in selected test logic occurs only when certain instructions are present, such as RUNBIST. Since the 82489DX does not have a RUNBIST instruction, this state acts as an idle state.
The instruction does not change while the TAP controller is in this state.

SELECT-DR-SCAN

This is a temporary controller state in which all test data registers selected by the current instruction retain their previous state (the 82489DX has one test data register, which is the boundary scan shift register path). If tms is held low and a rising edge is applied to tck when the controller is in this state, then the controller moves into the Capture-DR state and a scan sequence for the selected test data register is initiated. If tms is held high and a rising edge is applied to tck, the controller moves on to the Select-IR-Scan state.

The instruction does not change while the TAP controller is in this state.

SELECT-IR-SCAN

This is a temporary controller state in which all test data registers selected by the current instruction retain their previous state. If tms is held low and a rising edge is applied to tck when the controller is in this state, then the controller moves into the Capture-IR state and a scan sequence for the instruction register is initiated. If tms is held high and a rising edge is applied to tck, the controller returns to the Test-Logic-Reset state.

The instruction does not change while the TAP controller is in this state.

CAPTURE-DR

In this controller state data may be parallel-loaded into test data registers selected by the current instruction on the rising edge of tck. If a test data register selected by the current instruction does not have a parallel input, or if capturing is not required for the selected test, then the register retains its previous state unchanged.

The instruction does not change while the TAP controller is in this state.

When the TAP controller is in this state and a rising edge is applied to tck, the controller enters either the Exit 1-DR state if tms is held at 1 or the Shift-DR state if tms is held at 0.

SHIFT-DR

In this controller state, the test data register connected between tdi and tdo as a result of the current instruction shifts data one stage towards its serial output on each rising edge of tck. Test data registers that are selected by the current instruction, but are not placed in the serial path, retain their previous state unchanged.

The instruction does not change while the TAP controller is in this state.

When the TAP controller is in this state and a rising edge is applied to tck, the controller enters either the Exit 1-DR state if tms is held at 1 or remains in the Shift-DR state if tms is held at 0.

EXIT 1-DR

This is a temporary controller state. If tms is held high, a rising edge applied to tck while in this state causes the controller to enter the Update-DR state, which terminates the scanning process. If tms is held low and a rising edge is applied to tck, the controller enters the Pause-DR state.

All test data registers selected by the current instruction retain their previous state unchanged. The instruction does not change while the TAP controller is in this state.

PAUSE-DR

This controller state allows shifting of the test data register in the serial path between tdi and tdo to be temporarily halted. All test data registers selected by the current instruction retain their previous state unchanged.

The controller remains in this state while tms is low. When tms goes high and a rising edge is applied to tck, the controller moves on to the Exit 2-DR state.

The instruction does not change while the TAP controller is in this state.

EXIT 2-DR

This is a temporary controller state. If tms is held high and a rising edge is applied to tck while in this state, the scanning process terminates and the TAP controller enters the Update-DR controller state. If tms is held low and a rising edge is applied to tck, the controller enters the Shift-DR state.

All test data registers selected by the current instruction retain their previous state unchanged.
The instruction does not change while the TAP controller is in this state.

UPDATE-DR

Some test data registers may be provided with a latched parallel output to prevent changes at the parallel output while data is shifted in the associated shift-register path in response to certain instructions (e.g., EXTEST, INTEST, and RUNBIST). Data is latched onto the parallel output of these test data registers from the shift-register path on the falling edge of tck in the Update-DR controller state. The data held at the latched parallel output should not change other than in this controller state unless operation during the execution of a self test is required (e.g., during the Run-Test/Idle controller state in response to a design-specific public instruction).

All shift-register stages in test data registers selected by the current instruction retain their previous state unchanged. The instruction does not change while the TAP controller is in this state.

When the TAP controller is in this state and a rising edge is applied to tck, the controller enters either the Select-DR-Scan state if tms is held at 1 or the Run-Test/Idle state if tms is held at 0.

CAPTURE-IR

In this controller state the shift-register contained in the instruction register loads a pattern of fixed logic values on the rising edge of tck. In addition, design-specific data may be loaded into shift-register stages that are not required to be set to fixed values.

Test data registers selected by the current instruction retain their previous state. The instruction does not change while the TAP controller is in this state.

When the TAP controller is in this state and a rising edge is applied to tck, the controller enters either the Exit 1-IR state if tms is held at 1 or the Shift-IR state if tms is held at 0.
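The tms-driven transitions described in this section can be collected into a single lookup table, which makes it easy to check a planned tms sequence before driving it on hardware. The sketch below is an illustrative model only: the table and function names (TAP_NEXT, step, run) are hypothetical, the state names are written without the space in "Exit 1", the instruction-register states (Shift-IR through Update-IR) are filled in from the descriptions that follow and from the IEEE 1149.1 state diagram, and the asynchronous trst reset is not modeled.

```python
# Next-state table for the 16 TAP controller states.
# Each entry maps a state to (next state if tms = 0, next state if tms = 1),
# evaluated on a rising edge of tck.
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}


def step(state: str, tms: int) -> str:
    """Return the state entered after one rising edge of tck."""
    return TAP_NEXT[state][1 if tms else 0]


def run(state: str, tms_bits) -> str:
    """Apply a sequence of tms values, one per rising edge of tck."""
    for tms in tms_bits:
        state = step(state, tms)
    return state


if __name__ == "__main__":
    # Five clocks with tms high reach Test-Logic-Reset from every state.
    assert all(run(s, [1] * 5) == "Test-Logic-Reset" for s in TAP_NEXT)
    # tms sequence 0, 1, 1, 0, 0 from reset walks to Shift-IR.
    print(run("Test-Logic-Reset", [0, 1, 1, 0, 0]))
```

In this model a single erroneous low on tms in Test-Logic-Reset moves the controller to Run-Test/Idle, and three further clocks with tms high bring it back, which agrees with the recovery behavior stated for the Test-Logic-Reset state.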
SHIFT-IR

In this controller state the shift-register contained in the instruction register is connected between tdi and tdo and shifts data one stage towards its serial output on each rising edge of tck.

Test data registers selected by the current instruction retain their previous state. The instruction does not change while the TAP controller is in this state.

When the TAP controller is in this state and a rising edge is applied to tck, the controller enters either the Exit 1-IR state if tms is held at 1 or remains in the Shift-IR state if tms is held at 0.

EXIT 1-IR

This is a temporary controller state. If tms is held high, a rising edge applied to tck while in this state causes the controller to enter the Update-IR state, which terminates the scanning process. If tms is held low and a rising edge is applied to tck, the controller enters the Pause-IR state.

Test data registers selected by the current instruction retain their previous state. The instruction does not change while the TAP controller is in this state and the instruction register retains its state.

PAUSE-IR

This controller state allows shifting of the instruction register to be halted temporarily.

Test data registers selected by the current instruction retain their previous state. The instruction does not change while the TAP controller is in this state and the instruction register retains its state.

The controller remains in this state while tms is low. When tms goes high and a rising edge is applied to tck, the controller moves on to the Exit 2-IR state.

EXIT 2-IR

This is a temporary controller state. If tms is held high and a rising edge is applied to tck while in this state, the scanning process terminates and the TAP controller enters the Update-IR controller state. If tms is held low and a rising edge is applied to tck, the controller enters the Shift-IR state.

Test data registers selected by the current instruction retain their previous state.
The instruction does not change while the TAP controller is in this state and the instruction register retains its state.

UPDATE-IR

The instruction shifted into the instruction register is latched onto the parallel output from the shift-register path on the falling edge of tck in this controller state. Once the new instruction has been latched, it becomes the current instruction.

Test data registers selected by the current instruction retain their previous state.

When the TAP controller is in this state and a rising edge is applied to tck, the controller enters the Select-DR-Scan state if tms is held at 1 or the Run-Test/Idle state if tms is held at 0.

INSTRUCTION REGISTER

The function of the instruction register is to select the operating mode of the test logic, for instance to read the ID register or to capture the 82489DX output signals. The 82489DX has implemented 4 instructions.

Instruction      Mandatory/Optional   Opcode
bypass           m                    1111
extest           m                    0000
sample/preload   m                    0001
idcode           m                    0010
reserved         o                    1001

Bypass Instruction

The bypass instruction selects the bypass register to be connected between tdi and tdo, effectively bypassing the test logic on the 82489DX boundary scan path and reducing the shift length to one bit. Note that an open circuit fault in the board level test data path will cause the bypass register to be selected following an instruction scan cycle, due to the internal pull-up on the tdi pin. This has been done to prevent any unwanted interference with the proper operation of the system logic.

Extest Instruction

The extest instruction allows testing of circuitry external to the component package, typically board interconnects. It does so by driving the values loaded into the 82489DX's boundary scan register out on the output pins corresponding to each boundary scan cell, and by capturing the values on the 82489DX's input pins to be loaded into their corresponding boundary scan register locations. I/O pins are selected as input or output depending on the value loaded into the output control cell. Values shifted into the input latches in the boundary scan register are never used by the internal logic of the 82489DX.

NOTE: The 82489DX must be reset after the extest instruction has been executed.

Sample/Preload Instruction

The sample/preload instruction performs two functions. When the TAP controller is in the Capture-DR state, the sample/preload instruction allows a snapshot of the normal operation of the 82489DX to be taken without interfering with that normal operation. The instruction causes boundary scan register cells associated with outputs to sample the value being driven by the 82489DX, and cells associated with inputs to sample the value being driven into the 82489DX. On both outputs and inputs the sampling occurs on the rising edge of tck. When the TAP controller is in the Update-DR state, the sample/preload instruction preloads data into the 82489DX pins to be driven to the board by executing the extest instruction. Data is preloaded to the pins from the boundary scan register on the falling edge of tck.

Idcode Instruction

The idcode instruction selects the device identification register to be connected between tdi and tdo, allowing the device ID code to be shifted out of the device on tdo. Note that the bit stream shifted into tdi will appear on tdo after all 32 bits of the DID have been shifted out.

DEVICE IDENTIFICATION REGISTER (DID)

The device identification is a 32-bit number which can be read by the external hardware by using the idcode instruction. The 82489DX device ID is assigned to 1489A013 (hex). This is subject to change. The upper 4 bits of the DID may be changed for different versions. The 16-bit number (bit 27-bit 12) 489A (hex) is the part ID. The lower 12 bits are the manufacturer ID for Intel, which must be 013 (hex).

BOUNDARY SCAN REGISTER

The 82489DX has only one test data register, i.e., the boundary scan register.
The boundary scan register is a single shift register path containing the boundary scan cells that are connected to all signal input and output pins of the 82489DX. There are three generic types of boundary scan cells: input, output, and bi-directional. For each input-only cell, one stage of shift register is added to the boundary scan path.

All output pins become tri-stateable when boundary scan is activated, regardless of whether they are tri-stateable in normal operation. To explain further, the user enables or disables an output driver with a specific tri-state control cell in the scan path. The user must shift in a proper control signal for these tri-state control cells in the scan path.

Figure 29. Logical Structure of Boundary Scan Register (scan path from tdi to tdo through the input pin, output pin and I/O pin cells surrounding the internal logic)

BOUNDARY SCAN CELL NAMES IN ORDER FROM tdi TO tdo

The following table is a list of the boundary scan cell names in the order from tdi to tdo. The type information indicates the purpose of the cells.

I = input only cell
B = bi-directional cell
T = tri-state output cell
C = tri-state control cell. Note that the signal name enclosed within the parentheses is controlled by this cell.
Cell Number  Type  Name                    Pin #
1            I     CLKIN                   57
2            I     TMBASE                  59
3            I     ICLK                    60
4            I     D/C                     61
5            I     W/R                     62
6            I     M/IO                    63
7            I     ADS                     64
8            I     RESET                   65
9            I     BGT                     66
10           I     reserved                70
11           I     reserved                71
12           I     reserved                72
13           I     DLE                     73
14           I     CS                      74
15           I     reserved                75
16           I     MBI3                    76
17           I     MBI2                    77
18           I     MBI1                    78
19           I     MBI0                    79
20           I     LINTIN1                 80
21           I     LINTIN0                 81
22           T     reserved
23           C     (reserved)
24           I     INTIN15                 82
25           I     INTIN14                 83
26           T     reserved
27           C     (reserved)
28           I     INTIN13                 84
29           I     INTIN12                 85
30           T     reserved
31           C     (reserved)
32           I     INTIN11                 86
33           I     INTIN10                 87
34           T     reserved
35           C     (reserved)
36           I     INTIN9                  88
37           I     INTIN8                  89
38           T     reserved
39           C     (reserved)
40           I     INTIN7                  90
41           I     INTIN6                  91
42           T     reserved
43           C     (reserved)
44           I     INTIN5                  92
45           I     INTIN4                  93
46           T     reserved
47           C     (reserved)
48           I     INTIN3                  94
49           I     INTIN2                  95
50           T     reserved
51           C     (reserved)
52           I     INTIN1                  96
53           I     INTIN0                  97
54           I     reserved
55           B     DP3                     101
56           C     (DP[3:0])
57           B     DP2                     102
58           B     DP1                     103
59           B     DP0                     104
60           T     reserved
61           C     (reserved)
62           B     D31                     105
63           C     (D[31:0])
64           B     D30                     107
65           B     D29                     109
66           B     D28                     110
67           T     reserved
68           C     (reserved)
69           B     D27                     111
70           B     D26                     112
71           T     reserved
72           C     (reserved)
73           B     D25                     114
74           B     D24                     115
75           B     D23                     116
76           T     reserved
77           C     (reserved)
78           B     D22                     118
79           B     D21                     119
80           B     D20                     121
81           B     D19                     122
82           B     D18                     123
83           B     D17                     124
84           B     D16                     125
85           B     D15                     128
86           B     D14                     129
87           B     D13                     130
88           B     D12                     131
89           B     D11                     2
90           T     reserved
91           C     (reserved)
92           B     D10                     3
93           B     D9                      4
94           T     reserved
95           C     (reserved)
96           B     D8                      7
97           B     D7                      8
98           B     D6                      9
99           B     D5                      11
100          B     D4                      12
101          B     D3                      13
102          B     D2                      14
103          B     D1                      16
104          B     D0                      18
105          I     reserved                19
106          B     reserved                20
107          C     (cell 106, A[10:3])
108          B     A10                     21
109          B     A9                      22
110          B     A8                      24
111          B     A7                      26
112          B     A6                      27
113          B     A5                      28
114          B     A4                      29
115          B     A3                      31
116          T     reserved                34
117          C     (reserved)
118          T     PINT                    35
119          C     (PINT)
120          T     PNMI                    37
121          C     (PNMI)
122          T     PRST                    38
123          C     (PRST)
124          T     ExtINTA                 41
125          C     (reserved)
126          T     reserved                42
127          C     (reserved)
128          T     RDY                     43
129          C     (RDY)
130          T     MBO3                    45
131          C     (MBO3)
132          T     MBO2                    48
133          C     (MBO2)
134          T     MBO1                    49
135          C     (MBO1)
136          T     MBO0                    51
137          C     (MBO0)

BYPASS REGISTER

The bypass register is simply a 1-bit shift register connected between tdi and tdo. When it is selected by using the bypass instruction, the data shifted into tdi is shifted out from tdo one tck clock later.

JTAG TAP Controller Initialization

The TAP controller must be reset to the test-logic-reset state when the 82489DX is first powered up. There are two ways to reset the TAP controller:

1. Assert trst to 0. This resets the TAP controller asynchronously.
2. Assert tms to 1 and clock the TAP controller at least five times. The TAP controller will be reset after the fifth rising edge of tck.

After reset, the idcode instruction is loaded into the IR automatically. Note that the tms and trst pins both have an internal weak pull-up device to keep them at a logic 1 level. Therefore the user can simply apply 5 clocks at the tck input to reset the TAP controller.

If the TAP controller is not reset properly, the 82489DX may not function, because the boundary scan logic might be active, which will affect the flow of signals into and out of the chip.

11.0 ELECTRICAL CHARACTERISTICS

NOTICE: This data sheet contains information on products in the sampling and initial production phases of development. The specifications are subject to change without notice.
Verify with your local Intel Sales office that you have the latest data sheet before finalizing a design.

11.1 D.C. Specifications

ABSOLUTE MAXIMUM RATINGS*

Case Temperature Under Bias ................... -65°C to +110°C
Storage Temperature ........................... -65°C to +150°C
Voltage on Any Pin with Respect to Ground ..... -0.5V to VCC + 0.5V

*WARNING: Stressing the device beyond the "Absolute Maximum Ratings" may cause permanent damage. These are stress ratings only. Operation beyond the "Operating Conditions" is not recommended and extended exposure beyond the "Operating Conditions" may affect device reliability.

VCC = 5V ±5%; TC = 0°C to +85°C

Symbol   Parameter                      Min     Max         Units   Notes
VIL      Input LOW Voltage (TTL)        -0.3    +0.8        V
VIH      Input HIGH Voltage (TTL)       2.0     VCC + 0.3   V
VOL      Output LOW Voltage (TTL)               +0.45       V       (Note 1)
VOH      Output HIGH Voltage (TTL)      2.4                 V       (Note 2)
ICC      33 MHz Power Supply Current            200         mA
ILI      Input Leakage Current                  15          µA
ILL      Input Leakage Current                  -600        µA      (Note 5)
ILH      Output Leakage Current                 600         µA      (Note 4)
ILO      Output Leakage Current                 15          µA      (Note 3)
CIN      Input Capacitance                      3           pF
CO       I/O or Output Capacitance              6           pF
CCLKIN   Clock Capacitance                      3           pF
IMLO     ICC Bus Output Low Current     4                   mA      (Note 6)
CMC      ICC Bus Total Capacitance              100         pF
VMH      ICC Bus Input High (TTL)       2.0     VCC + 0.3   V
VML      ICC Bus Input Low (TTL)        -0.3    +0.8        V

NOTES:
1. This parameter is measured with a current load of 4 mA.
2. This parameter is measured with a current load of 1.0 mA.
3. This parameter is for output without pulldown.
4. This parameter is for tri-state output with pulldown and VOH = 3.0V.
5. This parameter is for input with pullup at VIL = 0V.
6. ICC bus output low current is measured at 0.6V.

11.2 A.C. Specifications

A.C. Parameters Referencing 33 MHz System Clock

VCC = 5V ±5%; TC = 0°C to +85°C

Symbol  Parameter                                             Ref. Fig.  Load (pF)  Min (ns)      Max (ns)  Notes
tc      CLKIN Period                                          30                    30            100       (Note 1)
t1      CLKIN High Time                                       30                    5
t2      CLKIN Low Time                                        30                    5
t3      CLKIN Rise Time                                       30                                  3         (Note 2)
t4      CLKIN Fall Time                                       30                                  3         (Note 2)
t5      ADS, BGT, DLE, M/IO, D/C, W/R, CS Setup Time          31                    8
t6      D31-D0, DP3-DP0, A9-A3 Setup Time                     31                    8
t8      ADS, BGT, DLE, M/IO, D/C, W/R, CS Hold Time           31                    5
t10     D31-D0, DP3-DP0, A9-A3 Hold Time                      31
t11     D31-D0, DP3-DP0 Valid Delay                           30         50
t12     D31-D0, DP3-DP0 Low-Z Delay When DLE is Not Used      32         50         5             18        (Note 7)
t13     D31-D0, DP3-DP0 High-Z Delay When DLE is Not Used     32         50         3             14        (Note 7)
t14     D31-D0, DP3-DP0 Enable Delay When DLE is Used         33         50         3             42
t15     D31-D0, DP3-DP0 Disable Delay When DLE is Used        33         50         3             14
t20     RDY Valid Delay                                       30         50         3             18
t21     PRST, PNMI, PINT Valid Delay                          30         50         3             34
t22     RESET Setup Time                                      31                    8                       (Note 5)
t23     RESET Hold Time                                       31                    5                       (Note 5)
        RESET Cycle Time                                                            5 tc, 1 tic             (Note 3)
t24     INTIN[15:0], LINTIN[1:0] Low Time                                           10                      (Note 6)

All parameters are given in nanoseconds. TTL level timing is measured at 1.5V for both "0" and "1" levels.

NOTES:
1. The ICC bus clock ICLK period must be at least 5 ns longer than the system clock CLKIN period for proper synchronization of the internal asynchronous signals.
2. System clock CLKIN is measured from 0.8V to 2.0V.
3. The minimum RESET cycle is the greater of the two cycle times.
4. The minimum pulse width must be met for a valid level to be attained on the DATA or ADDRESS output.
5. Setup and hold time is required for RESET to start at the next rising edge of the clock.
6. INTIN and LINTIN low time is measured from 1.5V of the falling edge to 1.5V of the rising edge.
7. Not 100% tested. Guaranteed by design characterization.

Time Base A.C. Parameters

VCC = 5V ±5%; TC = 0°C to +85°C

Symbol  Parameter          Ref. Fig.  Min (ns)  Max (ns)  Notes
tmc     TMBASE Period      35         40        10000
t30     TMBASE High Time   35         10
t31     TMBASE Low Time    35         10
t32     TMBASE Rise Time   35                   8
t33     TMBASE Fall Time   35                   8

TAP Controller A.C. Parameters

VCC = 5V ±5%; TC = 0°C to +85°C

Symbol  Parameter                      Ref. Fig.  Min (ns)  Max (ns)  Notes
ttc     TCK Period                     35         40        1000
t50     TCK High Time                  35         10
t51     TCK Low Time                   35         10
t52     TCK Rise Time                  35                   8
t53     TCK Fall Time                  35                   8
t54     TDI, TMS, TRST Setup Time      34         10
t55     TDI, TMS, TRST Hold Time       34         5
t56     TDO Valid Delay                34         5         24        (Note 1)
t57     Output Delay in EXTEST Mode    34         5         27        (Note 1)
t58     TRST Minimum Low Time                     10                  (Note 2)

All parameters are given in nanoseconds. TTL level timing is measured at 1.5V for both "0" and "1" levels.

NOTES:
1. These parameters are specified for a 50 pF load.
2. This parameter is measured at 1.5V between the rising and falling edges.

A.C. Parameters for ICC Bus

VCC = 5V ±5%; TC = 0°C to +85°C

Symbol  Parameter                        Ref. Fig.  Min (ns)  Max (ns)  Notes
tic     ICLK Period                      35         60                  (Note 1)
t40     ICLK High Time                   35         20
t41     ICLK Low Time                    35         20
t42     ICLK Rise Time                   35                   10
t43     ICLK Fall Time                   35                   10
t44     MBI3-MBI0 Setup Time             36         8
t45     MBI3-MBI0 Hold Time              36         5
t46     MBO3-MBO0 Valid Low Delay        36                   50        (Note 2)
t47     MBO3-MBO0 Valid High-Z Delay     36         5         15        (Note 3)
t48     MBO3-MBO0 Valid Low-Z Delay      36         12        25        (Note 3)

All parameters are given in nanoseconds. TTL level timing is measured at 1.5V for both "0" and "1" levels.

NOTES:
1. MBI3-0 and MBO3-0 timing is tested at 150 ns cycle time.
2. This parameter is specified for a 50 pF load.
3. Not 100% tested. Guaranteed by design characterization.

12.0 REGISTER SUMMARY

As far as the 82489DX architecture itself is concerned, 82489DX registers can be located at any 1-Kbyte boundary in either memory or I/O space. From a platform standard point of view, it is recommended to locate all 82489DX Local Units in memory space at address 0xFEE0-0000. It is further recommended that all 82489DX I/O Units also be located in memory space: I/O Unit 1 at address 0xFEC0-0000, I/O Unit 2 (if present) at address 0xFEC0-1000, and so on. Chip select for the 82489DX should be based on a full decode of address pins A31-A10.

All directly accessible 82489DX registers are 32 bits wide and are aligned at 128-bit boundaries.
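Since each register occupies one 128-bit (16-byte) slot and the register tables in this section give a 6-bit value for address bits 9:4, a register's byte address is the unit base plus that value shifted left by four. The sketch below illustrates the arithmetic only; the function names are hypothetical, LOCAL_UNIT_BASE is the recommended (not mandatory) location quoted above, and the register codes used in the example are taken from the Local Unit register table.

```python
LOCAL_UNIT_BASE = 0xFEE00000  # recommended Local Unit location from the text
REGISTER_STRIDE = 16          # registers are aligned at 128-bit (16-byte) boundaries


def register_address(base: int, index: int) -> int:
    """Byte address of a directly accessible register.

    base  -- unit base address, on a 1-Kbyte boundary
    index -- 6-bit register code carried on address bits 9:4
    """
    if base % 1024 != 0:
        raise ValueError("base must lie on a 1-Kbyte boundary")
    if not 0 <= index <= 0b111111:
        raise ValueError("register code must fit in address bits 9:4")
    return base + index * REGISTER_STRIDE


def register_index(address: int) -> int:
    """Recover the 6-bit register code from a byte address."""
    return (address >> 4) & 0x3F


if __name__ == "__main__":
    # Task Priority Register has code 001000 in the Local Unit table.
    print(hex(register_address(LOCAL_UNIT_BASE, 0b001000)))
```

With the recommended base, the Task Priority Register (code 001000) falls at 0xFEE00080 and the EOI Register (code 001011) at 0xFEE000B0.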
The register being accessed is determined by bits 4 through 9 of the address, as listed in the tables below. Addresses not listed are reserved by the architecture. The tables also show whether the register is readable and/or writable by software, and what the side effects are of software accessing the register.

After reset, all registers are initialized to all zeroes, with the following exceptions. The Local Unit ID field is initialized with data present on the 8 LSB address pins. The Mask bit is initialized to 1 ("masked" state) in all entries in both the Local Vector Table and the Redirection Table.

For the I/O Unit, only the I/O Register Select and I/O Window registers are directly accessible in the address space. The other I/O Unit registers are accessed indirectly through the Select and Window registers.

82489DX I/O Unit Registers

Register                 Address (9:4)   SW   Side Effects
I/O Register Select      000000          W
I/O Window Register      000001

Register                          I/O Reg Select (7:0)   SW   Side Effects
I/O Unit ID Register              00000000               rw
Version Register                  00000001               r
Redirection Table [0] (31:0)      00010000               rw
Redirection Table [0] (63:32)     00010001               rw
Redirection Table [1] (31:0)      00010010               rw
Redirection Table [1] (63:32)     00010011               rw
Redirection Table [2] (31:0)      00010100               rw
Redirection Table [2] (63:32)     00010101               rw
Redirection Table [3] (31:0)      00010110               rw
Redirection Table [3] (63:32)     00010111               rw
Redirection Table [4] (31:0)      00011000               rw
Redirection Table [4] (63:32)     00011001               rw
Redirection Table [5] (31:0)      00011010               rw
Redirection Table [5] (63:32)     00011011               rw
Redirection Table [6] (31:0)      00011100               rw
Redirection Table [6] (63:32)     00011101               rw
Redirection Table [7] (31:0)      00011110               rw
Redirection Table [7] (63:32)     00011111               rw
Redirection Table [8] (31:0)      00100000               rw
Redirection Table [8] (63:32)     00100001               rw
Redirection Table [9] (31:0)      00100010               rw
Redirection Table [9] (63:32)     00100011               rw
Redirection Table [10] (31:0)     00100100               rw
Redirection Table [10] (63:32)    00100101               rw
Redirection Table [11] (31:0)     00100110               rw
Redirection Table [11] (63:32)    00100111               rw
Redirection Table [12] (31:0)     00101000               rw
Redirection Table [12] (63:32)    00101001               rw
Redirection Table [13] (31:0)     00101010               rw
Redirection Table [13] (63:32)    00101011               rw
Redirection Table [14] (31:0)     00101100               rw
Redirection Table [14] (63:32)    00101101               rw
Redirection Table [15] (31:0)     00101110               rw
Redirection Table [15] (63:32)    00101111               rw

82489DX LOCAL UNIT REGISTERS

Register                            Address (9:4)   SW   Side Effects
Local Unit ID Register              000010          rw
Version Register                    000011          r
Reserved                            000100
Reserved                            000101
Reserved                            000110
Reserved                            000111
Task Priority Register              001000          rw   mask intr dispense
Reserved                            001001
Reserved                            001010
EOI Register                        001011          rw   prioritization cycle
Remote Register                     001100          r
Logical Destination Reg.            001101          rw
Destination Format Reg.             001110          rw
Spurious Vector Register            001111          rw
ISR (31:0)                          010000          r
ISR (63:32)                         010001          r
ISR (95:64)                         010010          r
ISR (127:96)                        010011          r
ISR (159:128)                       010100          r
ISR (191:160)                       010101          r
ISR (223:192)                       010110          r
ISR (255:224)                       010111          r
TMR (31:0)                          011000          r
TMR (63:32)                         011001          r
TMR (95:64)                         011010          r
TMR (127:96)                        011011          r
TMR (159:128)                       011100          r
TMR (191:160)                       011101          r
TMR (223:192)                       011110          r
TMR (255:224)                       011111          r
IRR (31:0)                          100000          r
IRR (63:32)                         100001          r
IRR (95:64)                         100010          r
IRR (127:96)                        100011          r
IRR (159:128)                       100100          r
IRR (191:160)                       100101          r
IRR (223:192)                       100110          r
IRR (255:224)                       100111          r
Intrpt Comnd Reg. (31:0)            110000          rw   send interrupt
Intrpt Comnd Reg. (63:32)           110001          rw
Local Vector Table [timer]          110010          rw
Reserved                            110011
Reserved                            110100
Local Vector Table [local int 0]    110101          rw
Local Vector Table [local int 1]    110110          rw
Reserved                            110111
Initial Count Register              111000          rw
Current Count Register              111001          r
Reserved                            111010
Reserved                            111011
Reserved                            111100
Reserved                            111101
Divider Configuration Reg.          111110          rw
Reserved                            111111

NOTE: Address space 101000 to 101111 and 111111 are reserved.

13.0 TIMING DIAGRAMS

Figure 30. Output Waveform (signals: CLKIN, output signal)
Figure 31. Input Waveforms (signals: CLKIN, input signal)
Figure 32. Data Bus Tri-State Delays when DLE Sampled Low with ADS (signals: CLK, D[31:0], DP[3:0])
Figure 33. Data Enable/Disable Delay when DLE is Sampled High with ADS (signals: D[31:0], DP[3:0])
Figure 34. TAP Signal Timings (signals: TCK; TDI, TMS, TRST; output signal, TDO)
Figure 35. TMBASE, ICLK, TCK Timing
Figure 36. ICC Bus Open-Drain Output Delay (signals: ICLK, MBI, MBO)

14.0 PACKAGE PIN-OUT

132-Lead PQFP Package Type KU (See Packaging Specification, Order Number 240800)

Figure: 132-Lead PQFP, Top View (pins shown include VSS, VCC, INTIN0 through INTIN10)

NOTE: See pin description section for appropriate pin-strapping of the reserved pins.

15.0 PACKAGE THERMAL SPECIFICATION

The 82489DX is specified for operation when the case temperature is within the range of 0°C to +85°C. The case temperature may be measured in any environment, to determine whether the device is within the specified operating range.

The PQFP case temperature should be measured at the center of the top surface opposite the pins, as shown in Figure below.
Figure: Plastic Quad Flat Pack (PQFP). Measure PQFP case temperature at center of top surface.

PQFP Package Thermal Characteristics

Thermal Resistance (°C/W)

                              Air Flow Rate (ft/min)
Parameter                     0      200    400    600    800    1000
θJC, Junction to Case         8      8      8      8      8      8
θJA, Junction to Ambient      32.5   25.5   20     18.5   16     15

NOTES:
1. Table above applies to 82489DX PQFP plugged into a socket or soldered directly into the board.
2. θJA = θJC + θCA.

16.0 GUIDELINES FOR 82489DX USERS

16.1 Initialization

This section outlines one possible initialization scenario. Other scenarios are certainly possible, and one would be selected as part of a platform standard initialization scheme. The intent of this section is to illustrate that the initialization support provided by the 82489DX is adequate to support MP (multiprocessor) system initialization.

Each 82489DX has a RESET input pin connected to a common Reset line. Upon system reset, this common reset line is activated, causing all the 82489DXs to go through reset. All 82489DX local units (note: only local units and not I/O units) latch their ID from their address bus on reset. The ID can be provided by the bus control agent based on slot number. The local units next assert their processor's Reset pin, holding the processor in reset, and then perform their internal reset, setting all registers to their initial state. The initial state of all 82489DX units (both local and I/O units) is "all masks set" and all local units disabled; registers are otherwise initialized to zero. Note that the PINT and PNMI output pins are in tri-state mode when the local unit is disabled.

After this, each 82489DX local unit deasserts its processor's Reset pin, allowing the processors to come out of reset, perform self test, and start executing initialization code. When connecting the PRST pin, note that whenever PRST is activated by the 82489DX, whether because of a software reset message or a hardware reset, the 82489DX itself is reset.
Care should be taken in warm-reset cases where only the processors need to be reset and not the interrupt controller. In brief, the usage of PRST depends on the system's requirements for the various resets.

Somewhere in this code sequence, the processors that are "alive" will enable their 82489DX local units and attempt to force all the other processors back into reset. Forcing the other processors into reset is performed by sending them the inter-processor interrupt with Destination Mode = "Physical", Delivery Mode = "Reset", Trigger Mode = "Level", Level = "1", and Destination Shorthand = "All Excl Self". Only the first processor to get the ICC bus will succeed in sending this signal and reset all other 82489DXs and their processors. The other processors are kept in reset until such time that an MP operating system decides they can become active again.

The only running processor next performs the rest of system initialization. Eventually, an MP operating system will be booted, at which time the operating system would send "deassert reset" inter-processor signals to activate the other processors in the system. A mechanism must be provided by the platform that allows the added processors to differentiate the very first reset from a subsequent one.

16.2 Compatibility

COMPATIBILITY LEVELS

The 82489DX can be used in conjunction with standard 8259A-style interrupt controllers to provide a range of compatibility levels.

At the lowest level is "PC shrink-wrap" compatibility. This level effectively creates a uniprocessor hardware environment within the MP platform, capable of booting and running DOS shrink-wrap software. In this mode, only the 8259A generates interrupts and the 82489DX becomes a virtual wire. The interrupt latency can be minimized by connecting the 8259A interrupt to the local unit directly.

The next level preserves the software compatible view of an 8259A but allows more than one processor to be active in the system.
This results in an asymmetrical arrangement, with one processor fielding all 8259A interrupts but with added inter-processor interrupt capability. In this mode, the 82489DX "merges" 8259A interrupts with inter-processor interrupts. Existing I/O drivers would be bound to the compatible CPU and interface directly with the 8259A.

At the next compatibility level, 8259A compatible drivers can be mixed with native 82489DX drivers. Devices can generate interrupts at either an 8259A or an 82489DX. This provides for partial symmetry as individual drivers migrate from the 8259A to native 82489DXs.

Another 8259A compatible point can be defined for MP systems. Each processor could have its own compatible 8259A controllers, allowing multiple processors to run compatible I/O drivers, but statically spreading the load across the available processors.

82489DX/8259A INTERACTION

The principle of compatible operation is very straightforward: the 82489DX(s) become a virtual wire connecting the 8259A's INT output through to the processor, while at the same time making the 8259A visible to the processor.

The two connection schemes described differ only in the number of 82489DXs (one or two) that are located in the path from the 8259A to the processor.

In the one-82489DX example illustrated in Figure 37, the INT output of the 8259A connects to one of the interrupt input pins of the 82489DX through edge generation logic. This could be an interrupt pin on the 82489DX's I/O unit or local unit; assume a local interrupt input is used. The Local Vector Table entry for the interrupt pin that connects to the 8259A is set up with a Delivery Mode of "ExtINT" and edge trigger mode. This indicates that the interrupt is generated by an external controller. The processor's INT pin connects to the 82489DX PINT pin. This setup enables the 82489DX local unit to detect assertions (up-edges) of the 8259A's INT output pin and pass them on to the processor's INT input.
The 82489DX asserts the ExtINTA pin along with (one clock prior to) the PINT pin to indicate an "8259" interrupt. When the processor performs its INTA cycle, the 82489DX itself does not respond other than by deasserting PINT to the processor. At the third clock after ADS in the second bus cycle of the INTA cycle, ExtINTA is deasserted. External logic should make use of the ExtINTA signal to make the INTA cycle visible to the 8259A, and the 8259A should provide the vector. At the same time, the local unit considers the external request as delivered and need not wait for the external 8259A's INT to be deasserted. A new up-edge must be generated on the 8259A INT pin before the local unit will assert the processor's INT pin on behalf of the 8259A. External edge generation logic should be used for this. Compatible software interacts directly with the 8259A.

The mechanism is essentially the same in the two-82489DX scheme. The difference is that the 8259A connects to an interrupt input pin of the 82489DX I/O unit in the I/O system. The Redirection Table entry for this pin is again programmed with an "ExtINT" Delivery Mode and the (single) 82489DX destination local ID corresponding to the compatible DOS processor. Capturing the up-edges of the 8259A's INT pin by the 82489DX local unit now involves sending messages from the 82489DX I/O unit to the 82489DX local unit via the ICC bus. The "virtual wire" now includes messages over the ICC bus.

Adding inter-processor ICC interrupts (or any other 82489DX generated interrupts) to the compatible operation is accomplished by having the 82489DX internally OR the 8259A's INT request with any 82489DX interrupt request. Before the 82489DX actually sends the interrupt signal to the processor, the 82489DX decides whether it does this for an 82489DX interrupt or on behalf of the external controller.
When the processor performs the corresponding INTA cycle, only the 82489DX knows whether it should respond with a vector or whether the external 8259A should. If the 82489DX needs to respond, then it will enable an externally implemented trap that prevents the 8259A from seeing the INTA cycle. If the 8259A needs to respond, then the 82489DX will not enable the INTA trap, and the INTA will be allowed to reach the 8259A. The 82489DX implements this by asserting its ExtINTA pin to indicate that the external 8259A should respond with the vector. The 82489DX local unit controls the INTA trap via its "ExtINTA" output pin; the 82489DX does not actually provide the trap itself.

Figure 37. Edge Logic (blocks shown: processor, "system" bus, bus control, PINT, data, 8259A equivalent functional block, I/O expansion bus)

82489DX/8259A DUAL MODE CONNECTION

In systems that can be booted either as a configuration with a compatible 8259A or without one, device interrupt lines are connected to both the Interrupt Request pins of the 8259A and the Interrupt Input pins of the 82489DX, with all interrupts masked either at the 82489DX or at the 8259A.

Some EISA and MicroChannel chip sets that include on-chip 8259As also have internally connected interrupt requests. For example, the 82357 (the ISP of the EISA chipset) generates timer and DMA chaining interrupts internally. These are not available as separate interrupts outside the ISP. In non-compatible mode the ISP timers are not used, since each local 82489DX unit provides its own timer. Therefore, the ISP's 8259A is configured to mask out all interrupts except the DMA chaining interrupt, which is configured in level-sensitive, auto EOI mode. This causes the 8259A's INT output to track the state of the internal DMA interrupt request. The 8259A's INT output is then connected to one of the 82489DX interrupt input pins programmed to generate a regular (i.e., not 82489DX "ExtINT") level-sensitive interrupt.
The ISP 8259A then no longer functions as an external interrupt controller; it has been logically disabled, and it needs no interrupt acknowledge or EOI. The INTA and EOI cycles occur only at the 82489DX.

It should be noted that the 82489DX accepts only active high level/edge interrupt inputs. External programmable logic should take care of any polarity reversal that may be needed in an EISA system for sharing of interrupts.

16.3 Hardware Guidelines

82489DX HARDWARE STATE ON RESET

The 82489DX enters the reset state either through a hardware reset or through a software Reset message received on the ICC bus. On reset, the 82489DX is disabled. The following is the hardware state of the 82489DX after reset.

PRST    Active (HIGH)
PNMI    Tri-stated (internal pull-down provided)
PINT    Tri-stated (internal pull-down provided)

The 82489DX is disabled on reset and, unless specifically enabled, it does not start its interrupt mechanism. The difference between a hardware reset and a software reset message is that during hardware reset the 82489DX samples the address bus and stores the last sample in the Local Unit ID, whereas for software reset it does not sample and store the unit ID. In addition, the hardware reset pulse should be wide enough to accommodate at least one rising and one falling edge of ICLK. On hardware reset ExtINTA is held high.

PULL UP AND PULL DOWN RESISTORS

PNMI and PINT are tri-stated at power on, and they are maintained in tri-state condition until the unit is enabled. Even though an internal pull-down resistor is provided on PNMI and PINT, an additional external pull-down resistor may be needed depending upon the loading on these pins by external logic. The D.C. characteristics give the specification from which the value of the resistor, if needed, can be calculated. It should be kept in mind that the ICC bus, being an electrically open-drain bus, requires pull-up resistors at the MBO pins. ICC bus output low current is just 4 mA.

PINT and ExtINTA Timings

It should be noted that for ExtINTA type of interrupts, PINT gets activated one clock after ExtINTA gets activated. When getting deactivated, both PINT and ExtINTA get deactivated at the same clock edge.

ExtINTA Timings

In the interrupt acknowledge cycle for external interrupt control, the 82489DX asserts ExtINTA. It decodes the type of cycle from CPU control signals such as M/IO, D/C and W/R. The CPU performs two back-to-back bus cycles for the interrupt acknowledge cycle. The 82489DX maintains ExtINTA active throughout the first cycle. For the next cycle (when the vector will be given by the external 8259), the 82489DX deactivates ExtINTA after it senses the start of the cycle (by ADS). External control logic may be inserting wait states to match the 8259 timings. Since the 82489DX has no way of finding out when the cycle completes, it deasserts ExtINTA before the second bus cycle is completed. This should be kept in mind while using ExtINTA for external interrupt control logic.

82489DX AND MEMORY MAPPING

The 82489DX is a 32-bit high performance interrupt controller. It allows the CPU to do 32-bit reads and writes to it. By memory mapping the 82489DX, system performance can be enhanced. It should be noted that the 82489DX does not support pipelining. Even though the 82489DX can be memory mapped, its functionality as an interrupt controller should be kept in mind while programming the virtual memory management control data structures. The caching policy for the page where the 82489DX is mapped should also be chosen with the functionality of the 82489DX in mind. For example, reads to the 82489DX should not be cached and writes should be write-through. Since 82489DX registers are aligned at 128-bit boundaries, memory mapping the 82489DX with an interleaved memory system should not be a problem.

JTAG CIRCUIT CONSIDERATIONS

The JTAG circuit is used for boundary scan test.
The JTAG pins are TCK (JTAG Clock), TRST (JTAG Reset), TDI (Test Data Input), TMS (Test Mode Select) and TDO (Test Data Output). The JTAG circuitry, if not used, should be properly deactivated so that it will not interfere with normal functional operation. The JTAG circuitry can be inactivated in any one of the following ways:

1. JTAG inactivation through TRST: TMS (Test Mode Select) should be either left open (an internal pull-up is provided) or tied to VCC. TRST can be pulsed low (bring it low and, after meeting the pulse width requirement, bring it back high again) to put the JTAG circuitry in the idle state. The TRST pulse brings the JTAG circuitry to the idle state, and TMS being kept high maintains the JTAG circuitry in the idle state.

2. JTAG inactivation through TCK: TMS (Test Mode Select) should be either left open (an internal pull-up is provided) or tied to VCC. TCK brings the JTAG circuitry to the idle state within 5 clocks. TMS, being held at logic high level, maintains the JTAG circuitry in the idle state.

16.4 Programming Guidelines

The 82489DX register data structure contains different fields to specify the mode of operation and the options available within each mode. Since certain options are applicable to specific modes only (for example, "Remote Read" mode applies only to the Interrupt Command Register; it has no relevance to the I/O unit's redirection tables), the following programming guidelines are provided.

UNIQUE ID REQUIREMENT

All the local units and I/O units connected to an ICC bus should have a unique ID before they can use the bus. This should be ensured by the programmer, since for ICC bus arbitration the units (whether local unit or I/O unit) arbitrate with their unit ID.

For future compatibility, the units should be assigned IDs starting with 0, 1, 2, etc., with the highest ID in the system being the number of units minus 1. So in a four 82489DX system there are four local units and four I/O units.
The ID starts with 0 and the highest ID in the system will be 7. Note that each unit should have a different ID in the system.

ATOMIC WRITE READ TO TASK PRIORITY REGISTER

Normally, the task priority register is written with the highest priority to mask certain low level interrupts before entering critical section code. In a system where the 82489DX is memory mapped, the CPU may buffer this task priority register write in its on-chip write buffer. The following scenario can happen in such a situation:

1. The CPU posts the task priority register write to its on-chip write buffer and enters the critical code.
2. A lower priority interrupt (which should not enter the critical code) interrupts the CPU before the write buffer gets flushed into the task priority register.
3. The CPU accepts the lower priority interrupt.

To avoid this situation, an atomic write-read to the task priority register should be done. The read following the write ensures that the write buffer is flushed to the task priority register, and the atomicity ensures that no interrupt will be accepted by the CPU during its write to task priority. It should be noted that if the CPU does the interrupt acknowledge cycle only after flushing the write buffers, then the above situation may not arise.

CRITICAL REGIONS AND MUTUAL EXCLUSION

Each 82489DX has a single Interrupt Command Register that it uses to send interrupts to other processors. The programmer should make sure to synchronize access to this register. Specifically, 1) writing all fields of the register, 2) sending the interrupt message (by writing the LSB register), and 3) waiting for Delivery State to become Idle again, should occur as a single atomic operation. For example, if interrupt handlers are allowed to send inter-processor interrupts, then interrupt dispensing to the processor must be disabled for the duration of these activities.
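The three steps above can be sketched as one uninterrupted sequence. This is an illustrative model only, not Intel's reference code: FakeLocalUnit, send_ipi and the disable_irq/restore_irq callbacks are hypothetical names, the register offsets are the Interrupt Command Register indices from the register summary shifted left by four, and the position of the Delivery State bit is an assumption because this section does not give it.

```python
ICR_LOW_OFFSET = 0b110000 << 4     # Interrupt Command Register (31:0)
ICR_HIGH_OFFSET = 0b110001 << 4    # Interrupt Command Register (63:32)
DELIVERY_STATUS_BIT = 1 << 12      # ASSUMPTION: bit position is not given in this section


class FakeLocalUnit:
    """Stand-in for a memory-mapped local unit; records every write."""

    def __init__(self):
        self.writes = []

    def write(self, offset, value):
        self.writes.append((offset, value))

    def read(self, offset):
        return 0  # in this model Delivery State always reads back as Idle


def send_ipi(unit, destination_word, command_word, disable_irq, restore_irq):
    """Send one inter-processor interrupt without letting a handler interleave."""
    saved = disable_irq()                 # no interrupt dispensing during the sequence
    try:
        unit.write(ICR_HIGH_OFFSET, destination_word)   # 1) write the remaining fields
        unit.write(ICR_LOW_OFFSET, command_word)        # 2) this write sends the message
        while unit.read(ICR_LOW_OFFSET) & DELIVERY_STATUS_BIT:
            pass                                        # 3) wait for Delivery State = Idle
    finally:
        restore_irq(saved)
```

On real hardware the polling loop would normally be bounded by a timeout; that is left out here to keep the sketch short.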
INTERRUPT COMMAND REGISTER PROGRAMMING SEQUENCE

Writing the Interrupt Command Register (31:0) has the side effect of sending the interrupt. The destination is provided in the Interrupt Command Register (63:32). Therefore the Interrupt Command Register (63:32) should always be programmed before the Interrupt Command Register (31:0):

1. Program Interrupt Command Register (63:32)
2. Program Interrupt Command Register (31:0)

INTERRUPT VECTOR

Two different interrupts should not be programmed with the same interrupt vector.

LOCAL AND I/O UNIT

Only the Interrupt Command Register supports "Remote Read" delivery mode. Local and I/O unit interrupts do not support "Remote Read".

ICR (INTERRUPT COMMAND REGISTER)

1. ExtINT delivery mode is not supported for all destination shorthands.
2. "Remote Read" should always be programmed as an "Edge" triggered interrupt.
3. "Remote Read" should always be programmed with Physical destination mode (and not with Logical destination mode). Broadcast addressing should not be used for Remote Read.
4. For "All Incl Self" and "All Excl Self" destination shorthands, "Remote Read" delivery mode should not be used.
5. For "All Incl Self" and "Self" destination shorthands, "Reset" delivery mode should not be used.
6. For the "All Excl Self" destination shorthand, if "Reset" delivery mode is used, it should be ensured at system level that only one processor executes this instruction at any time.
7. Messages can be sent out in "Logical" or "Physical" mode with a destination ID of all 1's, depending on the way the Destination Mode entry is programmed. In brief, "All Incl Self" and "All Excl Self" support both "Logical" and "Physical" addressing modes.
8. When a destination shorthand (i.e., broadcast) is used with Lowest Priority delivery mode, then even though all units participate in arbitrating for the destination, only the lowest priority unit gets the message. So even though the addressing is broadcast, since the delivery mode is lowest priority only one unit gets the message.
9. When a destination shorthand (i.e., broadcast) is used with "Fixed" delivery mode, then all the units get the message.

ISR/IRR/TMR

Bits 0-15 of IRR/ISR/TMR do not track interrupts. No interrupt with a vector number from 0 to 15 can be posted. The total number of interrupts supported is 240. When reading the lowest 32 bits of these registers, 0 will be returned for the lower 16 bits.

FOCUS PROCESSOR

Focus processor is applicable only within the addressed units.

ExtINT Interrupt Posting

The external interrupt has no priority relationship with the 82489DX priority. But when posting an interrupt to the processor, if both an external interrupt and an 82489DX interrupt are pending, the 82489DX could post either one to the processor. In the 82489DX implementation, it posts the external interrupt whenever there is no other 82489DX interrupt that can be posted to the processor. It should also be noted that external interrupts cannot be masked by raising task priority. However, they can be masked by the mask bit in the table entry for the external ("ExtINT") interrupt.

The ExtINT interrupts are specific in their characteristics in that they do not have any priority relationship with the rest of the interrupt structure. ISR and IRR bits in the 82489DX are used to do the housekeeping functions for interrupt priority. Since ExtINT interrupts do not have any priority relationship, ISR and IRR bits are not maintained for external interrupts. As far as interrupt acceptance is concerned, if more than one ExtINT interrupt is directed towards a local unit, that local unit treats all the ExtINT interrupts directed to it as only one ExtINT interrupt. This leads to an important point: in a system, not more than one interrupt should be programmed as ExtINT interrupt type with the same destination.
It should be noted that there can be more than one ExtINT type of interrupt in a system, with each having a different local unit as destination.

Synchronizing Arb IDs

Initialization of an 82489DX's local unit ID is implementation dependent. In some platforms, power-on reset will latch the right values into the 82489DXs; in other platforms, unique IDs may be assigned by initialization firmware. In both cases the 82489DX I/O unit should be assigned a unique ID by initialization firmware. The important point is that the 82489DXs are required to have unique IDs before they can use the bus and, in addition, all their Arb IDs must be "in sync". Synchronizing Arb IDs is accomplished as a side effect of a "deassert reset" interrupt command. This resets the (rotating) Arb ID to the (constant) unit ID; it assumes that all 82489DXs have their unique ID.

LOWEST PRIORITY

"Only once delivery" semantics for a group destination is guaranteed only if multiple fixed deliveries of the same interrupt vector are not mixed.

For lowest priority arbitration to work, all the arbitration IDs of local 82489DXs in the system should be in sync. This means that after local unit IDs are written in all local units (each ID should be different from the other IDs), a RESET DEASSERT message should be sent in ALL INCLUSIVE mode. The RESET DEASSERT message should be sent before the system is used for lowest priority arbitration. This ensures that all Arb IDs are also different. (Arb IDs are copied from local unit IDs during the RESET DEASSERT message.) If the RESET DEASSERT message is not sent, "only once delivery" semantics may not be guaranteed in the cases where lowest priority arbitration is used in the system.

DISABLING LOCAL UNIT

Once the 82489DX is enabled by setting bit 8 of the spurious vector register to 1, the user should not disable the local unit by resetting the bit to 0. Doing so will put the local unit in an inconsistent state. The local unit can be disabled by sending a "reset" interrupt message to the local unit.
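One way for software to respect this rule is to route every write to the spurious vector register through a helper that refuses to clear the enable bit once it is set. The sketch below models the register file as a plain dictionary; the function names are hypothetical, the offset is the Spurious Vector Register index from the register summary shifted left by four, and treating the low 8 bits as the vector field is an assumption not stated in this section.

```python
SPURIOUS_VECTOR_OFFSET = 0b001111 << 4   # Spurious Vector Register
UNIT_ENABLE_BIT = 1 << 8                 # "bit 8" as described in the text
VECTOR_MASK = 0xFF                       # ASSUMPTION: vector lives in the low 8 bits


def enable_local_unit(regs):
    """Set the enable bit, leaving the other bits of the register unchanged."""
    regs[SPURIOUS_VECTOR_OFFSET] = regs.get(SPURIOUS_VECTOR_OFFSET, 0) | UNIT_ENABLE_BIT


def write_spurious_vector_register(regs, value):
    """Write the register, rejecting any value that would clear a set enable bit."""
    current = regs.get(SPURIOUS_VECTOR_OFFSET, 0)
    if current & UNIT_ENABLE_BIT and not value & UNIT_ENABLE_BIT:
        raise ValueError("enable bit must not be cleared; send a Reset message instead")
    regs[SPURIOUS_VECTOR_OFFSET] = value


def set_spurious_vector(regs, vector):
    """Change only the vector field, preserving the enable bit."""
    if not 0 <= vector <= VECTOR_MASK:
        raise ValueError("vector must fit in 8 bits")
    current = regs.get(SPURIOUS_VECTOR_OFFSET, 0)
    write_spurious_vector_register(regs, (current & ~VECTOR_MASK) | vector)
```

A driver that only ever calls set_spurious_vector after enable_local_unit cannot accidentally put the local unit into the inconsistent state described above.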
ISSUING EOI

Issuing EOI (End of Interrupt) indicates the end of the service routine to the 82489DX. The ISR bit which is set during the INTA cycle gets cleared by EOI. This section discusses the relevance of EOI to the specific types of interrupts and its timing related to interrupt deassertion.

EXTERNAL INTERRUPTS AND EOI

External interrupts should be programmed as edge type. INTA cycles to external interrupts are taken automatically as EOI by the 82489DX. This is similar to AEOI (Automatic End of Interrupt) of the 8259A. So there is no need to issue EOI to the 82489DX for external interrupt servicing. This is done to achieve software transparency in the compatible mode.

SPURIOUS INTERRUPTS AND EOI

Spurious interrupts do not have any priority relationship to other interrupts in the system. So IRR is not set for spurious interrupts. EOI should not be issued for spurious interrupts. It is advisable not to share the spurious interrupt vector with any other interrupt. If the spurious interrupt vector is shared with some other interrupt, then whether to issue EOI while servicing depends on the source of the interrupt. If the source is a spurious interrupt (for which the corresponding IRR is not set), then EOI should not be issued. If the source is a valid interrupt sharing the spurious interrupt vector (for which the IRR is set), then EOI should be issued.

NMI AND EOI

For NMI type of interrupt no IRR bit is set. So EOI should not be issued while servicing NMI type of interrupts.

TASK PRIORITY REGISTER

The task priority register is used to specify the priority of the task the processor is executing. In the 8259 the priority is defined only among the interrupts that it handles. The 82489DX goes further in handling priority. In a multitasking system, in addition to device interrupts, various tasks have different priorities, and the 82489DX allows consideration of priority at system level. The processor specifies the priority of the task it executes by writing to the task priority register. Any interrupts at and below the task priority will then be masked temporarily until the task priority is lowered. The masking granularity is at priority level. Out of 256 interrupt vectors, 16 priority levels are specified and 16 vectors share one priority level. Since the masking granularity of the task priority register is at priority level, a group of 16 vectors gets masked when the task priority register is increased by one level. When task priority is at its minimum level of 0, interrupt vectors having levels 1 to 15 are passed to the CPU. Stated in other words, even when the task priority register is at its minimum (level 0), interrupt vectors at level 0 will be masked. This means that interrupts should not be programmed with vectors 0 to 15. So out of 256 interrupt vectors, only 240 interrupt vectors (vectors 16 to 255) can be used in the 82489DX.

ExtINT INTERRUPT AND TASK PRIORITY

The ExtINT interrupt does not have any priority relationship with other interrupts or with the task priority register. So an ExtINT interrupt cannot be masked by raising task priority. It can be masked by writing to the vector table entry which corresponds to the ExtINT interrupt.

REMOVING MASKS

When enabling units and removing Mask bits in situations where a device may already be injecting interrupts into the 82489DX system, the Mask in the Redirection Table should be removed last to ensure proper initial state (e.g., Remote IRR bit matching IRR in local unit).

DELIVERY MODE AND TRIGGER MODE

It is software's responsibility to make sure that Delivery Mode and Trigger Mode are set to meaningful combinations as listed below.

Delivery Mode      Trigger Mode
Fixed              edge/level
Lowest Priority    edge/level
Remote Read        edge
NMI                level
Reset              level
ExtINT             edge

Software is also responsible for not using meaningless Delivery Modes in Redirection Table entries and Local Vector Table entries (e.g., use of Remote Read delivery mode).
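Because the hardware leaves these checks to software, a driver or configuration tool may want to validate a table entry before programming it. The following minimal sketch encodes the combinations from the table above, the Remote Read restriction for table entries, and the vector rules from the Task Priority Register discussion; the function and constant names are illustrative and do not correspond to any Intel-defined interface.

```python
# Meaningful Delivery Mode / Trigger Mode combinations, as listed in the table above.
VALID_TRIGGER_MODES = {
    "Fixed": {"edge", "level"},
    "Lowest Priority": {"edge", "level"},
    "Remote Read": {"edge"},
    "NMI": {"level"},
    "Reset": {"level"},
    "ExtINT": {"edge"},
}

# Delivery modes that are meaningless in Redirection / Local Vector Table entries.
NOT_FOR_TABLE_ENTRIES = {"Remote Read"}


def priority_level(vector):
    """16 vectors share one priority level, so the level is the upper nibble."""
    return vector >> 4


def is_usable_vector(vector):
    """Vectors 0 to 15 (priority level 0) are always masked; 16 to 255 are usable."""
    return 16 <= vector <= 255


def check_table_entry(delivery_mode, trigger_mode):
    """Return a list of problems found in a proposed table entry (empty if none)."""
    problems = []
    if delivery_mode not in VALID_TRIGGER_MODES:
        problems.append("unknown delivery mode: " + delivery_mode)
    elif trigger_mode not in VALID_TRIGGER_MODES[delivery_mode]:
        problems.append(delivery_mode + " cannot be " + trigger_mode + " triggered")
    if delivery_mode in NOT_FOR_TABLE_ENTRIES:
        problems.append(delivery_mode + " is not meaningful in a table entry")
    return problems
```

For instance, check_table_entry("ExtINT", "level") reports one problem, which matches the edge-only requirement for ExtINT.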
ASSIGNING INTERRUPT VECTORS

Software has total control over the assignment of interrupt vectors to interrupt sources. The operating system writer should be aware of a number of things when doing this assignment. Some processor architectures assign a predefined meaning to some of the vectors (i.e., entries in the interrupt table) as entry points to certain trap and exception handlers (e.g., divide error, invalid opcode, page fault, etc.). The programmer is strongly advised not to reuse these vectors.

The programmer must also be careful when using the same vector number to represent different interrupt sources (sharing vectors). This is especially true for level triggered interrupts. When multiple sources with different Redirection Table entries share an interrupt vector, any of the sources deactivating its level signal will remove the interrupt request for all sources. Giving each interrupt source its own interrupt vector is in any case the preferred approach.

SENDING INTER-PROCESSOR INTERRUPTS

Each 82489DX has a single Interrupt Command Register that it uses to send interrupts to other processors. It is software's responsibility to synchronize access to this register. Specifically, 1) writing all fields of the register, 2) sending the interrupt message (by writing the low register), and 3) waiting for Delivery State to become Idle again, should occur as a single atomic operation. For example, if interrupt handlers are allowed to send inter-processor interrupts, then interrupt dispensing to the processor must be disabled for the duration of these activities.

DELAY WITH LEVEL TRIGGERED INTERRUPTS

When a level triggered interrupt source deasserts its interrupt input, the destination will clear the interrupt's IRR bit only after receiving the message from the ICC bus. This introduces a small delay between the removal of the interrupt at the source and the removal of the interrupt at the processor.
To avoid generating unnecessary interrupts, the interrupt handler should remove the interrupt at the source (at the device) as early as possible in the handler. In any case, handlers should be able to deal with unnecessary interrupts.

INTERRUPT MASKING
There are a number of levels at which interrupts can be masked, each resulting in a different behavior on interrupt delivery.
• First, interrupt injection (or delivery) can be masked by setting the Mask bit in the interrupt's Redirection Table or local Vector Table entry. These interrupts are ignored; no message is sent for them. Granularity is an individual interrupt.
• Second, each 82489DX can individually mask interrupt dispensing by raising its Task Priority to some level. This 82489DX will not dispense interrupts of this and lower priority to its processor unless it is currently the focus of the interrupt. Note again that the 82489DX is designed to operate as fully nested with non-specific EOI (to use existing 8259A terminology). There is no explicit interrupt mask (such as IMR) and there is no notion of specific EOI.
• Third, each processor may provide a mechanism that masks all interrupt dispensing to it, using processor supplied instructions or status bits to do so. This does not interfere with lowest priority arbitration of the processor's 82489DX local unit.

CHANGING REDIRECTION TABLES
Redirection Tables are typically set up at initialization time. When modifying a Redirection Table entry "on the fly" the programmer must be aware of state kept at other 82489DXs relative to the interrupt being modified.

DEVICE DRIVERS WITH 82489DX
It is strongly recommended to read the device status registers before servicing the device. This is because if an edge triggered device deasserts its interrupt before the interrupt acknowledge cycle (which it should NOT do), the 82489DX will NOT give a spurious vector. It will give the genuine interrupt vector corresponding to the device. So the interrupt service routine should validate the interrupt request before servicing the device.

RESET DEASSERT
A side effect of a reset deassertion message broadcast on the ICC bus is that all 82489DX local units reset their Arb ID to their unit ID. Interrupt commands that use the "self" Destination Shorthand do not generate a message on the ICC bus. If software only wants to generate the side effect of resetting Arb IDs, it should use a command with Logical Destination Mode and a Destination field containing all zeroes.

SYSTEM HARDWARE AND SOFTWARE DESIGN CONSIDERATIONS

Design Consideration 1
Description: The following design consideration has to be taken care of when using the ISP (82357) as an external interrupt controller. The 82489DX allows an external 8259 type interrupt controller to be connected at one of its inputs. The mode associated with the interrupt input that has the 8259 connected to it is called ExtINTA mode. The 82489DX allows only the EDGE TRIGGERED programming option for ExtINTA mode. But in the case of the 82357, the INT output from the ISP stays high when more than one interrupt is pending at its inputs. It does not always inactivate its INT output after the INTA cycle. This leads to a situation where the ISP keeps the interrupt at high level continuously and waits for an INTA cycle. But since the 82489DX expects an edge for interrupt sensing (for ExtINTA interrupts), it does not pass the interrupt to the CPU and further interrupts are lost. So external circuitry should monitor the end of the SECOND CYCLE of the INTA cycle and force an inactive state at the 82489DX's input. To avoid glitches at the 82489DX input, this external logic should clear its output only at the end of the second INTA cycle. It should be set by the high going 82357 output. It should never be cleared by the low going 82357 output. That is, it should not follow the 82357 output.

Design Consideration 2
Description: The following design consideration has to be taken care of when using the 82489DX in EISA systems. The EISA ISP (82357) chip integrates an 8259A. It additionally allows sharing of interrupts. To facilitate this sharing it has a programmable register, ELCR (Edge/Level trigger control register), by which certain interrupt inputs can be programmed as edge (low to high except for RTC) or level (the level is active low). The determination of edge or level is done during initial configuration of the EISA system by reading the interrupt description data structures from the EISA add-in boards. The solution is to have programmable logic at the interrupt inputs so that the 82489DX is compatible with the EISA ISP. This will introduce one more register and logic to support it. This should be an 11-bit programmable register and an array of Ex-OR logic (12 ExOR gates or equivalent PLD). The ISP allows programmability of the following interrupts:

INT3 INT4 INT5 INT6 INT7 INT9 INT10 INT11 INT12 INT14 INT15

In addition to the above 11 interrupts, it fixes INT8 to be an active low edge triggered interrupt. INT8 is the only case of the active low edge triggered type. So the following logic can be used to add programmability in an 82489DX based EISA system. Before being connected to the 82489DX, these 11 interrupt lines should pass through an array of 11 Ex-OR gates. (#INT8, which comes from the Real Time Clock, is always active low edge triggered. #INT8 can be passed through an inverter since there is no need for programmability.) One input of each Ex-OR gate connects to the corresponding INT pin and the other input connects to a bit of the programmable register. The output of the Ex-OR gate is connected to the 82489DX. The idea of the Ex-OR gate is to use it as a controlled inverter.

(Figure 290446-43: programmable register bits D0 through D10 and the Ex-OR gates feeding the 82489DX INTIN inputs.)

INTIN are the interrupt inputs to the 82489DX and INT are the system interrupts.
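The controlled-inverter idea can be sketched in a few lines. This is a behavioral illustration only, assuming the convention given in the text that a register bit of 0 marks an edge triggered (active high) input and a bit of 1 marks a level triggered (active low) input. The function name, the dictionary representation and the use of a set in place of a bit layout are hypothetical; the text does not specify which register bit serves which INT line.

```python
# The 11 interrupts whose edge/level setting the ISP lets software program.
PROGRAMMABLE_INTS = (3, 4, 5, 6, 7, 9, 10, 11, 12, 14, 15)


def intin_levels(int_levels, level_triggered):
    """Model the Ex-OR array: each gate outputs (INT level) xor (register bit).

    int_levels:      dict mapping INT number -> electrical level (0 or 1)
    level_triggered: set of INT numbers whose register bit is written with 1
    Returns a dict mapping INT number -> level presented to the 82489DX.
    """
    outputs = {}
    for n in PROGRAMMABLE_INTS:
        register_bit = 1 if n in level_triggered else 0
        outputs[n] = int_levels[n] ^ register_bit  # bit = 1 inverts, bit = 0 passes
    return outputs


# Example: INT5 and INT10 are shared, active low level interrupts.
level_ints = {5, 10}
idle = {n: (1 if n in level_ints else 0) for n in PROGRAMMABLE_INTS}
asserted = dict(idle)
asserted[5] = 0   # a device pulls the active low INT5 line
asserted[3] = 1   # an edge triggered device raises INT3
```

With every line idle the model presents 0 on all outputs, and asserting either kind of line presents a 1, so the 82489DX sees one active high polarity regardless of how the EISA line is configured.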
The Ex-OR gating register is programmed after the EISA configuration has been read from the add-in boards, according to how these interrupt lines are going to be used in that particular configuration. If a particular input is edge triggered, the corresponding bit in the register is written with 0. If a particular input is level triggered, the corresponding bit in the register is written with 1.

The 8259 by itself does not have polarity control, whereas the 8259 as implemented in EISA chipsets has polarity control. Similarly, the APIC by itself does not have a polarity register. So the polarity register should be programmed as part of the system BIOS and not the APIC BIOS.

Design Consideration 4
This is related to ADS#, BGT# and CS# timings. For bus cycles not intended for the 82489DX (CS# = 1 where the 82489DX is supposed to sample it), any change in the CS# line while ADS# is still active may erroneously cause a RDY# to be returned from the 82489DX. Anomalous behavior may result if, for BGT# tied low cases, a) BGT# goes away just one clock after ADS# or b) ADS# is still active and CS# changes during this period. For other cases anomalous behavior results if CS# changes when ADS# is still active.

The following considerations are important from a timing point of view. Always limit the pulse width of the 82489DX ADS# to one CLKIN. Also avoid changing levels on the BGT#/CS# lines when ADS# is active for cases identified as BGT# tied low (BGT# sampled low when ADS# goes active). Also avoid changing levels on the CS# line when BGT# is active.

Design Consideration 3
The Icc bus is an open drain bus with a drive capacity of only 4 mA. Since data is transmitted at each Icc clock, the "charging" of the Icc bus should be fast enough to ensure a proper logic level at each clock edge. The Icc bus needs pull-up resistors since it is an open drain bus. Since the drive is only 4 mA, the pull-up resistor value cannot be less than 5V / 4 mA.
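The lower bound quoted above is a direct application of Ohm's law, and together with the trace capacitance the chosen resistor sets the RC time constant of the bus. The sketch below works the numbers; the 50 pF capacitance is purely an assumed placeholder for illustration, not a figure from this document, and the function names are hypothetical.

```python
def min_pullup_ohms(supply_volts=5.0, sink_amps=4e-3):
    """Smallest pull-up the open drain driver can pull low: R = V / I."""
    return supply_volts / sink_amps


def rc_time_constant_s(resistance_ohms, capacitance_farads):
    """Time constant of the pull-up charging the bus capacitance: tau = R * C."""
    return resistance_ohms * capacitance_farads


r_min = min_pullup_ohms()            # 5 V / 4 mA = 1250 ohms (1.25 kilohms)

# ASSUMPTION: 50 pF is a made-up trace capacitance used only to show the
# calculation; the real value depends on trace length and board stack-up.
ASSUMED_TRACE_CAPACITANCE_F = 50e-12
tau = rc_time_constant_s(r_min, ASSUMED_TRACE_CAPACITANCE_F)
```

Because the resistor cannot go below this bound, the only remaining way to shorten the rise time is to reduce capacitance, which is why trace length matters.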
This being the limit of the resistor value, the length and the characteristics of the Icc trace force a capacitance value. Together, the resistor and the capacitance give the Icc bus waveform an RC time constant. So electrical consideration has to be given to the layout of the Icc bus, and controlled impedance practice should be exercised. The length of the trace should be kept to a minimum. If the length of the Icc bus cannot be kept below, say, 6 inches because of the mechanical design of the system, external line drivers should be added to the Icc bus and the Icc bus should be simulated with the added driver characteristics.

(Figure 290446-44)
NOTE: R1 can be typically 1K. R2 is designed from the simulation results.

Design Consideration 5
The 82489DX does not recognize the interrupt when an edge occurs at the interrupt input pin while the interrupt is masked. When it is later unmasked there is no further edge, so the 82489DX never passes that forgotten edge and the interrupt channel becomes unusable after that. The recommendation is that the 82489DX should be unmasked first and then the device interrupt should be enabled in the device register. In this way software can ensure that an edge will always occur after an interrupt is unmasked.

Design Consideration 6
Description: Edge triggered interrupts should not deassert their output until they are acknowledged by an INTA cycle from the CPU.
Issue: The 82489DX employs glitch detection logic for edge triggered interrupts. To make sure the detected edge interrupt is not a glitch, the 82489DX samples the input again before sending the interrupt message. The time difference between the first sampling, which finds the interrupt active, and the second sampling (just before sending the interrupt) is not a constant number. This is because the ICC bus might have been occupied by other messages.
For example, if during the first sampling INTIN0 and INTIN15 are both detected active, then after sending the message for INTIN0 the 82489DX samples INTIN15 again before sending the message for INTIN15. In between, the ICC bus might have been occupied by other messages. So even if an edge triggered interrupt is held active high for a really long time and then brought low before the INTA cycle, it is considered a glitch, because the second sampling may have occurred just when the interrupt line went low. Once the glitch detection circuitry has found this "glitch", it goes back to the state where it starts sampling and waiting for an active edge to occur. This takes more than one clock cycle (CLK), and if the "glitch interrupt" generates an edge before that time, after the second sampling of the low level is done, then the edge is lost forever. Since the time at which the second sampling is done is unknown, the best way is to make sure that edge triggered interrupts do not deassert their outputs until they are acknowledged by an INTA cycle from the CPU.

It is found that in some cases the 8259 can generate brief active low pulses on its output. So the glue logic between the 8259 and the 82489DX input pin should make sure that the 82489DX input pin is cleared only after getting the second interrupt acknowledge cycle. The glue logic should not just follow the 8259 output. Put in other words, after the interrupt acknowledge cycle to the 8259, if the 8259 input is seen active high, it should generate an edge at the 82489DX input. Moreover, even if the 8259 output goes low, the glue logic should not lower its output, since the only time when the glue logic can deassert its output is when it finds an interrupt acknowledge cycle for the 8259. The following PLD equations and schematics serve as an example for the glue logic between the 8259 and the 82489DX.
(Figure 290446-45: glue logic schematic. Inputs M/IO#, D/C#, W/R#, CYCLE# and 8259INT; internal terms INTAset, Set8259Cyc, 8259Cyc and SetTimer; PLD timer stages T0 through T5; output APIC Input.)

BCLK is the standard EISA BCLK. CYCLE# is an external signal that indicates ADS# was asserted earlier and RDY low has not yet been returned.

APICinput  = /T0 * /Reset * /APICinput * 8259INT * INTAset   ; Sample 8259 interrupt
           + /T0 * /Reset * APICinput * INTAset              ; Hold till it is cleared by delayed interrupt acknowledge
           + /INTAset * 8259INT

Set8259Cyc = /Reset * INTA * ExtINTA                         ; This INTA cycle is for 8259

8259Cyc    = Set8259Cyc * /T0 * /Reset                       ; Set8259Cyc will set 8259Cyc and T0 will clear it
           + /T0 * /Set8259Cyc * 8259Cyc * /Reset            ; Hold 8259Cyc till T0 clears it

INTAset    = /Reset * /INTAset * INTA                        ; Wait for very first INTA cycle after reset
           + /Reset * INTAset                                ; Once first INTA cycle after reset is found, set INTAset

SetTimer   = 8259Cyc * /A2 * /RDYlow                         ; Start the timer at end of second INTA cycle

T0         = SetTimer * /T5 * /Reset                         ; SetTimer will set T0 and T5 will clear it
           + /SetTimer * T0 * /T5 * /Reset                   ; Till T5 clears it hold T0

T1 := T0 * /Reset * /T5   ; Follow T0 after one clock for setting but clear along with T0
T2 := T1 * /Reset * /T5   ; Follow T0 after two clocks for setting but clear along with T0
T3 := T2 * /Reset * /T5   ; Follow T0 after three clocks for setting but clear along with T0
T4 := T3 * /Reset * /T5   ; Follow T0 after four clocks for setting but clear along with T0
T5 := T4 * /Reset         ; Follow T0 after 5 clocks for setting

(PLD equations: 290446-46)

NOTES: T1, T2, T3, T4 and T5 are clocked signals and the others are combinatorial. This circuit and these PLD equations are given for concept clarification purposes. They are not tested.
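The intent of the glue logic, as opposed to its exact PLD implementation, can be illustrated with a small behavioral model. The sketch below is an assumption-laden simplification: it omits Reset, the INTAset start-up behavior and the individual timer stages, and it collapses the interval during which T0 is active into a single hypothetical `clear_window` flag. The class and method names are invented for illustration.

```python
class GlueLatch:
    """Behavioral sketch of the 8259-to-82489DX glue logic output.

    The output is set whenever the 8259 INT output is high, holds even if the
    8259 output drops, and is cleared only during the clear window that
    follows the second INTA cycle.
    """

    def __init__(self):
        self.apic_input = 0

    def step(self, int_8259, clear_window=False):
        if clear_window:
            self.apic_input = 0      # only the delayed INTA acknowledge clears it
        elif int_8259:
            self.apic_input = 1      # sample the 8259 interrupt
        # otherwise hold: a low going 8259 output is never followed
        return self.apic_input


latch = GlueLatch()
trace = [
    latch.step(1),                     # 8259 raises INT
    latch.step(0),                     # brief low pulse from the 8259 is ignored
    latch.step(1, clear_window=True),  # end of second INTA cycle forces low
    latch.step(1),                     # 8259 still pending: a new rising edge
]
```

The last two steps show why a still-pending 8259 produces a fresh edge at the 82489DX input after each acknowledge, which is the property the edge triggered ExtINTA input depends on.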
INTAset is needed because some 8259 logic drives its INT output to 1 at power on and deactivates it only after 8259 initialization (which should happen after APIC initialization). Since the APIC needs to detect a rising edge from the 8259, it is essential to follow the 8259 output until the first interrupt. This is the only occasion on which the 8259 output is simply followed.

DIRECTIONS FOR EASY MIGRATION TO FUTURE INTEGRATED APIC
The following are the software programming directions Intel strongly recommends for easy migration from the 82489DX to an integrated APIC. The audience for this portion of the document is both hardware designers and firmware developers for APIC based systems. In the following discussion, the BIOS is viewed functionally as two subsections: 1) the APIC BIOS, which covers all functions related to interrupt vectors, priority and interrupt distribution, and 2) the remaining portion, referred to as part of the system BIOS, which is responsible for interrupt polarity programming, starting the next processor, etc. Note that the names APIC BIOS and APIC DRIVER are used interchangeably in the following discussion. Different operating systems refer to such a functional module differently.

Consideration 1
Question: The logical destination register in a future implementation of the APIC may have only the 8 MSBs defined, whereas the 82489DX has 32 bits specified. Will this hinder binary level compatibility?
Response: In logical destination (flat addressing) mode the 82489DX can go up to 32 CPUs, whereas a future APIC can go up to 8 CPUs with flat logical addressing mode. For binary compatibility, it is strongly recommended that 82489DX software use ONLY the 8 MSBs of the logical destination register.

Consideration 2
Question: Present day MP systems with external control ports for starting the next processor may program those external control ports to start the next processor. The APIC DRIVER may use external control ports for starting the next processor.
In future implementations of the APIC, the starting of next processors may use more refined mechanisms that do not use external control ports. Will this introduce a compatibility problem?
Response: Again, the starting of the next processor is really part of the MP system DRIVER, and it will vary depending on the mechanism used to start the next processor. If a future implementation starts the next processor using new mechanisms, the portion of the MP DRIVER that starts the next processor will be changed accordingly. Even though this will not result in any change in the APIC DRIVER, which deals with interrupt priority, distribution, etc., the corresponding change will be needed in the portion of the DRIVER that starts application processors. One possible method of implementing the software is to use the version register. The version register is different in the 82489DX and in future implementations of the APIC. Taking care of these differences, such as the mechanism for starting the next processor, should be possible using the version register.

Consideration 3
Question: The APIC architecture, by its nature, seems to misinterpret spurious interrupts as genuine interrupts. That is, if an edge triggered interrupt goes inactive before the interrupt acknowledge cycle, the APIC, instead of giving a spurious interrupt vector, gives the genuine interrupt vector. Is it true that this is not the case with the 8259? If that is the case, drivers that do not check device status registers before servicing the device may work with the 8259 but may not work with the APIC. Is this a compatibility problem?
Response: No, this is not true. Even with the 8259 there is a time window in which a similar thing can happen. For example, if the interrupt goes inactive just after the first INTA cycle but before the second INTA cycle, the 8259 will also signal this spurious interrupt as a genuine interrupt. So drivers that do not check device status registers may also fail with the 8259.

Our strong recommendation to device drivers is to read the device status register before servicing the device.
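A handler that follows this recommendation has a simple shape: consult the device first, and if the device reports nothing pending, acknowledge the interrupt and leave. The sketch below illustrates only the control flow. FakeDevice, FakeLocalUnit and every method name are hypothetical stand-ins; a real driver would read an actual status register and write the 82489DX EOI register.

```python
class FakeDevice:
    """Hypothetical device whose status register reports a pending interrupt."""

    def __init__(self, pending):
        self.pending = pending
        self.serviced = 0

    def read_status(self):
        return self.pending

    def service(self):
        self.serviced += 1
        self.pending = False


class FakeLocalUnit:
    """Hypothetical stand-in for the local unit; only counts EOIs."""

    def __init__(self):
        self.eoi_count = 0

    def eoi(self):
        self.eoi_count += 1


def interrupt_service_routine(device, local_unit):
    """Validate the request before servicing; always finish with EOI."""
    if not device.read_status():
        local_unit.eoi()        # no valid source: just issue EOI and return
        return False
    device.service()
    local_unit.eoi()
    return True
```

The same routine works unchanged whether the vector was delivered for a genuine request or for one that was withdrawn before the acknowledge cycle. It applies to ordinary device interrupts only; the text notes separately that EOI should not be issued for NMI type interrupts.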
If the device status register indicates that there is no valid source of the interrupt, the service routine should just issue EOI and return. It should not service the device. This should take care of the new drivers that will be written for the APIC. To coexist with the 8259, the APIC interrupt input connected to the 8259 will be programmed for virtual wire mode. In virtual wire mode, the time window of the 8259 will apply. So the driver will behave the same way as it behaved with the 8259.

Consideration 4
Question: The EISA system has active low level polarity. The 82489DX itself does not have a polarity control register to support this EISA feature. Implementations using an external polarity register may implement the polarity register at different addresses. Will this introduce a problem for achieving the goal of a single binary?
Response: The 8259 by itself does not have polarity control, whereas the 8259 as implemented in an EISA chipset has polarity control. Similarly, the APIC by itself does not have a polarity register. When implemented in the ESC chipset, it will have a polarity control register. So the polarity register should be programmed as part of the EISA BIOS and not the APIC BIOS. Since the system BIOS or EISA BIOS should be able to take charge of any changes to the polarity control register, the APIC BIOS should not be affected by differences in the address of the polarity register.

Consideration 5
Question: The 8259 recognizes the interrupt when an edge occurs at the interrupt input pin even if the interrupt was masked. So when the interrupt input is later unmasked, the interrupt is posted to the CPU. The 82489DX does not register this edge; if the interrupt happens while it is masked, the 82489DX just ignores it. When it is later unmasked there is no further edge, so the 82489DX never passes that forgotten edge and the interrupt channel becomes unusable after that.
Response: When the interrupt is masked, logically the interrupt controller should ignore whatever happens there.
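The lost-edge behavior described in the question can be reproduced with a toy model that compares the two possible orderings of unmasking the input and enabling the device. This is a sketch under the stated behavior only (edges arriving while masked are ignored and not remembered); the class and function names are hypothetical.

```python
class EdgeTriggeredInput:
    """Toy model of an edge triggered input that forgets edges while masked."""

    def __init__(self):
        self.masked = True
        self.level = 0
        self.delivered = 0

    def unmask(self):
        self.masked = False

    def drive(self, level):
        rising_edge = level == 1 and self.level == 0
        self.level = level
        if rising_edge and not self.masked:
            self.delivered += 1   # edge seen while unmasked: interrupt passed on


def device_enabled_first():
    pin = EdgeTriggeredInput()
    pin.drive(1)      # device raises its line while the input is still masked
    pin.unmask()      # line is already high, so no further edge ever occurs
    return pin.delivered


def unmasked_first():
    pin = EdgeTriggeredInput()
    pin.unmask()
    pin.drive(1)      # the edge now arrives after unmasking
    return pin.delivered
```

In the first ordering the count of delivered interrupts stays at zero because the line is already high when the mask is removed; in the second the edge is seen and passed on.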
It is strongly recommended that the 82489DX be unmasked first and that the device interrupt be enabled afterwards. With this sequence, software can ensure that an edge will always occur at the APIC input only after the interrupt is unmasked.

Please contact Intel for the platform level specification for multiprocessor system design using the APIC.

UPI-41AH/42AH UNIVERSAL PERIPHERAL INTERFACE 8-BIT SLAVE MICROCONTROLLER

• UPI-41: 6 MHz; UPI-42: 12.5 MHz
• Pin, Software and Architecturally Compatible with all UPI-41 and UPI-42 Products
• 8-Bit CPU plus ROM/OTP EPROM, RAM, I/O, Timer/Counter and Clock in a Single Package
• 2048 x 8 ROM/OTP, 256 x 8 RAM on UPI-42, 1024 x 8 ROM/OTP, 128 x 8 RAM on UPI-41, 8-Bit Timer/Counter, 18 Programmable I/O Pins
• One 8-Bit Status and Two Data Registers for Asynchronous Slave-to-Master Interface
• DMA, Interrupt, or Polled Operation Supported
• Fully Compatible with all Intel and Most Other Microprocessor Families
• Interchangeable ROM and OTP EPROM Versions
• Expandable I/O
• Sync Mode Available
• Over 90 Instructions: 70% Single Byte
• Available in EXPRESS - Standard Temperature Range
• inteligent Programming Algorithm - Fast OTP Programming
• Available in 40-Lead Plastic and 44-Lead Plastic Leaded Chip Carrier Packages (See Packaging Spec., Order #240800-001) Package Type P and N

The Intel UPI-41AH and UPI-42AH are general-purpose Universal Peripheral Interfaces that allow the designer to develop customized solutions for peripheral device control. They are essentially "slave" microcontrollers, or microcontrollers with a slave interface included on the chip. Interface registers are included to enable the UPI device to function as a slave peripheral controller in the MCS Modules and iAPX family, as well as other 8-, 16-, and 32-bit systems. To allow full user flexibility, the program memory is available in ROM and One-Time Programmable EPROM (OTP).
All UPI-41AH and UPI-42AH devices are fully pin compatible for easy transition from prototype to production level designs.

Figure 1. DIP Pin Configuration (210393-2)
Figure 2. PLCC Pin Configuration (210393-3)

The complete document for this product is available on Intel's "Data-on-Demand" CD-ROM product. Contact your local Intel field sales office, Intel technical distributor, or call 1-800-548-4725.
November 1994 Order Number: 210393-008

UPI-C42/UPI-L42 UNIVERSAL PERIPHERAL INTERFACE CHMOS 8-BIT SLAVE MICROCONTROLLER

• Pin, Software and Architecturally Compatible with all UPI-41 and UPI-42 Products
• Low Voltage Operation with the UPI-L42 - Full 3.3V Support
• Integrated Auto A20 Gate Support
• Suspend Power Down Mode
• Security Bit Code Protection Support
• 8-Bit CPU plus ROM/OTP EPROM, RAM, I/O, Timer/Counter and Clock in a Single Package
• 4096 x 8 ROM/OTP, 256 x 8 RAM, 8-Bit Timer/Counter, 18 Programmable I/O Pins
• DMA, Interrupt, or Polled Operation Supported
• One 8-Bit Status and Two Data Registers for Asynchronous Slave-to-Master Interface
• Fully Compatible with all Intel and Most Other Microprocessor Families
• Interchangeable ROM and OTP EPROM Versions
• Expandable I/O
• Sync Mode Available
• Over 90 Instructions: 70% Single Byte
• Quick Pulse Programming Algorithm - Fast OTP Programming
• Available in 40-Lead Plastic, 44-Lead Plastic Leaded Chip Carrier, and 44-Lead Quad Flat Pack Packages (See Packaging Spec., Order #240800, Package Type P, N, and S)

The UPI-C42 is an enhanced CHMOS version of the industry standard Intel UPI-42 family. It is fabricated on Intel's CHMOS III-E process. The UPI-C42 is pin, software, and architecturally compatible with the NMOS UPI family.
The UPI-C42 has all of the same features as the NMOS family plus a larger user programmable memory array (4K), integrated auto A20 gate support, and the lower power consumption inherent to a CHMOS product. The UPI-L42 offers the same functionality and socket compatibility as the UPI-C42 as well as providing low voltage 3.3V operation. The UPI-C42 is essentially a "slave" microcontroller, or a microcontroller with a slave interface included on the chip. Interface registers are included to enable the UPI device to function as a slave peripheral controller in the MCS Modules and iAPX family, as well as other 8-, 16-, and 32-bit systems. To allow full user flexibility, the program memory is available in ROM and One-Time Programmable EPROM.

Figure 1. DIP Pin Configuration (290414-1)
Figure 2. PLCC Pin Configuration (290414-2)
Figure 3. QFP Pin Configuration (290414-3)

The complete document for this product is available on Intel's "Data-on-Demand" CD-ROM product. Contact your local Intel field sales office, Intel technical distributor, or call 1-800-548-4725.
October 1994 Order Number: 290414-003

8XC51SL/LOW VOLTAGE 8XC51SL KEYBOARD CONTROLLER

80C51SL - CPU with RAM and I/O; VCC = 5V ±10%.
81C51SL - 16K ROM Preprogrammed with SystemSoft Keyboard Controller and Scanner Firmware. VCC = 5V ±10%.
83C51SL - 16K Factory Programmed ROM. VCC = 5V ±10%.
87C51SL - 16K OTP ROM. VCC = 5V ±10%.
Low Voltage 80C51SL - CPU with RAM and I/O; VCC = 3.3V ±0.3V.
Low Voltage 81C51SL - 16K ROM Preprogrammed with SystemSoft Keyboard Controller and Scanner Firmware. VCC = 3.3V ±0.3V.
Low Voltage 83C51SL - 16K Factory Programmed ROM. VCC = 3.3V ±0.3V.
Low Voltage 87C51SL - 16K OTP ROM. VCC = 3.3V ±0.3V.
• Proliferation of 8051 Architecture
• Complete 8042 Keyboard Control Functionality
• 8042 Style Host Interface
• Optional Hardware Speedup of GATEA20 and RCL
• Local 16 x 8 Keyboard Switch Matrix Support
• Two Industry Standard Serial Keyboard Interfaces; Supported via Four High Drive Outputs
• 5 LED Drivers
• Low Power CHMOS Technology
• 4-Channel, 8-Bit A/D
• Interface for up to 32 Kbytes of External Memory
• Slew Rate Controlled I/O Buffers Used to Minimize Noise
• 256 Bytes Data RAM
• Three Multifunction I/O Ports
• 10 Interrupt Sources with 6 User-Definable External Interrupts
• 2 MHz-16 MHz Clock Frequency
• 100-Pin PQFP (8XC51SL), 100-Pin SQFP (Low Voltage 8XC51SL)

The 8XC51SL, based on Intel's industry-standard MCS® 51 microcontroller family, is designed for keyboard control in laptop and notebook PCs. The highly integrated keyboard controller incorporates an 8042-style UPI host interface with expanded memory, keyboard scan, and power management. The 8XC51SL supports both serial and scanned keyboard interfaces and is available in pre-programmed versions to reduce time to market. The Low Voltage 8XC51SL is the 3.3V version optimized for even further power savings. Throughout the remainder of this document, both devices will generally be referred to as 51SL.

The 8XC51SL is a pin-for-pin compatible replacement for the 8XC51SL-BG. It does, however, have some additional functionality. The additional functions are as follows:
1. 16K OTP ROM: The 8XC51SL-BG had only 8K of ROM.
2. New Register Set: The 8XC51SL adds a second set of host interface registers available for use in supporting power management. This required an additional address line (A1) for decoding. To accommodate this, one VCC pin was removed. However, in order to maintain compatibility with the -BG version, an enable bit for this new register set was added in configuration register 1. This allows the 8XC51SL to be drop-in compatible with existing 8XC51SL-BG designs; no software modifications are required.
NOTE: The changes made to the VCC pins require that all three VCC pins be properly connected. Failing to do so could result in high leakage current and possible damage to the device.

The complete document for this product is available on Intel's "Data-on-Demand" CD-ROM product. Contact your local Intel field sales office, Intel technical distributor, or call 1-800-548-4725.
November 1994 Order Number: 272271-002

AP-366 APPLICATION NOTE
89C124FX Data/FAX Modem Chip Set Reduction of Power Consumption
JIN LIEN LIN, TECHNICAL MARKETING ENGINEER
July 1992 Order Number: 292101-001

CONTENTS
INTRODUCTION
GENERAL DESCRIPTION
POWER DOWN DETECTION
BLOCKING DC PATH
CURRENT DRIFT
DESIGN TRADE OFF

INTRODUCTION
The 89C124FX Data/Fax Modem Chip Set Application Note provides the end user with applications and layout guidelines to reduce power consumption to a minimum when in the Power Down Mode.

BLOCKING DC PATH
When the power down detector shuts off the bipolar switches, some of the device input pins become current sinks and drain current from the controller output pins when the output pins are high. These controller output pins are the CLKOUT and SCLK outputs. To eliminate these DC paths in the power down mode, add an inverter from the controller clock output (CLKOUT) to the AFE clock input (CLKIN) and use an AND gate to change the SCLK output to the AFE to a low.

GENERAL DESCRIPTION
When the 89C124FX is in the power down mode, the microcontroller (89C126FX) consumes very little power (less than 0.5 mW). However, the external memory, voltage regulators and peripherals draw excess current that makes the overall system power consumption more than 80 mW.
This application note describes a method to reduce overall system power consumption to less than 1 mW when the 89C124FX is in the power down mode. Adding a power down feature in the microcontroller and reducing power sink to a minimum accomplishes this goal. Three steps are required to reduce the overall system power consumption in the power down mode:
1. Detect power down in the microcontroller and isolate the power source from potential current sinks in other components.
2. Inhibit the DC path from the power supply, through other components, to ground when the microcontroller is powered down.
3. Solve current drift problems due to floating inputs when power is removed from peripherals.

POWER DOWN DETECTION
Monitoring the clock output (CLKOUT, pin 65) from the microcontroller detects the power down condition. The CLKOUT pin is held high when the controller is in the power down mode. When power down is detected, the detector shuts off the +5V supply to all components except the microcontroller, RAM, and logic gates. The detector also shuts off power to the voltage regulators.

CURRENT DRIFT
When the controller is in the power down mode, the SDATA pin becomes a floating input that can draw current in excess of 300 µA. Placing a 510 kΩ resistor between the SDATA pin and ground solves this problem. Using a 100 kΩ resistor or lower may impede circuit functionality.

Figure 1. Placement of Current Drain Blocking Resistor (292101-1: 510K resistor from the SDATA line between the 89C126FX and the 89127 to ground)

DESIGN TRADE OFF
The design trade-off of the reduced power drain feature, besides added circuit complexity, is an additional 20 mW of power consumption when the 89C124FX is active.

(Schematic 292101-2: 2N3906 transistor, 74HC02 and 74HC08 gates, 15K and 510K resistors, 220 pF capacitor; signals FROM CONTROLLER CLKOUT, FROM CONTROLLER SCLK, TO AFE SCLK, POWER DOWN CONTROL INPUT, +5V, SWITCHED +5V, +12V.)
Figure 2.
89C124FX Power Consumption Circuit Modifications

AP-358 APPLICATION NOTE
Intel 82077SL for Super Dense Floppies
KATEN A. SHAH, APPLICATION ENGINEER
September 1992 Order Number: 292093-002

CONTENTS
INTRODUCTION
PURPOSE
PERPENDICULAR RECORDING MODE
PERPENDICULAR DRIVE FORMAT AND SPECIFICATION
PERPENDICULAR MODE COMMAND
82077AA/SL'S PERPENDICULAR MODE SUPPORT
PROGRAMMING PERPENDICULAR MODE
INTERFACE BETWEEN 82077AA/SL AND THE DRIVE
82077SL 4 MB DESIGN

INTRODUCTION
The evolution of the floppy has been marked in little over a decade by a significant increase in capacity accompanied by a noticeable decrease in the form factor, from the early 8 inch floppy disks to the present day 3.5 inch floppy disks. This decade will also be remarkable as OEMs adopt "Super" dense floppies. The most commonly seen floppies today are invariably one of two form factors - the 5.25" or the 3.5". Each form factor has several associated capacity ranges. The 5.25" floppies available are: 180 KB (single density), 360 KB (double density) and 1.2 MB (high density). The 3.5" floppies available are: 720 KB (double density) and 1.44 MB (high density). The emerging super dense floppies will evolve on the installed base of 3.5" floppies. The latest member of this set is the 2.88 MB (extra density) floppy, pioneered by Toshiba. The cornerstone of market acceptance of newer drives is compatibility with the older family. The 2.88 MB (formatted) floppy drive allows the user to format, read from and write to the lower density diskettes.
no As programs and data files get bigger, the demand for higher capacity floppies becomes obvious. There are several 3.5" higher density drives available from various vendors with capacities well into the 20 MB range. NEC has introduced a 13 MB drive and companies such as Insite have introduced 20 MB drives. Both drives require servo-mechanisms to accurately position the head over the right track. NEC's drive has the standard floppy drive interface whereas Insite's interface is SCSI based. The market for these floppy drives will remain a niche unless they receive!llpre OEM support. Initiated by Toshiba's research and innovation of the higher density 4 MB floppy disk media, the market is headed towards the super dense floppy drive. After 4-312 IBM's endorsement of the 4 MB (unformatted) floppy disk drives on their PS/2 model 57 and PS/2 model 90, several OEMs have shown a growing interest in "super" dense fl!?ppy disk drives. The latest DOS 5.0 supports the new 4 MB floppy media and BIOS vendors like Pheonix, AMI, Award, Quadtel, System Soft, and Microid all support the newer 4 MB floppy media. PURPOSE An important consideration to implement the 4 MB floppy drive is the floppy disk controller. Intel's highly integrated floppy disk controller, 82077AA/SL, has led the market in supporting the 4 MB floppy drive. Two ingredients are necessary to fully support these drives: I Mbps transfer rate and the perpendicular recording mode. This paper deals with a discussion of what the perpendicular mode is and how can a 4 MB floppy disk drive be implemented in a system using the 82077AA/SL. PERPENDICULAR RECORDING MODE Toshiba has taken the 2 MB floppy and doubled the storage capacity by doubling the number of bits per track. Toshiba achieved this by an innovative magnetic recording mode, called the vertical or the perpendicular recording mode. This mode utilizes magnetization perpendicular to the recording medium plane. 
This is in contrast to the current mode of longitudinal recording, which uses magnetization parallel to the recording plane. By making the bits stand vertical as opposed to on their side, recording density is effectively doubled (Figure 1). The new perpendicular mode of recording not only produces the sharp magnetization transitions necessary at higher recording densities, but is also more stable.

The 4 MB disks utilize barium ferrite coated substrates to achieve the perpendicular mode of magnetization. Current disks use a cobalt iron oxide (Co-γ-Fe2O3) coating for longitudinal recording. The barium ferrite ensures good head to medium contact, stable output and durability in terms of long use. High coercivity is required to attain high recording density for a longitudinal recording medium (the coercivity specification of a disk refers to the magnetic field strength required to make an accurate record on the disk). A conventional head could not be used in this case; however, the barium ferrite disk has low coercivity and the conventional ferrite head can be used.

The new combination heads include a pre-erase mechanism, i.e., the ferrite ring heads contain erase elements followed by the read/write head. These erase elements have deep overwrite penetration and ensure complete erasure for writing new data. The distance between the erase elements and the read/write head is about 200 µm. This distance is important from the floppy disk controller point of view and will be discussed in later sections.

Figure 1. Perpendicular vs Longitudinal Recording (panels: perpendicular recording, longitudinal recording; labels: magnetic layer, magnetization, substrate)

PERPENDICULAR DRIVE FORMAT AND SPECIFICATION

Figures 2a and 2b show the IBM drive format for both double density and perpendicular modes of recording. The main difference in recording format is the length of Gap2 between the ID field and the Data field. The main reason for the increased Gap2 length is the pre-erase head preceding the read/write head on the newer 4 MB floppy drives. The size of the data field is maintained at the standard 512 bytes. The increase in capacity is implemented by increasing the number of sectors from 18 to 36. Table 1 shows the specifications of the various capacity drives.

Track layout shown in Figures 2a and 2b (one sector, starting at the index pulse):

  Field        Bytes (decimal)                Data (hex)
  GAP4a        80                             4E
  SYNC         12                             00
  IAM          3 + 1                          C2, FC
  GAP1         50                             4E
  SYNC         12                             00
  IDAM         3 + 1                          A1, FE
  C, H, R, N   (ID field)
  CRC
  GAP2         22 (conventional) or           4E
               41 (perpendicular)
  SYNC         12                             00
  DATA AM      3 + 1                          A1, FB/F8
  DATA         512
  CRC
  GAP3
  GAP4b

Figure 2a. Conventional IBM 1 MB and 2 MB Format (MFM)
Figure 2b. Perpendicular 4 MB Format (MFM)

Table 1. Specifications of FDDs
Various Parameters Used in the Different Kinds of FDDs

                                5.25"           5.25"    3.5"     3.5"     3.5"
                                360 KB          1.2 MB   720 KB   1.44 MB  2.88 MB
  Number of Cylinders           40              80       80       80       80
  Sectors/Track                 9               15       9        18       36
  Formatted Capacity            354 KB          1.2 MB   720 KB   1.44 MB  2.88 MB
  Unformatted Capacity          360 KB          1.6 MB   1 MB     2 MB     4 MB
  Rotation Speed (rpm)          300 XT/360 AT   360      300      300      300
  Track Density (tpi)           48              96       135      135      135
  Recording Density (bpi)       5876            9870     8717     17432    34868
  Data Transfer Rate (Mbps)     0.25 XT/0.30 AT 0.5      0.25     0.5      1
  Gap Length for Read/Write     42              42       27       27       56
  Gap Length for Format         80              80       84       84       83
  Sector Size                   512 bytes       512 bytes 512 bytes 512 bytes 512 bytes
  Density Notation              DD/DS           HD/DS    DD/DS    HD/DS    ED/DS

PERPENDICULAR MODE COMMAND

The current 82077AA/SL parts contain the "enhanced" perpendicular mode command as shown in Figure 3. This is a two byte command with the first byte being the command code (0x12). The second byte contains the parameters required to enable perpendicular mode recording. The former command (in the older 82077 parts) included only the WGATE and GAP bits. The enhanced command is compatible with the older mode, where only the two LSBs are written. The enhanced mode allows system designers to designate specific drives as perpendicular recording drives. The second byte will be referenced as the PR[0:7] byte for ease of discussion. The following discusses the use of the enhanced perpendicular recording mode.

  PERPENDICULAR MODE COMMAND
  Phase     R/W   D7   D6   D5   D4   D3   D2   D1    D0      Remarks
  Command   W     0    0    0    1    0    0    1     0       Command Code
            W     OW   0    D3   D2   D1   D0   GAP   WGATE

Figure 3. Perpendicular Mode Command

The following describes the various functions of the programmed bits in the PR:

OW      If this bit is not set high, all PR[2:5] are ignored. In other words, if OW = 0, only GAP and WGATE are considered. In order to select a drive as perpendicular, it is necessary to set OW = 1 and select the Dn bit.

Dn      This refers to the drive specification bits and corresponds to PR[2:5]. These bits are considered only if OW = 1. During the READ/WRITE/FORMAT command, the drive selected in these commands is compared to Dn. If the bits match then perpendicular mode will be enabled for that drive. For example, if D0 is set then drive 0 will be configured for perpendicular mode.

GAP     This alters the Gap2 length as required by the perpendicular mode format.

WGATE   Write gate alters timing of WE to allow for pre-erase loads in perpendicular drives.

The VCOEN timing and the length of the Gap2 field (explained above) can be altered to accommodate the unique requirements of the 4 MB floppy drives by the GAP and WGATE bits of the PR. Table 2 describes the effects of the GAP and WGATE bits for the perpendicular command.

82077AA/SL'S PERPENDICULAR MODE SUPPORT

The 82077AA and 82077SL both support the 4 MB recording mode. The 82077SL has power management features included as well.
Both the AA and SL product lines have three versions each, two of which support the 4 MB floppy drives. The 82077AA-1, 82077AA, 82077SL, and 82077SL-1 all support the 4 MB floppy drives. A single command puts the 82077AA/SL into the perpendicular mode. This mode also requires the data rate to be set at 1 Mbps. The FIFO that is unique to Intel's 82077AA/SL parts may become necessary to remove the host interface bottleneck due to the higher data rate. The 4 MB floppy disk drives are downward compatible with 1 MB and 2 MB floppy diskettes. The following discussion explains the implications of the new 4 MB combination head and the functionality of the perpendicular mode command.

Table 2. Effects of GAP and WGATE Bits

  GAP  WGATE  Mode                        VCO Low Time   Length of      Portion of Gap2    Gap2 VCO Low
                                          after Index    Gap2 Format    Written by Write   Time for Read
                                          Pulse          Field          Data Operation     Operations
  0    0      Conventional                33 Bytes       22 Bytes       0 Bytes            24 Bytes
  0    1      Perpendicular               33 Bytes       22 Bytes       19 Bytes           24 Bytes
              (Data Rate = 500 Kbps)
  1    0      Conventional                33 Bytes       22 Bytes       0 Bytes            24 Bytes
  1    1      Perpendicular               33 Bytes       41 Bytes       38 Bytes           43 Bytes

The implementation of 4 MB drives requires understanding the Gap2 (see Figures 2a and 2b) and VCO timing requirements unique to these drives. These new requirements are dictated by the design of the "combination head" in these drives.

Rewriting of disks in the 4 MB drives requires a pre-erase gap to erase the magnetic flux on the disk preceding the writing by the read/write gap. The read/write gap in the 4 MB drive does not have sufficient penetration (as shown in Figure 4a) to overwrite the existing data. In the conventional drives, the read/write gap had sufficient depth and could effectively overwrite the older data as depicted in Figure 4b. It must be noted that it is necessary to write the conventional 2 MB media in the 4 MB drive at 500 Kbps perpendicular mode.
This ensures proper erasure of existing data and reliable write of the new data.

The pre-erase gap in the 4 MB floppy drives is activated only during format and write commands. Whenever the read/write gap is enabled by the Write Gate signal, the pre-erase gap is activated at the same time. As shown in Figure 4a, the pre-erase gap precedes the read/write gap by 200 µm. This distance translated to bytes is about 38 bytes at a data rate of 1 Mbps and 19 bytes at 500 Kbps.

Figure 4a. Head Design for the 4 MB Perpendicular Mode (combination head: the pre-erase gap leads the RD/WR gap by 200 µm = 38 bytes @ 1 Mbps)

Figure 4b. Head Design for the Conventional 2 MB Mode (conventional head with a single RD/WR gap; GAP2 = 22 bytes of 4E between the ID field and the data field)

In conventional drives, the Write Gate is asserted at the beginning of the sync field, i.e., when the read/write gap is at the beginning of the data field. The controller then writes the new sync field, data address mark, data field and CRC (see Figure 2a). With the combination head, the read/write gap must be activated in the Gap2 field to ensure proper write of the new sync field. To accommodate both the distance between the pre-erase gap and read/write gap and the head activation and deactivation time, the Gap2 field is expanded to a length of 41 bytes at 1 Mbps (see Figure 2b). Since the bit density is proportional to the data rate, 19 bytes will be written in the Gap2 field at the 500 Kbps data rate in the perpendicular mode.

On the read back by the 82077AA/SL, the controller must begin the synchronization at the beginning of the sync field. For conventional mode, the internal PLL VCO is enabled (VCOEN) approximately 24 bytes from the start of the Gap2 field.
However, at 1 Mbps perpendicular mode the VCOEN goes active after 43 bytes to accommodate the increased Gap2 field size. For each case, a 2 byte cushion is maintained from the beginning of the sync field to avoid write splices caused by motor speed variation.

It should be noted that none of the alterations in Gap2 size, VCO timing or Write Gate timing affect the normal program flow. Once the perpendicular command is invoked, 82077AA/SL behaviour from the user standpoint is unchanged.

PROGRAMMING PERPENDICULAR MODE

Figures 5a and 5b show a flowchart of how the perpendicular recording mode is implemented on the 82077AA/SL. The perpendicular mode command can be issued during initialization. As shown in Figure 5a, the perpendicular command stores the PR value internally. This value is used during the data transfer commands for configuration in order to deal with the perpendicular drives. Table 2 shows how the Gap2 length, VCOEN timing or Write Gate timing is affected. The OW bit is also tested for in this part of the loop. The enhanced perpendicular mode is enabled by setting OW = 1, setting the Dn bits corresponding to the installed perpendicular drive high and leaving PR[0:1] = '00'.

Figure 5a. Perpendicular Command Handling
Figure 5b. During Data Transfer Commands

As shown in Figure 5b, the Gap2 length is initially set to the conventional length of 22 bytes. Next, the PR[0:1] bits (GAP, WGATE) are checked to see whether they are set to '00'. If the PR[0:1] bits are set to '10' then perpendicular mode is disabled and conventional mode is retained. If PR[0:1] = '01' or '11', the VCOEN is set to activate 43 bytes or 24 bytes from the start of the Gap2 field, depending on the value as shown in Table 2. After this, PR[0:1] = '11' is checked; if not true (programmed '01') the program is exited with only the VCOEN timing being set for perpendicular mode.
If true, however, the Gap2 length is set up for perpendicular mode (note: this is done independent of the data rate). It must be noted that if the PR[0:1] bits are set to '11' then it is up to the user to disable precompensation before accessing perpendicular drives.

The other branch of the flowchart refers to setting of PR[0:1] to '00'. In this case, the perpendicular command will have the following effect:

1. If any of the Dn bits in PR[2:5] are programmed high, then precompensation is automatically disabled (0 ns is selected for the specified drive regardless of the data rate) and VCOEN is set to activate appropriately. All the bits that are set low will enable the 82077 to be configured for conventional mode, i.e., exit the program without modifications (shown in Figure 5b).
2. Next the data rate is checked for 1 Mbps. If the data rate is at 1 Mbps, then the Gap2 length is set to 41 bytes; otherwise, the program is exited without setting up the Gap2 to 41 bytes.

It must be noted that if PR[2:5] are to be recognized in the command, the OW bit must be set high. If this bit is low, setting of Dn bits will have no effect. Setting the OW bit will enable the storage of the Dn bits. Also, setting PR[0:1] to any value other than '00' will override anything written in the Dn bits. In other words, setting PR[0:1] to a value other than '00' enables the effect of that value for all drives. It must be noted that if the PR[0:1] bits are set to a value other than '00' then it is recommended not to use the enhanced command mode, i.e., all other bits should be zero.

Consider the following examples:

a. PR[0:7] = 0x84; This is the way to use the command in the enhanced mode. In this case, OW = 1 and D0 is set high. During the data transfer command, if drive 0 is selected it will be automatically configured for perpendicular mode. If drive 1 is accessed, however, it will be configured for conventional mode.
Similarly, if PR[0:7] = 0x88 then drive 1 is configured for perpendicular mode and drive 0 is configured for conventional mode. Software resets do not clear this mode.

b. PR[0:7] = 0x03; This is the way to use the command in the old mode. If the user decides to use this mode, then it must be noted that the command has to be issued before every data transfer command. Also, when used this way, all the drives are configured for perpendicular mode. The user must also remember to disable precompensation and set the data rate to 1 Mbps while accessing the perpendicular drive in the system. Any software reset clears the command.

c. PR[0:7] = 0x87; In this case, OW = 1, D0 = 1 and PR[0:1] = '11'. This may be called a mixed mode and should be avoided. This is similar to setting PR[0:7] = 0x03, because setting PR[0:1] high overrides automatic configuration. In this case the user has to be aware that precompensation must be disabled and the data rate must be set to 1 Mbps while accessing drive 0. After software reset, bits GAP and WGATE will be cleared, but OW and D0 will retain their previously set values. In other words, after software reset, the part will see PR[0:7] = 0x84. Evidently, this would cause problems and, therefore, it is recommended this mode not be used.

d. PR[0:7] = 0x80; In this case, OW = 1, Dn = 0 and PR[0:1] = '00'. This has the effect of clearing the perpendicular mode command without doing a hardware reset. Another way to do this would be to set PR[0:7] = 0x02; this can then be used to temporarily disable perpendicular mode configuration without affecting the previously programmed Dn values. A software reset following this will re-enable the previously programmed enhanced mode command.

Using the enhanced perpendicular command removes the requirement of issuing the perpendicular command for each data transfer command and manually setting the perpendicular configuration.
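The examples above can be checked with a small software model. The following Python sketch is illustrative only: the names `encode_pr` and `PerpendicularState` are hypothetical and not part of any Intel software, and the model covers only the behaviour stated in this note (bit positions from Figure 3, Gap2 lengths from Table 2, and the reset behaviour described in examples a through d).

```python
from dataclasses import dataclass

# Bit positions of the PR byte as laid out in Figure 3.
OW = 0x80        # D7: enables storage of the Dn bits
DN_SHIFT = 2     # D2..D5 hold D0..D3
DN_MASK = 0x0F
GAP = 0x02       # D1
WGATE = 0x01     # D0


def encode_pr(ow=False, drives=(), gap=False, wgate=False):
    """Build the second byte of the perpendicular mode command."""
    value = OW if ow else 0
    for drive in drives:
        if not 0 <= drive <= 3:
            raise ValueError("drive number must be 0..3")
        value |= 1 << (DN_SHIFT + drive)
    if gap:
        value |= GAP
    if wgate:
        value |= WGATE
    return value


@dataclass
class PerpendicularState:
    """Hypothetical model of the internally stored PR value."""
    dn: int = 0
    gap: bool = False
    wgate: bool = False

    def write(self, pr):
        # Dn bits are stored only when OW is set; GAP/WGATE are always taken.
        if pr & OW:
            self.dn = (pr >> DN_SHIFT) & DN_MASK
        self.gap = bool(pr & GAP)
        self.wgate = bool(pr & WGATE)

    def software_reset(self):
        # Clears only GAP and WGATE; Dn bits keep their values.
        self.gap = False
        self.wgate = False

    def hardware_reset(self):
        self.dn = 0
        self.gap = False
        self.wgate = False

    def gap2_bytes(self, drive, rate_kbps):
        """Gap2 format length for a transfer to `drive` at `rate_kbps`."""
        if self.gap and self.wgate:
            return 41            # '11': perpendicular for all drives
        if self.gap or self.wgate:
            return 22            # '10' or '01': Gap2 stays at 22 (Table 2)
        selected = bool((self.dn >> drive) & 1)
        return 41 if selected and rate_kbps == 1000 else 22
```

In this model, writing 0x84 gives a 41 byte Gap2 only for drive 0 at 1 Mbps; writing 0x02 afterwards returns every drive to 22 bytes until a software reset clears GAP again, which mirrors example d.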
"Software" RESETs (via DOR or DSR registers) will only clear the PR[0:1] values to '0'. Dn bits will retain their previously programmed values. "Hardware" RESETs will clear all the programmed bits including OW and Dn bits to '0'. The status of these bits can be determined by issuing the DUMPREG command and checking the eighth result byte. This byte will contain the programmed values of the Dn and PR[0:1] bits as shown in Figure 6. The OW bit is not returned in this result byte.

  DUMPREG COMMAND
  R/W   D7     D6   D5   D4   D3   D2   D1    D0      Remarks
  R     LOCK   0    D3   D2   D1   D0   GAP   WGATE   Eighth Result Byte

Figure 6. Dumpreg Command

INTERFACE BETWEEN 82077AA/SL AND THE DRIVE

[Figure: FDC-FDD Interface]

There is currently no industry-wide standard for the FDC to FDD interface. There are numerous floppy drive vendors, each with their own modes and interface pins to enable 4 MB perpendicular mode. The drive interface not only varies from manufacturer to manufacturer but also within a manufacturer's product line. The differences on the interface mainly originate from configuring the floppy drive into the 4 MB mode. Depending on the drive, the differences can create problems when daisy-chaining a 4 MB drive with the standard 1 MB and 2 MB drives. Of course, for laptops this is not a problem since most of them use a single floppy drive. Lack of an industry standard makes it necessary to look at each drive and build an interface for that particular drive.

The following is a brief discussion of some of the floppy drives available in the market and how these can be interfaced with the 82077AA/SL. It is important to note that although a manufacturer's name may be given in connection with the interface described, Intel does not guarantee that the interface discussed will apply to all the drives from that manufacturer. The main goal is to introduce the reader to interfacing the 82077AA/SL with a 4 MB floppy drive.
Previously, for the conventional 1 MB and 2 MB AT mode drives, a single Density Select input was used by floppy drives to select between high density and low density operation. A high on this input enabled high density operation (500 Kbps) whereas a low enabled low density operation (300 Kbps/250 Kbps). This signal was asserted high or low by the floppy disk controller depending on the data rate programmed.

For the 4 MB operation, there are two inputs defined by the floppy drive manufacturers. The polarity of these inputs enables the selected density operation. Implementing this requires at least 1 new pin to be defined on the FDC-FDD interface. Most floppy vendors have elected to take pin 2 (originally density select) and redefine the polarity to conform to one of these new density select inputs, and to use another pin as the other density select input. However, the new density select on pin 2 is not compatible with the old density select input in many of the floppy drives. This precludes the user from daisy chaining 4 MB drives with conventional drives. Another problem is that the location of the second density select pin on the FDC-FDD interface varies from drive to drive.

The BIOS determines what type of diskette is in what type of drive by trial and error. The system tries to read the diskette at 250 Kbps; if it fails then it will set the data rate to a higher value and retry. The BIOS does this until the right data rate is selected. This method will still be implemented for the 4 MB drives by some BIOS vendors. However, the 4 MB drives available today also have two media sense ID pins that indicate what type of media is present in the floppy drive. This information will also require two pins on the FDC-FDD interface. The location of these pins is once again variable from drive to drive.

Some manufacturers have circumvented the entire standardization problem by including an auto configuration in the drive.
In these cases, the type of floppy put into the drive is sensed by the hole on the diskette (each 4/2/1 MB diskette has a hole in a different location identifying it). Then the drive automatically sets itself up for this mode. The BIOS must still set up the floppy disk controller for the correct data rate, which could be done if the media sense ID were read and decoded into the data rate.

Due to the lack of extra pins on the even side of the floppy connector, the newer locations of some of the functions are migrating to the odd pins (previously all grounded). Some drive manufacturers have even made this configurable via jumpers. For instance, the new TEAC drives have a huge potpourri of configurations that would satisfy the appetite of some of the most finicky system interfaces.

The 82077AA/SL currently has two output pins, DRATE0 and DRATE1 (pins 28 and 29 respectively), which directly reflect the data rate programmed in the DSR and CCR registers. These two pins can be used to select the correct density on the drive. They can also be used in combination with DENSEL to select the correct data rate. At the present time the 82077AA/SL does not support media sense ID. However, the user could easily make it readable directly by the BIOS. The following is a discussion of what combination of DRATE0, DRATE1, and DENSEL could be used to interface to some of the currently available floppy drives.

1. TEAC 235J-600/Toshiba PD-211/Sony (Old Version)

These were among the first 4 MB drives available in the market. Each of them has a mode select input on pins 2 and 6. The polarity required for each different data rate is as shown below:

  Data Rate         Capacity   DRATE1   DRATE0   MODSEL0 (pin 2)   MODSEL1 (pin 6)
  1 Mbps            4 MB       1        1        1                 0
  500 Kbps          2 MB       0        0        0                 1
  300 Kbps/1 Mbps   4 MB       0        1        1                 1
  250 Kbps          1 MB       1        0        0                 0

It is clear from the above that DRATE0 = MODSEL0 and MODSEL1 = DRATE1#. This would mean taking the DRATE signals onto pins 2 and 6 of the FDC-FDD interface. Unfortunately this solution requires an inverting gate.

TEAC has recently, however, come out with a new version called the TEAC 235J-3653. This drive can be put into a number of possible configurations; however, only the best way to interface to the 82077AA/SL will be discussed. The requirements are as shown below.

  Data Rate         Capacity   DENSEL   DRATE1   DRATE0   HDIN (pin 2)   EDIN (pin 6)
  1 Mbps            4 MB       1        1        1        X              1
  500 Kbps          2 MB       1        0        0        1              0
  300 Kbps/1 Mbps   4 MB       0        0        1        X              1
  250 Kbps          1 MB       0        1        0        0              0

This shows that HDIN = DENSEL (the original signal for conventional drives) and EDIN = DRATE0. As suggested in the TEAC spec for method 1, the straps connected are MSC, HI2 (sets HDIN on pin 2), DC34 and EI6 (sets EDIN on pin 6). Pins 4, 29, and 33 are left open. Since pin 2 has the same polarity as the conventional drive requirement and the secondary input is connected via pin 6 (no connect on the conventional drives), daisy chaining this TEAC drive with a conventional drive does not cause any incompatibility. Figure 7 shows how the TEAC can be connected to the 82077AA/SL. It also shows daisy chaining of the TEAC drive with a conventional drive.

Figure 7. Interfacing 82077AA/SL to TEAC 235J-3653 (82077AA/SL DENSEL, pin 49, to pin 2 HDIN of the TEAC drive and pin 2 DENSEL of the conventional drive; DRATE0, pin 28, to pin 6 EDIN; rest of the floppy signals shared)

2. Panasonic JU-259A (New Version)

This is Panasonic's new drive and has the HDIN signal on pin 2 and the EDIN signal on pin 6. The requirements are shown below. This type of interface allows for daisy chaining the Panasonic drive with a conventional drive. The DENSEL signal can be connected to pin 2 and DRATE0 should be connected to pin 6.

  Data Rate         Capacity   DENSEL   DRATE1   DRATE0   HDIN (pin 2)   EDIN (pin 6)
  1 Mbps            4 MB       1        1        1        1              1
  500 Kbps          2 MB       1        0        0        1              0
  300 Kbps/1 Mbps   4 MB       0        0        1        0              1
  250 Kbps          1 MB       0        1        0        0              0

3. Mitsubishi MF356C (Models 252UG/788UG)

There are two models of this drive.
The 252UG has DENSEL1 on pin 2 and DENSEL0 on pin 33, whereas the 788UG has DENSEL0 located on pin 2 and DENSEL1 located on pin 6. Via jumpers, it is possible to configure the drives to a different polarity for the density select line. The following table shows the configuration for the 252UG in which the jumper setting is 2MS = I/F and 4MS = I/F.

  Data Rate         Capacity   DENSEL   DRATE1   DRATE0   DENSEL1 (pin 2)   DENSEL0 (pin 33)
  1 Mbps            4 MB       1        1        1        1                 1
  500 Kbps          2 MB       1        0        0        1                 0
  300 Kbps/1 Mbps   4 MB       0        0        1        0                 1
  250 Kbps          1 MB       0        1        0        0                 0

The correct connection requirement is: DENSEL (from the 82077AA/SL) = DENSEL1 and DRATE0 = DENSEL0. Although there are other configurations, this is the best one, since daisy chaining is possible without any problem.

4. Epson SMD-1060

This drive has 3 different modes of operation. Mode B is the best and is similar to Mitsubishi's drives as described above. In this mode, the HDI signal is connected to pin 2 and EDI is connected to pin 33. For the drive with the power separated type of 34-pin connector (i.e., a connector for the floppy signals and another one for the power supply), Mode B is enabled by inserting jumpers across 3-4 and 7-8 (SS01 B block) and 1-2 and 3-4 (SS03 block).

  Data Rate         Capacity   DENSEL   DRATE1   DRATE0   HDI (pin 2)   EDI (pin 33)
  1 Mbps            4 MB       1        1        1        1             1
  500 Kbps          2 MB       1        0        0        1             0
  300 Kbps/1 Mbps   4 MB       0        0        1        0             1
  250 Kbps          1 MB       0        1        0        0             0

As demonstrated by the table, HDI = DENSEL and EDI = DRATE0. These connections would ensure daisy chaining capability without any problems.

5. Sony MP-F40W-14/15

The dash 14 and 15 are two drives from Sony that handle 4 MB requirements. The MP-F40W-14 has DENSITY SELECT 1 and DENSITY SELECT 0 on pins 2 and 33 respectively, whereas the MP-F40W-15 has DENSITY SELECT 1 and DENSITY SELECT 0 on pins 2 and 6 respectively. As is obvious from the table below, daisy chaining is easily done if the 82077AA/SL is connected in the PS/2 mode (by tying IDENT low) with either type of drive, the
only difference being the location of DENSITY SELECT 0.

  Data Rate         Capacity   DENSEL, PS/2 mode   DRATE1   DRATE0   DENSITY SELECT 1   DENSITY SELECT 0
                               (IDENT = 0)                           (pin 2)            (pin 6/33)
  1 Mbps            4 MB       0                   1        1        0                  1
  500 Kbps          2 MB       0                   0        0        0                  0
  300 Kbps/1 Mbps   4 MB       1                   0        1        1                  1
  250 Kbps          1 MB       1                   1        0        1                  0

If the drive is used in the PS/2 mode, then DENSITY SELECT 1 = DENSEL and DENSITY SELECT 0 = DRATE0. To use the drive in AT mode, DENSITY SELECT 1 = DRATE1 and DENSITY SELECT 0 = DRATE0, as shown below. However, daisy chaining is not possible.

  Data Rate         Capacity   DENSEL, PS/2 mode   DRATE1   DRATE0   DENSITY SELECT 1   DENSITY SELECT 0
                               (IDENT = 0)                           (pin 2)            (pin 6/33)
  1 Mbps            4 MB       0                   1        1        1                  1
  500 Kbps          2 MB       0                   0        0        0                  0
  300 Kbps/1 Mbps   4 MB       1                   0        1        0                  1
  250 Kbps          1 MB       1                   1        0        1                  0

6. Toshiba ND3571

This Toshiba drive has the HD mode selection on pin 6 and the ED mode selection on pin 2. This causes daisy chaining problems with conventional drives, as shown in the table below:

  Data Rate         Capacity   DENSEL   DRATE1   DRATE0   ED Mode (pin 2)   HD Mode (pin 6)
  1 Mbps            4 MB       1        1        1        1                 1
  500 Kbps          2 MB       1        0        0        0                 1
  300 Kbps/1 Mbps   4 MB       0        0        1        1                 0
  250 Kbps          1 MB       0        1        0        0                 0

The DENSEL from the 82077 is connected to pin 6 and DRATE0 is connected to pin 2.

82077SL 4 MB DESIGN

This section presents a design application of a PC/AT compatible floppy disk controller. The 82077SL integrates the entire PC/AT controller design, with the exception of the address decode, on a single chip. The schematic for this solution is shown in Figure 8. The chip select for the 82077SL is generated by an 85C220 µPLD that is programmed to decode addresses 03F0H through 03F7H when AEN is low. The programming equations for the µPLD are in Intel's .ADF format and can be processed using the iPLS II compiler (available from Intel).

A floppy disk interface is provided by on-chip output buffers with a 40 mA sink capability. The outputs from the disk drive are terminated at the floppy disk controller with a 1 kΩ resistor pack.
The 82077SL disk interface inputs contain a Schmitt trigger input structure for higher noise immunity. The host interface is a similar direct connection with on-chip 12 mA sink capable buffers on DB0-7, INT and DRQ.

The schematic shows eleven jumpers numbered J1 through J11. The table below describes the functions of these jumpers as well as their normal connections. The normal connections allow the BIOS to work without modification. In the normal mode, the 82077SL responds to DRQ2 and DACK2# as well as IRQ6. Depending on the type of drive interfaced to this board, the DENOUT0 and DENOUT1 signals can be tied. With J8 and J9 set to 2-3, the default setting is DENSEL on DRVDEN0 and DRATE0 on DRVDEN1. PIN6/33 SELECT (J10) selects pin 6 or pin 33 as the EDIN input. J11 should always be closed; it can be used to measure the current consumption of the 82077SL. J7 selects between the primary and secondary address spaces.

There are two resistor packs used for pull-ups on input signals from the floppy drive interface. These resistors are rated at 1 kΩ. Please note that on some older 5.25" drives the pull-up is 150 Ω. Most modern 5.25" drives use a 1 kΩ value. To ensure the correct value, please refer to the floppy drive specification manual.

For further information, please contact your local Intel sales office.
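The address decode and the default density-pin routing described in this section can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions, not the board's actual logic: the function names are hypothetical, the decode is assumed to use only address bits SA9 through SA3 (as the PLD equations later in this note suggest), `addsel = 1` is assumed to select the primary 3F0H range, and the pin states are copied from the AT-mode drive tables earlier in the note.

```python
# (DENSEL, DRATE1, DRATE0) for each data rate in Kbps, taken from the
# AT-mode interface tables (e.g. the Panasonic JU-259A table).
PIN_STATES = {
    1000: (1, 1, 1),
    500: (1, 0, 0),
    300: (0, 0, 1),
    250: (0, 1, 0),
}


def chip_select(address, aen, addsel):
    """Return True when the 82077SL chip select would be asserted.

    address: I/O address on the bus
    aen:     1 during DMA cycles, 0 for processor I/O cycles
    addsel:  assumed 1 for the primary (3F0H) range, 0 for secondary (370H)
    """
    if aen:
        return False
    base = 0x3F0 if addsel else 0x370
    # Only bits 9..3 take part in the decode, giving an 8-byte window.
    return (address & 0x3F8) == base


def default_drive_density(rate_kbps):
    """Default J8/J9 = 2-3 routing: DENSEL on DRVDEN0, DRATE0 on DRVDEN1."""
    densel, _drate1, drate0 = PIN_STATES[rate_kbps]
    return {"DRVDEN0": densel, "DRVDEN1": drate0}
```

Because only ten address bits are compared in this model, addresses above 3FFH that share the same low bits would also match; a design that needs full decode would have to include the upper address lines.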
  Jumper   Description                                                                  Normal Connection
  J1       DRQ1: DMA request 1 used with DACK1# to allow for DMA transfers              Open
  J2       DRQ2: DMA request 2 used with DACK2# to allow for DMA transfers              Closed
  J3       DACK1: DMA acknowledge 1 used with DRQ1 to allow for DMA transfers           Open
  J4       DACK2: DMA acknowledge 2 used with DRQ2 to allow for DMA transfers           Closed
  J5       IRQ5: Interrupt line 5 used to generate floppy interrupts                    Open
  J6       IRQ6: Interrupt line 6 used to generate floppy interrupts                    Closed
  J7       DRV2: Address selection (between 3FX and 37X address ranges)                 Open
  J8       DENOUT0: Used with DENOUT1 to select the values of DRVDEN1,0                 2-3
  J9       DENOUT1: Used with DENOUT0 to select the values of DRVDEN1,0                 2-3
  J10      PIN6/33 SELECT: Used to select between pin 6 and pin 33 for EDIN input       1-2 or 2-3
  J11      VSS/VCC: Connection between two power layers                                 Closed

[Figure 8: schematic, 292093-11 (85C220 address decoder, N82077SL, AT bus connector, floppy drive connector, 24 MHz crystal, resistor packs RP1 and RP2)]
[Figure: 82077SL evaluation board schematic (292093-11), showing the 85C220 PLD, the N82077SL (U2), the 24 MHz crystal (Y1), the AT bus connector, resistor packs RP1 and RP2, and the jumpers]

Designer: K. Shah
Company: Intel Corp.
Dept: IMD Marketing
Date: April '92
Rev.#:
% The µPLD used in the 82077SL Evaluation board design, Rev.#1.0. %
85C220 dip package
OPTIONS: TURBO = ON
PART: 85C220
INPUTS:
    SA9@2,       % System Address Inputs %
    SA8@3,
    SA7@4,
    SA6@5,
    SA5@6,
    SA4@7,
    SA3@8,
    AEN@9,
    DENOUT0@1,   % Maps the DRVDEN0 and DRVDEN1 to appropriate polarity table %
    DENOUT1@18,  % Maps the DRVDEN0 and DRVDEN1 to appropriate polarity table %
    ADDSEL@11,   % Selects between primary and secondary address spaces %
    DRATE0@12,   % DRATE0 signal from the 82077SL %
    DRATE1@13,   % DRATE1 signal from the 82077SL %
    DENSEL@14    % DENSEL signal from the 82077SL %
OUTPUTS:
    CS_@15,      % 82077SL chip select signal %
    DRVDEN1@16,  % Drive density signal connected to EDIN of the drive %
    DRVDEN0@17   % Drive density signal connected to HDIN of the drive %
NETWORK:
% Inputs %
SA9 = INP(SA9)
SA8 = INP(SA8)
SA7 = INP(SA7)
SA6 = INP(SA6)
SA5 = INP(SA5)
SA4 = INP(SA4)
SA3 = INP(SA3)
AEN = INP(AEN)
ADDSEL = INP(ADDSEL)
DRATE0 = INP(DRATE0)
DRATE1 = INP(DRATE1)
DENSEL = INP(DENSEL)
DENOUT0 = INP(DENOUT0)
DENOUT1 = INP(DENOUT1)
% Outputs %
CS_ = CONF(CSeq, VCC)
DRVDEN0 = CONF(DEN0eq, VCC)
DRVDEN1 = CONF(DEN1eq, VCC)
EQUATIONS:
% CS_ is activated for 3F0-3F7 and 370-377 address spaces %
CSeq = (AEN' * SA9 * SA8 * SA7' * SA6 * SA5 * SA4 * SA3' * ADDSEL'
      + AEN' * SA9 * SA8 * SA7 * SA6 * SA5 * SA4 * SA3' * ADDSEL)';
% These are the signals generated on DRVDEN0 and DRVDEN1 for the FDC-FDD interface
    DENOUT1  DENOUT0  DRVDEN0  DRVDEN1
    0        0        DENSEL   DRATE0
    0        1        DENSEL'  DRATE0
    1        0        DRATE1   DRATE0
    1        1        DRATE0   DRATE1
%
DEN0eq = DENSEL  * (DENOUT0' * DENOUT1')
       + DENSEL' * (DENOUT0  * DENOUT1')
       + DRATE1  * (DENOUT0' * DENOUT1)
       + DRATE0  * (DENOUT0  * DENOUT1);
DEN1eq = DRATE1 * (DENOUT0 * DENOUT1)
       + DRATE0 * (DENOUT0' + DENOUT1');
END$

82077SL Application Note Revision Summary

The following changes have been made since revision 001:

Table 2: kBps was corrected to kbps.
Page 4-323: Item 3, Mitsubishi MF356C description modified to read: "There are two models of this drive. The 252UG has DENSEL1 on pin 2 and DENSEL0 on pin 33, whereas the 788UG has DENSEL0 located on pin 2 and DENSEL1 located on pin 6. Via jumpers, it is possible to configure the drives to different polarity for the density select lines. The following table shows the configuration for the 252UG in which jumper setting is 2 MS = I/F and 4 MS = I/F."
Figure 8: Arrow added to diagram.
Page 4-328: Columns corrected to line up properly.

5 Flash Memory Components

28F001BX-T/28F001BX-B
1M (128K x 8) CMOS FLASH MEMORY

• High Integration Blocked Architecture
  - One 8 KB Boot Block w/Lock Out
  - Two 4 KB Parameter Blocks
  - One 112 KB Main Block
• 100,000 Erase/Program Cycles Per Block
• Simplified Program and Erase
  - Automated Algorithms via On-Chip Write State Machine (WSM)
• SRAM-Compatible Write Interface
• Deep-Powerdown Mode
  - 0.05 µA ICC Typical
  - 0.8 µA IPP Typical
• 12.0V ±5% VPP
• High-Performance Read
  - 70/75 ns, 90 ns, 120 ns, 150 ns Maximum Access Time
  - 5.0V ±10% VCC
• Hardware Data Protection Feature
  - Erase/Write Lockout during Power Transitions
• Advanced Packaging, JEDEC Pinouts
  - 32-Pin PDIP
  - 32-Lead PLCC, TSOP
• ETOX II Nonvolatile Flash Technology
  - EPROM-Compatible Process Base
  - High-Volume Manufacturing Experience
• Extended Temperature Options

Intel's 28F001BX-B and 28F001BX-T combine the cost-effectiveness of Intel standard flash memory with features that simplify write and allow block erase.
These devices aid the system designer by combining the functions of several components into one, making boot block flash an innovative alternative to EPROM and EEPROM or battery-backed static RAM. Many new and existing designs can take advantage of the 28F001BX's integration of blocked architecture, automated electrical reprogramming, and standard processor interface.

The 28F001BX-B and 28F001BX-T are 1,048,576-bit nonvolatile memories organized as 131,072 bytes of 8 bits. They are offered in 32-pin plastic DIP, 32-lead PLCC and 32-lead TSOP packages. Pin assignments conform to JEDEC standards for byte-wide EPROMs. These devices use an integrated command port and state machine for simplified block erasure and byte reprogramming.

The 28F001BX-T's block locations provide compatibility with microprocessors and microcontrollers that boot from high memory, such as Intel's MCS-186 family, 80286, i386™, i486™, i860™ and 80960CA. With exactly the same memory segmentation, the 28F001BX-B memory map is tailored for microprocessors and microcontrollers that boot from low memory, such as Intel's MCS-51, MCS-196, 80960KX and 80960SX families. All other features are identical, and unless otherwise noted, the term 28F001BX can refer to either device throughout the remainder of this document.

The boot block section includes a reprogramming write lockout feature to guarantee data integrity. It is designed to contain secure code which will bring up the system minimally and download code to the other locations of the 28F001BX.

Intel's 28F001BX employs advanced CMOS circuitry for systems requiring high-performance access speeds, low power consumption, and immunity to noise. Its access time provides no-WAIT-state performance for a wide range of microprocessors and microcontrollers. A deep-powerdown mode lowers power consumption to 0.25 µW typical through VCC, crucial in laptop computer, handheld instrumentation and other low-power applications.
The RP# power control input also provides absolute data protection during system powerup or power loss.

Manufactured on Intel's ETOX process base, the 28F001BX builds on years of EPROM experience to yield the highest levels of quality, reliability, and cost-effectiveness.

The complete document for this product can be ordered by calling 1-800-548-4725. It is also available on Intel's "Data-on-Demand" CD-ROM product; contact your local Intel field sales office or Intel technical distributor.

November 1994 Order Number: 290406-006

28F200BX-T/B, 28F002BX-T/B
2-MBIT (128K x 16, 256K x 8) BOOT BLOCK FLASH MEMORY FAMILY

• x8/x16 Input/Output Architecture
  - 28F200BX-T, 28F200BX-B
  - For High Performance and High Integration 16-bit and 32-bit CPUs
• x8-only Input/Output Architecture
  - 28F002BX-T, 28F002BX-B
  - For Space Constrained 8-bit Applications
• Upgradeable to Intel's SmartVoltage Products
• Optimized High Density Blocked Architecture
  - One 16-KB Protected Boot Block
  - Two 8-KB Parameter Blocks
  - One 96-KB Main Block
  - One 128-KB Main Block
  - Top or Bottom Boot Locations
• Extended Cycling Capability
  - 100,000 Block Erase Cycles
• Automated Word/Byte Write and Block Erase
  - Command User Interface
  - Status Registers
  - Erase Suspend Capability
• SRAM-Compatible Write Interface
• Automatic Power Savings Feature
  - 1 mA Typical ICC Active Current in Static Operation
• Hardware Data Protection Feature
  - Erase/Write Lockout during Power Transitions
• Very High-Performance Read
  - 60/80/120 ns Maximum Access Time
  - 30/40/40 ns Maximum Output Enable Time
• Low Power Consumption
  - 20 mA Typical Active Read Current
• Reset/Deep Power-Down Input
  - 0.2 µA ICC Typical
  - Acts as Reset for Boot Operations
• Extended Temperature Operation
  - -40°C to +85°C
• Write Protection for Boot Block
• Industry Standard Surface Mount Packaging
  - 28F200BX: JEDEC ROM Compatible 44-Lead PSOP, 56-Lead TSOP
  - 28F002BX: 40-Lead TSOP
• 12V Word/Byte Write and Block Erase
  - VPP = 12V ±5% Standard
  - VPP = 12V ±10% Option
• ETOX III Flash Technology
  - 5V Read
• Independent Software Vendor Support

The complete document for this product can be ordered by calling 1-800-548-4725. It is also available on Intel's "Data-on-Demand" CD-ROM product; contact your local Intel field sales office or Intel technical distributor.

November 1994 Order Number: 290448-004

28F200BL-T/B, 28F002BL-T/B
2-MBIT (128K x 16, 256K x 8) LOW POWER BOOT BLOCK FLASH MEMORY FAMILY

• Low Voltage Operation for Very Low Power Portable Applications
  - VCC = 3.0V-3.6V
• Expanded Temperature Range
  - -20°C to +70°C
• x8/x16 Input/Output Architecture
  - 28F200BL-T, 28F200BL-B
  - For High Performance and High Integration 16-bit and 32-bit CPUs
• x8-only Input/Output Architecture
  - 28F002BL-T, 28F002BL-B
  - For Space Constrained 8-bit Applications
• Upgradeable to Intel's SmartVoltage Products
• Optimized High Density Blocked Architecture
  - One 16-KB Protected Boot Block
  - Two 8-KB Parameter Blocks
  - One 96-KB Main Block
  - One 128-KB Main Block
  - Top or Bottom Boot Locations
• Extended Cycling Capability
  - 10,000 Block Erase Cycles
• Automated Word/Byte Write and Block Erase
  - Command User Interface
  - Status Registers
  - Erase Suspend Capability
• SRAM-Compatible Write Interface
• Automatic Power Savings Feature
  - 0.8 mA Typical ICC Active Current in Static Operation
• Very High-Performance Read
  - 150 ns Maximum Access Time
  - 65 ns Maximum Output Enable Time
• Low Power Consumption
  - 15 mA Typical Active Read Current
• Reset/Deep Power-Down Input
  - 0.2 µA ICC Typical
  - Acts as Reset for Boot Operations
• Write Protection for Boot Block
• Hardware Data Protection Feature
  - Erase/Write Lockout during Power Transitions
• Industry Standard Surface Mount Packaging
  - 28F200BL: JEDEC ROM Compatible 44-Lead PSOP, 56-Lead TSOP
  - 28F002BL: 40-Lead TSOP
• 12V Word/Byte Write and Block Erase
  - VPP = 12V ±5% Standard
• ETOX III Flash Technology
  - 3.3V Read
• Independent Software Vendor Support

The complete document for this product can be ordered by calling 1-800-548-4725. It is also available on Intel's "Data-on-Demand" CD-ROM product; contact your local Intel field sales office or Intel technical distributor.

November 1994 Order Number: 290449-004

28F400BX-T/B, 28F004BX-T/B
4-MBIT (256K x 16, 512K x 8) BOOT BLOCK FLASH MEMORY FAMILY

• x8/x16 Input/Output Architecture
  - 28F400BX-T, 28F400BX-B
  - For High Performance and High Integration 16-bit and 32-bit CPUs
• x8-only Input/Output Architecture
  - 28F004BX-T, 28F004BX-B
  - For Space Constrained 8-bit Applications
• Upgradeable to Intel's SmartVoltage Products
• Optimized High Density Blocked Architecture
  - One 16-KB Protected Boot Block
  - Two 8-KB Parameter Blocks
  - One 96-KB Main Block
  - Three 128-KB Main Blocks
  - Top or Bottom Boot Locations
• Extended Cycling Capability
  - 100,000 Block Erase Cycles
• Automated Word/Byte Write and Block Erase
  - Command User Interface
  - Status Registers
  - Erase Suspend Capability
• SRAM-Compatible Write Interface
• Automatic Power Savings Feature
  - 1 mA Typical ICC Active Current in Static Operation
• Very High-Performance Read
  - 60/80/120 ns Maximum Access Time
  - 30/40/40 ns Maximum Output Enable Time
• Low Power Consumption
  - 20 mA Typical Active Read Current
• Reset/Deep Power-Down Input
  - 0.2 µA ICC Typical
  - Acts as Reset for Boot Operations
• Extended Temperature Operation
  - -40°C to +85°C
• Write Protection for Boot Block
• Hardware Data Protection Feature
  - Erase/Write Lockout During Power Transitions
• Industry Standard Surface Mount Packaging
  - 28F400BX: JEDEC ROM Compatible 44-Lead PSOP, 56-Lead TSOP
  - 28F004BX: 40-Lead TSOP
• 12V Word/Byte Write and Block Erase
  - VPP = 12V ±5% Standard
  - VPP = 12V ±10% Option
• ETOX III Flash Technology
  - 5V Read

The complete document for this product can be ordered by calling 1-800-548-4725. It is also available on Intel's "Data-on-Demand" CD-ROM product; contact your local Intel field sales office or Intel technical distributor.
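The blocked architectures listed for the boot block parts in this chapter can be checked with a little arithmetic. The sketch below sums the listed block sizes and derives byte-address ranges for each block. It assumes, as an illustration only, that blocks sit in the order listed with the boot block at the lowest address for bottom-boot parts and a mirrored layout for top-boot parts; the name `block_map` and the dictionary keys are hypothetical, and the complete datasheets remain the reference for actual memory maps.

```python
KB = 1024

# Block sizes in KB, listed from the boot block outward, as given in the feature lists.
BLOCKS_KB = {
    "28F001BX": [8, 4, 4, 112],
    "28F200BX/28F002BX": [16, 8, 8, 96, 128],
    "28F400BX/28F004BX": [16, 8, 8, 96, 128, 128, 128],
}


def block_map(sizes_kb, top_boot):
    """Return [(start, end)] byte-address ranges, boot block first in the list."""
    total = sum(sizes_kb) * KB
    ranges, offset = [], 0
    for size_kb in sizes_kb:
        size = size_kb * KB
        if top_boot:
            # Assumed mirror image: the boot block ends at the highest address.
            ranges.append((total - offset - size, total - offset - 1))
        else:
            # Assumed bottom boot: the boot block starts at address 0.
            ranges.append((offset, offset + size - 1))
        offset += size
    return ranges


if __name__ == "__main__":
    for part, sizes in BLOCKS_KB.items():
        print(part, "total", sum(sizes), "KB")
        for top in (False, True):
            label = "top boot" if top else "bottom boot"
            for (start, end), size in zip(block_map(sizes, top), sizes):
                print(f"  {label:11s} {size:4d} KB  {start:05X}h-{end:05X}h")
```

The totals come to 128 KB, 256 KB and 512 KB, which agrees with the 1-Mbit, 2-Mbit and 4-Mbit densities in the section titles.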
November 1994 Order Number: 290451-004

28F008SA
8-MBIT (1-MBIT x 8) FlashFile™ MEMORY
Extended Temperature Specifications Included

• High-Density Symmetrically Blocked Architecture
  - Sixteen 64-Kbyte Blocks
• Extended Cycling Capability
  - 100,000 Block Erase Cycles
  - 1.6 Million Block Erase Cycles per Chip
• Automated Byte Write and Block Erase
  - Command User Interface
  - Status Register
• System Performance Enhancements
  - RY/BY# Status Output
  - Erase Suspend Capability
• Deep-Powerdown Mode
  - 0.20 µA ICC Typical
• Very High-Performance Read
  - 85 ns Maximum Access Time
• SRAM-Compatible Write Interface
• Hardware Data Protection Feature
  - Erase/Write Lockout during Power Transitions
• Industry Standard Packaging
  - 40-Lead TSOP, 44-Lead PSOP
• ETOX III Nonvolatile Flash Technology
  - 12V Byte Write/Block Erase
• Independent Software Vendor Support
  - Microsoft* Flash File System (FFS)

Intel's 28F008SA 8-Mbit FlashFile™ Memory is the highest density nonvolatile read/write solution for solid state storage. The 28F008SA's extended cycling, symmetrically blocked architecture, fast access time, write automation and low power consumption provide a more reliable, lower power, lighter weight and higher performance alternative to traditional rotating disk technology. The 28F008SA brings new capabilities to portable computing. Application and operating system software stored in resident flash memory arrays provide instant-on, rapid execute-in-place and protection from obsolescence through in-system software updates. Resident software also extends system battery life and increases reliability by reducing disk drive accesses.

For high density data acquisition applications, the 28F008SA offers a more cost-effective and reliable alternative to SRAM and battery.
Traditional high density embedded applications, such as telecommunications, can take advantage of the 28F008SA's nonvolatility, blocking and minimal system code requirements for flexible firmware and modular software designs.

The 28F008SA is offered in 40-lead TSOP (standard and reverse) and 44-lead PSOP packages. Pin assignments simplify board layout when integrating multiple devices in a flash memory array or subsystem. This device uses an integrated Command User Interface and state machine for simplified block erasure and byte write. The 28F008SA memory map consists of 16 separately erasable 64-Kbyte blocks.

Intel's 28F008SA employs advanced CMOS circuitry for systems requiring low power consumption and noise immunity. Its 85 ns access time provides superior performance when compared with magnetic storage media. A deep powerdown mode lowers power consumption to 1 µW typical through VCC, crucial in portable computing, handheld instrumentation and other low-power applications. The RP# power control input also provides absolute data protection during system powerup/down.

Manufactured on Intel's 0.8 micron ETOX process, the 28F008SA provides the highest levels of quality, reliability and cost-effectiveness.

*Microsoft is a trademark of Microsoft Corporation.

The complete document for this product can be ordered by calling 1-800-548-4725. It is also available on Intel's "Data-on-Demand" CD-ROM product; contact your local Intel field sales office or Intel technical distributor.
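The headline figures for the 28F008SA follow directly from its symmetric block layout. The Python sketch below reproduces them and shows how a byte address maps to one of the 16 erase blocks. The helper names `block_index` and `block_range` are hypothetical, and the deep-powerdown calculation assumes a 5 V VCC, which the summary above does not state explicitly.

```python
BLOCK_COUNT = 16                 # sixteen separately erasable blocks
BLOCK_BYTES = 64 * 1024          # 64 Kbytes each
ERASE_CYCLES_PER_BLOCK = 100_000

capacity_bits = BLOCK_COUNT * BLOCK_BYTES * 8                  # total array size in bits
chip_erase_cycles = BLOCK_COUNT * ERASE_CYCLES_PER_BLOCK       # block erases per chip

# Deep-powerdown power = typical ICC x VCC (5 V is an assumption, see lead-in).
ASSUMED_VCC_VOLTS = 5.0
deep_powerdown_watts = 0.20e-6 * ASSUMED_VCC_VOLTS


def block_index(address: int) -> int:
    """Return the erase block (0-15) that contains a byte address."""
    if not 0 <= address < BLOCK_COUNT * BLOCK_BYTES:
        raise ValueError("address outside the 1-Mbyte array")
    return address // BLOCK_BYTES


def block_range(index: int):
    """Return the (first, last) byte address of an erase block."""
    if not 0 <= index < BLOCK_COUNT:
        raise ValueError("block index must be 0-15")
    start = index * BLOCK_BYTES
    return start, start + BLOCK_BYTES - 1


if __name__ == "__main__":
    print(capacity_bits // (1024 * 1024), "Mbit")
    print(chip_erase_cycles, "block erase cycles per chip")
    print(f"{deep_powerdown_watts * 1e6:.2f} uW in deep powerdown")
    print(block_index(0x12345), [hex(a) for a in block_range(1)])
```

Under these assumptions the results line up with the 8-Mbit density, the 1.6 million block erase cycles per chip, and the 1 µW deep-powerdown figure quoted in the text.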
November 1994 Order Number: 290429-005

28F016SA
16 MBIT (1 MBIT x 16, 2 MBIT x 8) FlashFile™ MEMORY

• User-Selectable 3.3V or 5V VCC
• User-Configurable x8 or x16 Operation
• 70 ns Maximum Access Time
• 28.6 MB/sec Burst Write Transfer Rate
• 1 Million Typical Erase Cycles per Block
• 56-Lead, 1.2mm x 14mm x 20mm TSOP Package
• 56-Lead, 1.8mm x 16mm x 23.7mm SSOP Package
• Revolutionary Architecture
  - Pipelined Command Execution
  - Write During Erase
  - Command Superset of Intel 28F008SA
• 1 mA Typical ICC in Static Mode
• 1 µA Typical Deep Power-Down
• 32 Independently Lockable Blocks
• State-of-the-Art 0.6 µm ETOX™ IV Flash Technology

Intel's 28F016SA 16-Mbit FlashFile™ memory is a revolutionary architecture which is the ideal choice for designing embedded direct-execute code and mass storage data/file flash memory systems. With innovative capabilities, low-power, extended temperature operation and high read/write performance, the 28F016SA enables the design of truly mobile, high-performance communications and computing products.

The 28F016SA is the highest density, highest performance non-volatile read/write solution for solid-state storage applications. Its symmetrically blocked architecture (100% compatible with the 28F008SA 8-Mbit FlashFile memory), extended cycling, extended temperature operation, flexible VCC, fast write and read performance and selective block locking provide highly flexible memory components suitable for resident flash arrays, high-density memory cards and PCMCIA-ATA flash drives. The 28F016SA dual read voltage enables the design of memory cards which can interchangeably be read/written in 3.3V and 5.0V systems. Its x8/x16 architecture allows optimization of the memory-to-processor interface. Its high read performance and flexible block locking enable both storage and execution of operating systems and application software.
Manufactured on Intel's 0.6 µm ETOX™ IV process technology, the 28F016SA is the most cost effective, highest density monolithic 3.3V FlashFile memory.

ETOX™ and FlashFile™ are trademarks of Intel Corporation.

The complete document for this product can be ordered by calling 1-800-548-4725. It is also available on Intel's "Data-on-Demand" CD-ROM product; contact your local Intel field sales office or Intel technical distributor.

November 1994 Order Number: 290489-002

6 Intel486™ Microprocessor SmartDie™ Products

Intel486™ DX2 MICROPROCESSOR
SmartDie™ Product Specification

• SL Technology for Energy Efficiency
  - Intel's System Management Mode
  - Stop Clock, Auto HALT and Auto Idle Power Down
• IEEE 1149.1 Boundary Scan Compatibility
• High-Performance Design
  - 40/50 MHz Core Speed Using 20/25 MHz Bus Clock at 3.3V
  - RISC Integer Core with Frequent Instructions Executing in One Core Clock
  - 64/80 Mbyte/sec Burst Bus @ 40/50 MHz
  - Dynamic Bus Sizing for 8-, 16- and 32-Bit Buses
  - Complete 32-Bit Architecture
• Multiprocessor Support
  - Cache Consistency Protocols
  - Support for Second-Level Cache
• Intel SmartDie Product
  - Full AC/DC Testing at Die Level
  - 0°C-80°C (Junction) Temperature Range
  - 40 MHz and 50 MHz Core Speeds @ 3.3V
• Binary-Compatible with Large Software Base
  - MS-DOS*, OS/2*, Windows*
  - UNIX* System V/Intel386™
  - iRMX® Software, iRMK Kernels
• High Integration Enables On-Chip
  - 8 Kbyte Code and Data Cache
  - Floating Point Unit
  - Paged, Virtual Memory Management
• Easy to Use
  - Built-In Self Test
  - Hardware Debugging Support

NOTICE: This document contains preliminary information on new products in production. It is valid for the devices indicated in the revision history. This specification is subject to change without notice. Verify with your local Intel Sales Office that you have the latest SmartDie product specification before finalizing a design.
REFERENCE INFORMATION: The information in this document is provided as a supplement to the Standard Package Data Sheet on a specific product. Please reference the Standard Package Data Sheet (Order No. 242202) for additional product information and specifications not found in this document.

*Other brands and names are the property of their respective owners.

December 1994 Order Number: 271293-002

Intel486™ SX MICROPROCESSOR
SmartDie™ Product Specification

• SL Technology for Energy Efficiency
  - Intel's System Management Mode
  - Stop Clock, Auto HALT and Auto Idle Power Down
• Complete 32-Bit Architecture
  - Address and Data Buses
  - Registers
  - 8-Bit, 16-Bit and 32-Bit Data Types
• Multiprocessor Support
  - Multiprocessor Instructions
  - Cache Consistency Protocols
  - Support for Second-Level Cache
• Binary-Compatible with Large Software Base
  - MS-DOS*, OS/2*, Windows*
  - UNIX* System V/386
  - iRMX Software, iRMK Kernels
• High Integration Enables On-Chip
  - 8 Kbyte Code and Data Cache
  - Paged, Virtual Memory Management
• Easy to Use
  - Built-In Self Test
  - Hardware Debugging Support
  - Intel Software Support
  - Extensive Third Party Software Support
• IEEE 1149.1 Boundary Scan Compatibility
• Intel SmartDie Product
  - Full AC/DC Testing at Die Level
  - 0°C to +80°C (Junction) Temperature Range
  - 25 MHz and 33 MHz Speeds @ 3.3V
• High-Performance Design
  - Intel486 One Clock Instruction Core
  - 80/100 Mbyte/sec Burst Bus at 25/33 MHz
  - CHMOS V Process Technology
  - Dynamic Bus Sizing for 8-Bit, 16-Bit and 32-Bit Buses

NOTICE: This document contains preliminary information on new products in production. It is valid for the devices indicated in the revision history. This specification is subject to change without notice. Verify with your local Intel Sales Office that you have the latest SmartDie product specification before finalizing a design.

REFERENCE INFORMATION: The information in this document is provided as a supplement to the Standard Package Data Sheet on a specific product. Please reference the Standard Package Data Sheet/Book (Order No. 242202) for additional product information and specifications not found in this document.

December 1994 Order Number: 271292-002
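The burst bus figures in the two SmartDie feature lists can be related to the bus clock with a short calculation. The sketch below assumes a burst transfers 16 bytes in 5 bus clocks; that timing is an assumption brought in for illustration and is not stated in these summaries. The function name `burst_mbytes_per_sec` is hypothetical.

```python
BURST_BYTES = 16    # assumed: bytes moved by one burst
BURST_CLOCKS = 5    # assumed: bus clocks taken by one burst


def burst_mbytes_per_sec(bus_mhz: float) -> float:
    """Peak burst transfer rate in Mbyte/sec for a given bus clock in MHz."""
    return BURST_BYTES * bus_mhz / BURST_CLOCKS


if __name__ == "__main__":
    for bus_mhz in (20, 25, 33):
        print(bus_mhz, "MHz bus ->", burst_mbytes_per_sec(bus_mhz), "Mbyte/sec")
```

Under these assumptions the 20 MHz and 25 MHz bus clocks give 64 and 80 Mbyte/sec, in line with the Intel486 DX2 list and the 25 MHz Intel486 SX entry. The 33 MHz case comes out near 105.6 Mbyte/sec, slightly above the 100 Mbyte/sec shown in the Intel486 SX list, so treat that entry as a rounded figure or check it against the full specification.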